Plain ASCII is actually seven bits. "Extended" ASCII variants are eight bits. It's implementation-defined whether char
is signed or unsigned. And in multi-byte encodings (e.g. UTF-8) a single character may not fit in an eight-bit type. If you want to "compress" bytes, don't compress "characters"; compress a stream of opaque bytes instead. And if you want to compress raw data, don't open files in text mode: on Windows the newline translation will alter your bytes. Also, I recommend you use a vector of std::uint8_t
elements instead of a string. With std::uint8_t
you won't have the negative-value problem to begin with. The lesson to be learned: if you only want to deal with unsigned values, use unsigned types. Credit to @Someprogrammerdude for answering this question in the comment section!
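
For anyone landing here later, a minimal sketch of the advice above. The helper names read_bytes and count_bytes are made up for illustration and are not from the original question; the sketch assumes the file fits in memory and simply returns an empty vector if it can't be opened.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: read a whole file as raw bytes.
// std::ios::binary disables newline translation, so the bytes arrive unchanged.
std::vector<std::uint8_t> read_bytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    // Each char read from the stream is converted to std::uint8_t,
    // so a byte like 0xFF is stored as 255, never as -1.
    return std::vector<std::uint8_t>(std::istreambuf_iterator<char>(in),
                                     std::istreambuf_iterator<char>());
}

// Hypothetical helper: count how often each byte value occurs,
// the kind of table a simple compressor might build first.
std::array<std::size_t, 256> count_bytes(const std::vector<std::uint8_t>& data) {
    std::array<std::size_t, 256> counts{};  // zero-initialized
    for (std::uint8_t b : data) {
        ++counts[b];  // b is always in 0..255, so the index is always valid
    }
    return counts;
}
```

With a plain char (where char happens to be signed), a byte such as 0xFF would show up as a negative value and counts[b] would index out of bounds. With std::uint8_t that case can't occur.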