SXEmacs Internals Manual: Encodings

18.2 Encodings

An encoding is a way of numerically representing characters from one or more character sets. If an encoding only encompasses one character set, then the position codes for the characters in that character set could be used directly. This is not possible, however, if more than one character set is to be used in the encoding.

For example, the conversion detailed above between bytes in a binary file and characters is effectively an encoding that encompasses the three character sets ASCII, Control-1, and Latin-1 in a stream of 8-bit bytes.

Thus, an encoding can be viewed as a way of encoding characters from a specified group of character sets using a stream of bytes, each of which contains a fixed number of bits (but not necessarily 8, as in the common usage of “byte”).

Here are descriptions of a couple of common encodings: