Previous: Big5 and Shift-JIS Functions, Up: Coding Systems [Contents][Index]
MULE initializes most of the commonly used coding systems at SXEmacs’s startup. A few others are initialized only when the relevant language environment is selected and support libraries are loaded. (NB: The following list is based on XEmacs 21.2.19, the development branch at the time of writing. The list may be somewhat different for other versions. Recent versions of GNU Emacs 20 implement a few more rare coding systems; work is being done to port these to SXEmacs.)
Unfortunately, there is not a consistent naming convention for character sets, and for practical purposes coding systems often take their name from their principal character sets (ASCII, KOI8-R, Shift JIS). Others take their names from the coding system (ISO-2022-JP, EUC-KR), and a few from their non-text usages (internal, binary). To provide for this, and for the fact that many coding systems have several common names, an aliasing system is provided. Finally, some effort has been made to use names that are registered as MIME charsets (this is why the name 'shift_jis contains that un-Lisp-y underscore).
There is a systematic naming convention regarding end-of-line (EOL) conventions for different systems. A coding system whose name ends in "-unix" forces the assumption that lines are broken by newlines (0x0A). A coding system whose name ends in "-mac" forces the assumption that lines are broken by ASCII CRs (0x0D). A coding system whose name ends in "-dos" forces the assumption that lines are broken by CRLF sequences (0x0D 0x0A). These subsidiary coding systems are automatically derived from a base coding system. Use of the base coding system implies autodetection of the text file convention. (Because the -unix, -mac, and -dos variants are derived from a base system, they show up as "aliases" in ‘list-coding-systems’.) These subsidiaries have a consistent modeline indicator as well. "-dos" coding systems have ":T" appended to their modeline indicator, while "-mac" coding systems have ":t" appended (e.g., "ISO8:t" for iso-2022-8-mac).
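The suffix convention can be sketched outside SXEmacs. The following Python fragment is only an illustration: the helper names (split_name, detect_eol, split_lines, modeline_indicator) are hypothetical, and the autodetection rule (look at the first line break) is an assumption made for this sketch, not a description of what SXEmacs actually does.

```python
# Illustrative sketch only; none of these names exist in SXEmacs.
EOL_BYTES = {"unix": b"\n", "mac": b"\r", "dos": b"\r\n"}
# Modeline suffixes as described in the text; nothing is appended for -unix.
EOL_INDICATOR = {"unix": "", "mac": ":t", "dos": ":T"}


def split_name(name):
    """Split 'iso-2022-8-mac' into ('iso-2022-8', 'mac').

    A base name such as 'euc-jp' yields ('euc-jp', None).
    """
    base, sep, suffix = name.rpartition("-")
    if sep and suffix in EOL_BYTES:
        return base, suffix
    return name, None


def detect_eol(data):
    """Guess the EOL convention from the first line break (assumed rule)."""
    for i, byte in enumerate(data):
        if byte == 0x0A:
            return "unix"
        if byte == 0x0D:
            return "dos" if data[i + 1:i + 2] == b"\n" else "mac"
    return "unix"  # no line break found; arbitrary default for this sketch


def split_lines(data, coding_system):
    """Break DATA into lines; a base name triggers autodetection."""
    _base, eol = split_name(coding_system)
    if eol is None:
        eol = detect_eol(data)
    return data.split(EOL_BYTES[eol])


def modeline_indicator(base_indicator, coding_system):
    """Append ':T' for -dos and ':t' for -mac subsidiaries."""
    _base, eol = split_name(coding_system)
    return base_indicator + EOL_INDICATOR.get(eol, "")
```

With these definitions, split_lines(b"a\r\nb", "big5") autodetects CRLF, whereas split_lines(b"a\r\nb", "big5-unix") splits only at the newline and leaves the CR attached to the first line.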
In the following table, each coding system is given with its mode line indicator in parentheses. Non-textual coding systems are listed first, followed by textual coding systems and their aliases. (The coding system subsidiary modeline indicators ":T" and ":t" will be omitted from the table of coding systems.)
### SJT 1999-08-23 Maybe should order these by language? Definitely need language usage for the ISO-8859 family.
Note that although true coding system aliases have been implemented for XEmacs 21.2, the coding system initialization has not yet been converted as of 21.2.19. So coding systems described as aliases have the same properties as the aliased coding system, but will not be equal as Lisp objects.
automatic-conversion, undecided, undecided-dos, undecided-mac, undecided-unix
Modeline indicator: Auto. A type undecided coding system.
Attempts to determine an appropriate coding system from file contents or
the environment.
raw-text, no-conversion, raw-text-dos, raw-text-mac, raw-text-unix, no-conversion-dos, no-conversion-mac, no-conversion-unix
Modeline indicator: Raw. A type no-conversion coding system,
which converts only line-break codes. An implementation quirk means
that this coding system is also used for ISO8859-1.
binary
Modeline indicator: Binary. A type no-conversion coding
system which does no character coding or EOL conversions. An alias for
raw-text-unix.
alternativnyj, alternativnyj-dos, alternativnyj-mac, alternativnyj-unix
Modeline indicator: Cy.Alt. A type ccl coding system used for
Alternativnyj, an encoding of the Cyrillic alphabet.
big5, big5-dos, big5-mac, big5-unix
Modeline indicator: Zh/Big5. A type big5 coding system used for
BIG5, the most common encoding of traditional Chinese as used in Taiwan.
cn-gb-2312, cn-gb-2312-dos, cn-gb-2312-mac, cn-gb-2312-unix
Modeline indicator: Zh-GB/EUC. A type iso2022 coding system used
for simplified Chinese (as used in the People’s Republic of China), with
the ascii (G0), chinese-gb2312 (G1), and sisheng
(G2) character sets initially designated. Chinese EUC (Extended Unix
Code).
ctext-hebrew, ctext-hebrew-dos, ctext-hebrew-mac, ctext-hebrew-unix
Modeline indicator: CText/Hbrw. A type iso2022 coding system
with the ascii (G0) and hebrew-iso8859-8 (G1) character
sets initially designated for Hebrew.
ctext, ctext-dos, ctext-mac, ctext-unix
Modeline indicator: CText. A type iso2022 8-bit coding system
with the ascii (G0) and latin-iso8859-1 (G1) character
sets initially designated. X11 Compound Text Encoding. It is often
mistakenly recognized instead of EUC encodings; the usual cause is an
inappropriate setting of coding-priority-list.
escape-quoted
Modeline indicator: ESC/Quot. A type iso2022 8-bit coding
system with the ascii (G0) and latin-iso8859-1 (G1)
character sets initially designated and escape quoting. Unix EOL
conversion (i.e., no conversion). It is used for .ELC files.
euc-jp, euc-jp-dos, euc-jp-mac, euc-jp-unix
Modeline indicator: Ja/EUC. A type iso2022 8-bit coding system
with ascii (G0), japanese-jisx0208 (G1),
katakana-jisx0201 (G2), and japanese-jisx0212 (G3)
initially designated. Japanese EUC (Extended Unix Code).
euc-kr, euc-kr-dos, euc-kr-mac, euc-kr-unix
Modeline indicator: ko/EUC. A type iso2022 8-bit coding system
with ascii (G0) and korean-ksc5601 (G1) initially
designated. Korean EUC (Extended Unix Code).
hz-gb-2312
Modeline indicator: Zh-GB/Hz. A type no-conversion coding
system with Unix EOL convention (i.e., no conversion) using
post-read-decode and pre-write-encode functions to translate the Hz/ZW
coding system used for Chinese.
iso-2022-7bit, iso-2022-7bit-unix, iso-2022-7bit-dos, iso-2022-7bit-mac, iso-2022-7
Modeline indicator: ISO7. A type iso2022 7-bit coding system
with ascii (G0) initially designated. Other character sets must
be explicitly designated to be used.
iso-2022-7bit-ss2, iso-2022-7bit-ss2-dos, iso-2022-7bit-ss2-mac, iso-2022-7bit-ss2-unix
Modeline indicator: ISO7/SS. A type iso2022 7-bit coding system
with ascii (G0) initially designated. Other character sets must
be explicitly designated to be used. SS2 is used to invoke a
96-charset, one character at a time.
iso-2022-8, iso-2022-8-dos, iso-2022-8-mac, iso-2022-8-unix
Modeline indicator: ISO8. A type iso2022 8-bit coding system
with ascii (G0) and latin-iso8859-1 (G1) initially
designated. Other character sets must be explicitly designated to be
used. No single-shift or locking-shift.
iso-2022-8bit-ss2, iso-2022-8bit-ss2-dos, iso-2022-8bit-ss2-mac, iso-2022-8bit-ss2-unix
Modeline indicator: ISO8/SS. A type iso2022 8-bit coding system
with ascii (G0) and latin-iso8859-1 (G1) initially
designated. Other character sets must be explicitly designated to be
used. SS2 is used to invoke a 96-charset, one character at a time.
iso-2022-int-1, iso-2022-int-1-dos, iso-2022-int-1-mac, iso-2022-int-1-unix
Modeline indicator: INT-1. A type iso2022 7-bit coding system
with ascii (G0) and korean-ksc5601 (G1) initially
designated. ISO-2022-INT-1.
iso-2022-jp-1978-irv, iso-2022-jp-1978-irv-dos, iso-2022-jp-1978-irv-mac, iso-2022-jp-1978-irv-unix
Modeline indicator: Ja-78/7bit. A type iso2022 7-bit coding
system. For compatibility with old Japanese terminals; if you need to
know, look at the source.
iso-2022-jp, iso-2022-jp-2 (ISO7/SS), iso-2022-jp-dos, iso-2022-jp-mac, iso-2022-jp-unix, iso-2022-jp-2-dos, iso-2022-jp-2-mac, iso-2022-jp-2-unix
Modeline indicator: MULE/7bit. A type iso2022 7-bit coding
system with ascii (G0) initially designated, and complex
specifications to ensure backward compatibility with old Japanese
systems. Used for communication with mail and news in Japan. The "-2"
versions also use SS2 to invoke a 96-charset one character at a time.
iso-2022-kr
Modeline indicator: Ko/7bit. A type iso2022 7-bit coding
system with ascii (G0) and korean-ksc5601 (G1) initially
designated. Used for e-mail in Korea.
iso-2022-lock, iso-2022-lock-dos, iso-2022-lock-mac, iso-2022-lock-unix
Modeline indicator: ISO7/Lock. A type iso2022 7-bit coding
system with ascii (G0) initially designated, using Locking-Shift
to invoke a 96-charset.
iso-8859-1, iso-8859-1-dos, iso-8859-1-mac, iso-8859-1-unix
Due to the implementation, this is not a type iso2022 coding system,
but rather an alias for the raw-text coding system.
iso-8859-2, iso-8859-2-dos, iso-8859-2-mac, iso-8859-2-unix
Modeline indicator: MIME/Ltn-2. A type iso2022 coding
system with ascii (G0) and latin-iso8859-2 (G1) initially
invoked.
iso-8859-3, iso-8859-3-dos, iso-8859-3-mac, iso-8859-3-unix
Modeline indicator: MIME/Ltn-3. A type iso2022 coding system
with ascii (G0) and latin-iso8859-3 (G1) initially
invoked.
iso-8859-4, iso-8859-4-dos, iso-8859-4-mac, iso-8859-4-unix
Modeline indicator: MIME/Ltn-4. A type iso2022 coding system
with ascii (G0) and latin-iso8859-4 (G1) initially
invoked.
iso-8859-5, iso-8859-5-dos, iso-8859-5-mac, iso-8859-5-unix
Modeline indicator: ISO8/Cyr. A type iso2022 coding system with
ascii (G0) and cyrillic-iso8859-5 (G1) initially invoked.
iso-8859-7, iso-8859-7-dos, iso-8859-7-mac, iso-8859-7-unix
Modeline indicator: Grk. A type iso2022 coding system with
ascii (G0) and greek-iso8859-7 (G1) initially invoked.
iso-8859-8, iso-8859-8-dos, iso-8859-8-mac, iso-8859-8-unix
Modeline indicator: MIME/Hbrw. A type iso2022 coding system with
ascii (G0) and hebrew-iso8859-8 (G1) initially invoked.
iso-8859-9, iso-8859-9-dos, iso-8859-9-mac, iso-8859-9-unix
Modeline indicator: MIME/Ltn-5. A type iso2022 coding system
with ascii (G0) and latin-iso8859-9 (G1) initially
invoked.
koi8-r, koi8-r-dos, koi8-r-mac, koi8-r-unix
Modeline indicator: KOI8. A type ccl coding system used for
KOI8-R, an encoding of the Cyrillic alphabet.
shift_jis, shift_jis-dos, shift_jis-mac, shift_jis-unix
Modeline indicator: Ja/SJIS. A type shift-jis coding system
implementing the Shift-JIS encoding for Japanese. The underscore is to
conform to the MIME charset implementing this encoding.
tis-620, tis-620-dos, tis-620-mac, tis-620-unix
Modeline indicator: TIS620. A type ccl encoding for Thai. The
external encoding is defined by TIS620, the internal encoding is
peculiar to MULE, and called thai-xtis.
viqr
Modeline indicator: VIQR. A type no-conversion coding
system with Unix EOL convention (i.e., no conversion) using
post-read-decode and pre-write-encode functions to translate the VIQR
coding system for Vietnamese.
viscii, viscii-dos, viscii-mac, viscii-unix
Modeline indicator: VISCII. A type ccl coding system used
for VISCII 1.1 for Vietnamese. Differs slightly from VSCII; VISCII is
given priority by SXEmacs.
vscii, vscii-dos, vscii-mac, vscii-unix
Modeline indicator: VSCII. A type ccl coding system used
for VSCII 1.1 for Vietnamese. Differs slightly from VISCII, which is
given priority by SXEmacs. Use
(prefer-coding-system 'vietnamese-vscii) to give priority to VSCII.
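The byte-level differences between several of the coding systems listed above can be seen without SXEmacs. The sketch below uses Python's standard codecs as stand-ins; the pairing of each Python codec name with a MULE coding system is an assumption made for illustration, and the two implementations may differ in details such as which designations are emitted.

```python
# Python codec names are assumed stand-ins for the MULE coding systems
# named in the comments; they are not SXEmacs identifiers.
SAMPLES = [
    ("euc_jp", "\u3042"),      # euc-jp: hiragana A, japanese-jisx0208 (G1)
    ("euc_jp", "\uff71"),      # euc-jp: half-width katakana, katakana-jisx0201 (G2)
    ("euc_jp", "\u4e02"),      # euc-jp: a kanji from japanese-jisx0212 (G3)
    ("iso2022_jp", "\u3042"),  # iso-2022-jp: 7-bit, explicit designation
    ("shift_jis", "\u3042"),   # shift_jis
    ("gb2312", "\u4e2d"),      # cn-gb-2312: Chinese EUC
    ("hz", "\u4e2d"),          # hz-gb-2312: 7-bit, ~{ ... ~} shifts
    ("big5", "\u4e2d"),        # big5
]


def encode_samples(samples=SAMPLES):
    """Return (codec, text, encoded bytes) for each sample."""
    return [(codec, text, text.encode(codec)) for codec, text in samples]


for codec, text, raw in encode_samples():
    print(f"{codec:11} U+{ord(text):04X} -> {raw!r}")
```

In the output, the EUC, Shift-JIS, and Big5 forms use bytes with the high bit set, the G2 and G3 characters in EUC-JP are prefixed by the single-shift bytes 0x8E and 0x8F, and the 7-bit forms (iso2022_jp and hz) stay within ASCII by wrapping the character in an escape or shift sequence that designates the character set and then returns to ASCII.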