27.5.1 Information About Each Character
The syntax table entry for a character is a number that encodes six
pieces of information:
- The syntactic class of the character, represented as a small integer
- The matching delimiter, for delimiter characters only
(the matching delimiter of ‘(’ is ‘)’, and vice versa)
- A flag saying whether the character is the first character of a
two-character comment starting sequence
- A flag saying whether the character is the second character of a
two-character comment starting sequence
- A flag saying whether the character is the first character of a
two-character comment ending sequence
- A flag saying whether the character is the second character of a
two-character comment ending sequence
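As a rough illustration of how six pieces of information can fit in one number, the following Python sketch packs them into an integer using bit fields. The bit layout (8 bits for the class, 8 bits for the matching delimiter, one bit per flag), the class codes, and all function and constant names are assumptions chosen for this illustration; they are not taken from the Emacs sources.

```python
# Toy model of a syntax table entry packed into one integer.
# ASSUMPTION: the bit layout below is illustrative, not Emacs's actual one.

MATCH_SHIFT = 8         # bits 8-15: matching delimiter (0 if none)
FLAG_SHIFT = 16         # bits 16-19: the four one-bit comment flags

# Hypothetical flag positions, in the order the flags are listed above.
COMSTART_FIRST, COMSTART_SECOND, COMEND_FIRST, COMEND_SECOND = range(4)


def pack_entry(class_code, match=None, flags=()):
    """Pack a class code, optional matching delimiter, and flags into an int."""
    entry = class_code & 0xFF               # bits 0-7: syntactic class
    if match is not None:
        entry |= (ord(match) & 0xFF) << MATCH_SHIFT
    for flag in flags:
        entry |= 1 << (FLAG_SHIFT + flag)
    return entry


def unpack_entry(entry):
    """Return (class_code, match, flags) recovered from a packed entry."""
    class_code = entry & 0xFF
    match_code = (entry >> MATCH_SHIFT) & 0xFF
    match = chr(match_code) if match_code else None
    flags = tuple(f for f in range(4) if (entry >> (FLAG_SHIFT + f)) & 1)
    return class_code, match, flags


# Hypothetical class code 4 for "opening delimiter"; '(' matches ')'.
open_paren = pack_entry(4, match=")")
# Hypothetical class code 1 for punctuation; in C, '/' begins "/*" and ends "*/".
c_slash = pack_entry(1, flags=(COMSTART_FIRST, COMEND_SECOND))
```

Unpacking `open_paren` recovers the class code and the matching ‘)’ with no flags set, while `c_slash` carries two flags and no matching delimiter.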
The syntactic classes are stored internally as small integers, but are
usually described to or by the user with characters. For example, ‘(’
is used to specify the syntactic class of opening delimiters. Here is a
table of syntactic classes, with the characters that specify them.
- ‘-’
- The class of whitespace characters. Please don't use the formerly
advertised ‘ ’ (a space), which is not supported by GNU Emacs.
- ‘w’
- The class of word-constituent characters.
- ‘_’
- The class of characters that are part of symbol names but not words.
This class is represented by ‘_’ because the character ‘_’
has this class in both C and Lisp.
- ‘.’
- The class of punctuation characters that do not fit into any other
special class.
- ‘(’
- The class of opening delimiters.
- ‘)’
- The class of closing delimiters.
- ‘'’
- The class of expression-adhering characters. These characters are
part of a symbol if found within or adjacent to one, and are part
of a following expression if immediately preceding one, but are like
whitespace if surrounded by whitespace.
- ‘"’
- The class of string-quote characters. They match each other in pairs,
and the characters within the pair all lose their syntactic
significance except for the ‘\’ and ‘/’ classes of escape
characters, which can be used to include a string-quote inside the
string.
- ‘$’
- The class of self-matching delimiters. This is intended for TeX's
‘$’, which is used both to enter and leave math mode. Thus,
a pair of matching ‘$’ characters surrounds each piece of math mode
TeX input. A pair of adjacent ‘$’ characters acts like a single
one for purposes of matching.
- ‘/’
- The class of escape characters that simply deny the following
character its special syntactic significance. The character after one
of these escapes is always treated as alphabetic.
- ‘\’
- The class of C-style escape characters. In practice, these are
treated just like ‘/’-class characters, because the extra
possibilities for C escapes (such as being followed by digits) have no
effect on where the containing expression ends.
- ‘<’
- The class of comment-starting characters. Only single-character
comment starters (such as ‘;’ in Lisp mode) are represented this
way.
- ‘>’
- The class of comment-ending characters. Newline has this syntax in
Lisp mode.
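The table above can be modeled as a simple lookup from designator character to class. The Python sketch below does that, then uses the classes to find where a string ends, treating characters of the ‘\’ and ‘/’ classes as escapes. The toy syntax table, the function names `syntax_class` and `string_end`, and the descriptive class labels are assumptions for illustration only; this is not how Emacs itself is implemented.

```python
# Designator characters from the table above, mapped to descriptive labels.
# The labels on the right are names chosen for this sketch.
CLASS_NAMES = {
    "-": "whitespace",
    "w": "word constituent",
    "_": "symbol constituent",
    ".": "punctuation",
    "(": "opening delimiter",
    ")": "closing delimiter",
    "'": "expression-adhering",
    '"': "string quote",
    "$": "self-matching delimiter",
    "/": "escape",
    "\\": "C-style escape",
    "<": "comment starter",
    ">": "comment ender",
}

# ASSUMPTION: a tiny, hypothetical syntax table mapping characters to
# designators; characters not listed are treated as word constituents.
TOY_TABLE = {'"': '"', "\\": "\\", " ": "-", "(": "(", ")": ")"}


def syntax_class(char, table=TOY_TABLE):
    """Return the designator for the class of `char` (default: word)."""
    return table.get(char, "w")


def string_end(text, start, table=TOY_TABLE):
    """Return the index of the quote closing the string opened at `start`.

    Characters inside the string lose their syntactic significance, except
    that a character of an escape class protects the character after it.
    This sketch requires the closing quote to be the same character as the
    opening one. Return None if the string is unterminated.
    """
    i = start + 1
    while i < len(text):
        cls = syntax_class(text[i], table)
        if cls in ("\\", "/"):
            i += 2          # skip the escape and the character it protects
        elif cls == '"' and text[i] == text[start]:
            return i
        else:
            i += 1
    return None
```

In this model, the escaped quote and the parenthesis inside a string such as `"a \" (b"` are passed over, and only the final quote closes the string.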
The characters flagged as part of two-character comment delimiters can
have other syntactic functions most of the time. For example, ‘/’ and
‘*’ in C code, when found separately, have nothing to do with
comments. Their comment-delimiter significance takes precedence only when the
two characters occur together in the proper order. Only the list and sexp
commands use the syntax table to find comments; the commands specifically
for comments have other variables that tell them where to find comments.
Moreover, the list and sexp commands notice comments only if
parse-sexp-ignore-comments is non-nil. This variable is set
to nil in modes where comment-terminator sequences are liable to
appear where there is no comment; for example, in Lisp mode, where the
comment terminator is a newline but not every newline ends a comment.
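To make the interaction between the two-character flags and parse-sexp-ignore-comments more concrete, here is a small Python sketch of a scanner that looks for the closing parenthesis matching an opening one in C-like text. The flag sets, the function `matching_close`, and its `ignore_comments` parameter (standing in for parse-sexp-ignore-comments) are hypothetical names for this illustration, not Emacs's implementation.

```python
# ASSUMPTION: per-character flags for C-style "/*" and "*/" comments.
# Each set holds the characters that carry the corresponding flag.
COMSTART_FIRST = {"/"}    # first character of a comment starter
COMSTART_SECOND = {"*"}   # second character of a comment starter
COMEND_FIRST = {"*"}      # first character of a comment ender
COMEND_SECOND = {"/"}     # second character of a comment ender


def starts_comment(text, i):
    """True if the two characters at index i form a comment starter."""
    return (i + 1 < len(text)
            and text[i] in COMSTART_FIRST
            and text[i + 1] in COMSTART_SECOND)


def ends_comment(text, i):
    """True if the two characters at index i form a comment ender."""
    return (i + 1 < len(text)
            and text[i] in COMEND_FIRST
            and text[i + 1] in COMEND_SECOND)


def matching_close(text, start, ignore_comments=True):
    """Return the index of the ')' matching the '(' at `start`, or None.

    When `ignore_comments` is true, parentheses inside comments are
    skipped; otherwise comment text is scanned like any other text.
    """
    depth = 0
    i = start
    while i < len(text):
        if ignore_comments and starts_comment(text, i):
            i += 2
            while i < len(text) and not ends_comment(text, i):
                i += 1
            i += 2              # step over the two-character ender
            continue
        if text[i] == "(":
            depth += 1
        elif text[i] == ")":
            depth -= 1
            if depth == 0:
                return i
        i += 1
    return None


code = "f(a /* ) */ * b / c)"
```

With `ignore_comments` true, `matching_close(code, 1)` skips the ‘)’ inside the comment and returns the index of the final ‘)’; with it false, the ‘)’ inside the comment is taken as the match. The lone ‘*’ and ‘/’ after the comment are treated as ordinary characters in both cases, since they do not occur together in the proper order.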