SXEmacs Internals Manual: Modules for Other Aspects of the Lisp Interpreter and Object System

10.8 Modules for Other Aspects of the Lisp Interpreter and Object System

elhash.c
elhash.h
hash.c
hash.h

These files provide two implementations of hash tables. hash.c and hash.h provide a generic C implementation of hash tables that can stand independently of SXEmacs. elhash.c and elhash.h provide a separate implementation that can store only Lisp objects, knows about Lispy things like garbage collection, and implements the hash-table Lisp object type.

specifier.c
specifier.h

This module implements the specifier Lisp object type. Specifiers are used primarily for displayable properties, and allow values that are specific to a particular buffer, window, frame, device, or device class, in addition to a default value. They are used, for example, to control the height of the horizontal scrollbar or the appearance of the default, bold, or other faces. A specifier object consists of a number of specifications, each of which maps from a buffer, window, etc. to a value. The function specifier-instance looks up a value given a window (from which a buffer, frame, and device can be derived).
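The lookup described above can be pictured with a small standalone sketch. It does not use the data structures of specifier.c; every name in it (sketch_specifier_instance, struct spec, struct domain, the LOC_ constants) is hypothetical, values are plain ints, device classes are left out, and the search order (most specific locale first, in the order the text lists them) is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical locale kinds, tried in this order during a lookup.
   The order is an assumption of this sketch. */
enum locale_kind {
    LOC_BUFFER, LOC_WINDOW, LOC_FRAME, LOC_DEVICE, LOC_GLOBAL, LOC_KINDS
};

/* One specification: "in this locale, the property has this value". */
struct spec {
    enum locale_kind kind;
    int locale_id;          /* which buffer, window, ...; unused for LOC_GLOBAL */
    int value;
};

/* The domain of a lookup: a window plus the objects derived from it. */
struct domain {
    int ids[LOC_KINDS];     /* ids[LOC_BUFFER], ids[LOC_WINDOW], ... */
};

/* Return the value of the first specification that matches the domain,
   trying locale kinds in enum order, or FALLBACK if none matches. */
int sketch_specifier_instance(const struct spec *specs, size_t n,
                              const struct domain *dom, int fallback)
{
    for (int kind = LOC_BUFFER; kind < LOC_KINDS; kind++) {
        for (size_t i = 0; i < n; i++) {
            if (specs[i].kind != (enum locale_kind)kind)
                continue;
            if (kind == LOC_GLOBAL || specs[i].locale_id == dom->ids[kind])
                return specs[i].value;
        }
    }
    return fallback;
}
```

With a global specification and a buffer-local one, a lookup whose domain includes that buffer yields the buffer-local value, and any other lookup falls through to the global one.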

chartab.c
chartab.h
casetab.c

chartab.c and chartab.h implement the char table Lisp object type, which maps from characters or certain sorts of character ranges to Lisp objects. The implementation of this object type is optimized for the internal representation of characters. Char tables come in different types, which affect the allowed object types to which a character can be mapped and also dictate certain other properties of the char table.

casetab.c implements one sort of char table, the case table, which maps characters to other characters of possibly different case. These are used by SXEmacs to implement case-changing primitives and to do case-insensitive searching.
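As a rough illustration of how a case table supports both case conversion and case-insensitive comparison, here is a minimal sketch. It is not the casetab.c implementation: it covers only single-byte characters, assumes an ASCII-compatible character set, and all names (struct sketch_case_table, sketch_case_table_init, sketch_equal_ignoring_case) are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* A toy case table: maps each single-byte character to its lowercase form. */
struct sketch_case_table {
    unsigned char down[256];
};

/* Fill the table with the identity mapping, then map 'A'..'Z' to 'a'..'z'.
   Assumes the letters are contiguous, as they are in ASCII. */
void sketch_case_table_init(struct sketch_case_table *t)
{
    for (int c = 0; c < 256; c++)
        t->down[c] = (unsigned char)c;
    for (int c = 'A'; c <= 'Z'; c++)
        t->down[c] = (unsigned char)(c - 'A' + 'a');
}

/* Compare two strings after passing every character through the table.
   Returns 1 if they are equal ignoring case, 0 otherwise. */
int sketch_equal_ignoring_case(const struct sketch_case_table *t,
                               const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (t->down[(unsigned char)*a] != t->down[(unsigned char)*b])
            return 0;
        a++;
        b++;
    }
    /* Equal only if both strings ended at the same time. */
    return *a == *b;
}
```

Because the comparison only ever consults the table, swapping in a different table changes what counts as "the same letter" without touching the comparison code.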

syntax.c
syntax.h

This module implements syntax tables, another sort of char table that maps characters into syntax classes that define the syntax of these characters (e.g. a parenthesis belongs to a class of ‘open’ characters that have corresponding ‘close’ characters and can be nested). This module also implements the Lisp scanner, a set of primitives for scanning over text based on syntax tables. This is used, for example, to find the matching parenthesis in a command such as forward-sexp, and by font-lock.c to locate quoted strings, comments, etc.

Syntax codes are implemented as bitfields in an int. Bits 0-6 contain the syntax code itself, bit 7 is a special prefix flag used for Lisp, and bits 16-23 contain comment syntax flags. From the Lisp programmer’s point of view, there are 11 flags: 2 styles X 2 characters X {start, end} flags for two-character comment delimiters, 2 style flags for one-character comment delimiters, and the prefix flag.
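The bit layout just described can be sketched with a few helpers. The bit positions come from the paragraph above; the macro and function names are hypothetical and are not the ones used in syntax.h, and the sketch does not assign meanings to the individual comment flag bits.

```c
#include <assert.h>

/* Bit layout taken from the text; names are invented for this sketch. */
#define SKETCH_CLASS_MASK     0x7F       /* bits 0-6: the syntax code itself */
#define SKETCH_PREFIX_FLAG    (1 << 7)   /* bit 7: prefix flag used for Lisp */
#define SKETCH_COMMENT_SHIFT  16         /* bits 16-23: comment syntax flags */
#define SKETCH_COMMENT_MASK   0xFF

/* Pack a class, a prefix flag and a set of comment flags into one int. */
int sketch_make_code(int syntax_class, int prefix, int comment_flags)
{
    return (syntax_class & SKETCH_CLASS_MASK)
         | (prefix ? SKETCH_PREFIX_FLAG : 0)
         | ((comment_flags & SKETCH_COMMENT_MASK) << SKETCH_COMMENT_SHIFT);
}

/* Extract the syntax class stored in bits 0-6. */
int sketch_syntax_class(int code)
{
    return code & SKETCH_CLASS_MASK;
}

/* Return nonzero if the prefix flag (bit 7) is set. */
int sketch_has_prefix(int code)
{
    return (code & SKETCH_PREFIX_FLAG) != 0;
}

/* Extract the comment flags stored in bits 16-23. */
int sketch_comment_flags(int code)
{
    return (code >> SKETCH_COMMENT_SHIFT) & SKETCH_COMMENT_MASK;
}
```

Keeping all three fields in a single int means the scanner can fetch everything it needs about a character with one table lookup and a few mask operations.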

Internally, however, the characters used in multi-character delimiters will have non-comment-character syntax classes (e.g., the ‘/’ in C’s ‘/*’ comment-start delimiter has “punctuation” (here meaning “operator-like”) class in C modes). Thus a mixed comment style, such as C++’s ‘//’ to end of line, is represented by giving ‘/’ the “punctuation” class and the “style b first character of start sequence” and “style b second character of start sequence” flags. The fact that the class is not a comment-start class allows the syntax scanner to recognize that this is a multi-character delimiter. The ‘newline’ character is given (single-character) “comment-end” class and the “style b first character of end sequence” flag. The “comment-end” class allows the scanner to determine that no second character is needed to terminate the comment.

There used to be a syntax class ‘Sextword’. A character of ‘Sextword’ class is a word-constituent but a word boundary may exist between two such characters. Ken’ichi HANDA <handa@etl.go.jp> explains the purpose of the Sextword syntax category:

Japanese words are not separated by spaces, which makes finding word boundaries very difficult. Theoretically it’s impossible without using natural language processing techniques. But, by defining pseudo-words as below (much simplified for letting you understand it easily) for Japanese, we can have a convenient forward-word function for Japanese.
A Japanese word is a sequence of characters that consists of
zero or more Kanji characters followed by zero or more
Hiragana characters.
Then, the problem is that now we can’t say that a sequence of word-constituents makes up a word. For instance, both Hiragana "A" and Kanji "KAN" are word-constituents but the sequence of these two letters can’t be a single word.

So, we introduced Sextword for Japanese letters.
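The simplified pseudo-word rule quoted above (a run of Kanji followed by a run of Hiragana) can be sketched as a tiny forward-word scanner. This is only an illustration of the quoted rule, not code from any Emacs variant: the text is modelled as an array of character classes, and the names sketch_class and sketch_forward_word are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Character classes for the sketch; real text would be classified
   through a syntax or category table. */
enum sketch_class { SK_OTHER, SK_KANJI, SK_HIRAGANA };

/* Starting at POS, skip non-word characters, then consume one pseudo-word:
   zero or more Kanji followed by zero or more Hiragana.  Returns the index
   just past the pseudo-word, or N if the end of the text is reached. */
size_t sketch_forward_word(const enum sketch_class *text, size_t n, size_t pos)
{
    while (pos < n && text[pos] == SK_OTHER)
        pos++;
    while (pos < n && text[pos] == SK_KANJI)
        pos++;
    while (pos < n && text[pos] == SK_HIRAGANA)
        pos++;
    return pos;
}
```

A Hiragana character followed by a Kanji character therefore ends one pseudo-word and starts another, even though both are word-constituents; this is the boundary that an ordinary word-constituent class cannot express.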

There seems to have been some controversy about this category, as it has been removed, re-added, and removed again. Currently neither GNU Emacs (21.3.99) nor XEmacs (21.5.17) seems to use it.

casefiddle.c

This module implements various Lisp primitives for upcasing, downcasing and capitalizing strings or regions of buffers.

rangetab.c

This module implements the range table Lisp object type, which provides for a mapping from ranges of integers to arbitrary Lisp objects.
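One simple way to picture such a mapping is a sorted array of non-overlapping ranges searched by bisection. The sketch below is an assumption-laden illustration rather than the layout used in rangetab.c: the ranges are closed at both ends, values are plain C strings instead of Lisp objects, and the names struct sketch_range and sketch_range_lookup are invented.

```c
#include <assert.h>
#include <stddef.h>

/* One entry: every integer in [first, last] maps to VALUE. */
struct sketch_range {
    long first;
    long last;
    const char *value;
};

/* Look KEY up in TAB, which must be sorted by FIRST and must not contain
   overlapping ranges.  Returns the mapped value, or NULL if KEY falls in
   no range. */
const char *sketch_range_lookup(const struct sketch_range *tab, size_t n,
                                long key)
{
    size_t lo = 0, hi = n;              /* search window is [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (key < tab[mid].first)
            hi = mid;
        else if (key > tab[mid].last)
            lo = mid + 1;
        else
            return tab[mid].value;
    }
    return NULL;
}
```

Storing whole ranges rather than individual integers keeps the table small when long runs of integers share a value, and the sorted order makes each lookup logarithmic in the number of ranges.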

opaque.c
opaque.h

This module implements the opaque Lisp object type, an internal-only Lisp object that encapsulates an arbitrary block of memory so that it can be managed by the Lisp allocation system. To create an opaque object, you call make_opaque(), passing a pointer to a block of memory. An object is created that is big enough to hold the memory, which is copied into the object’s storage. The object will then stick around as long as you keep pointers to it, after which it will be automatically reclaimed.

Opaque objects can also have an arbitrary mark method associated with them, in case the block of memory contains other Lisp objects that need to be marked for garbage-collection purposes. (If you need other object methods, such as a finalize method, you should just go ahead and create a new Lisp object type—it’s not hard.)
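The copy-on-creation behaviour described for make_opaque() can be illustrated with a standalone sketch. This is not the signature or implementation from opaque.c: the sketch uses malloc() and free() in place of the Lisp allocator and garbage collector, has no mark method, and the names struct sketch_opaque and sketch_make_opaque are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A size header followed by the copied bytes. */
struct sketch_opaque {
    size_t size;
    unsigned char data[];   /* C99 flexible array member */
};

/* Allocate an object big enough for SIZE bytes and copy them in.
   The caller releases it with free(); the real object would instead be
   reclaimed by the garbage collector.  Returns NULL on allocation failure. */
struct sketch_opaque *sketch_make_opaque(const void *src, size_t size)
{
    struct sketch_opaque *op = malloc(sizeof *op + size);
    if (op == NULL)
        return NULL;
    op->size = size;
    if (size > 0)
        memcpy(op->data, src, size);
    return op;
}
```

Because the bytes are copied, later changes to the caller's original block do not affect the contents of the object.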

abbrev.c

This module provides a few primitives for doing dynamic abbreviation expansion. In SXEmacs, most of the code for this has been moved into Lisp. Some C code remains for speed and because the primitive self-insert-command (which is executed for all self-inserting characters) hooks into the abbrev mechanism. (self-insert-command is itself in C only for speed.)

doc.c

This module provides primitives for retrieving the documentation strings of functions and variables. These documentation strings contain certain special markers that get dynamically expanded (e.g. a reverse-lookup is performed on some named functions to retrieve their current key bindings). Some documentation strings (in particular, for the built-in primitives and pre-loaded Lisp functions) are stored externally in a file DOC in the lib-src/ directory and need to be fetched from that file. (Part of the build stage involves building this file, and another part involves constructing an index for this file and embedding it into the executable, so that the functions in doc.c do not have to search the entire DOC file to find the appropriate documentation string.)

md5.c

This module provides a Lisp primitive that implements the MD5 secure hashing scheme, used to create a 128-bit hash value of a string of data such that the data cannot be derived from the hash value. This is used for various security applications on the Internet.