Previous: , Up: Building SXEmacs and Object Allocation   [Contents][Index]


B.2 Garbage Collection

When a program creates a list or the user defines a new function (such as by loading a library), that data is placed in normal storage. If normal storage runs low, then SXEmacs asks the operating system to allocate more memory in blocks of 2k bytes. Each block is used for one type of Lisp object, so symbols, cons cells, markers, etc., are segregated in distinct blocks in memory. Vectors, long strings, buffers and certain other editing types, which are fairly large, are allocated in individual blocks, one per object, while small strings are packed into blocks of 8k bytes.

More correctly, a string is allocated in two sections: a fixed size chunk containing the length, list of extents, etc.; and a chunk containing the actual characters in the string. It is this latter chunk that is either allocated individually or packed into 8k blocks. The fixed size chunk is packed into 2k blocks, as for conses, markers, etc.

It is quite common to use some storage for a while, then release it by (for example) killing a buffer or deleting the last pointer to an object. SXEmacs provides a garbage collector to reclaim this abandoned storage. This name is traditional, but “garbage recycler” might be a more intuitive metaphor for this facility.

The garbage collector operates by finding and marking all Lisp objects that are still accessible to Lisp programs. To begin with, it assumes all the symbols, their values and associated function definitions, and any data presently on the stack, are accessible. Any objects that can be reached indirectly through other accessible objects are also accessible.

When marking is finished, all objects still unmarked are garbage. No matter what the Lisp program or the user does, it is impossible to refer to them, since there is no longer a way to reach them. Their space might as well be reused, since no one will miss them. The second (“sweep”) phase of the garbage collector arranges to reuse them.

The sweep phase puts unused cons cells onto a free list for future allocation; likewise for symbols, markers, extents, events, floats, compiled-function objects, and the fixed-size portion of strings. It compacts the accessible small string-chars chunks so they occupy fewer 8k blocks; then it frees the other 8k blocks. Vectors, buffers, windows, and other large objects are individually allocated and freed using malloc and free.

Common Lisp note: unlike other Lisps, SXEmacs Lisp does not call the garbage collector when the free list is empty. Instead, it simply requests the operating system to allocate more storage, and processing continues until gc-cons-threshold bytes have been used.

This means that you can make sure that the garbage collector will not run during a certain portion of a Lisp program by calling the garbage collector explicitly just before it provided that portion of the program does not use so much space as to force a second garbage collection.

Command: garbage-collect

This command runs a garbage collection, and returns information on the amount of space in use.

Garbage collection can also occur spontaneously if you use more than gc-cons-threshold bytes of Lisp data since the previous garbage collection.

garbage-collect returns a list containing the following information:

((used-conses . free-conses)
 (used-syms . free-syms)
 (used-markers . free-markers)
 used-string-chars
 used-vector-slots
 (plist))

⇒ ((73362 . 8325) (13718 . 164)
(5089 . 5098) 949121 118677
(conses-used 73362 conses-free 8329 cons-storage 658168
symbols-used 13718 symbols-free 164 symbol-storage 335216
bit-vectors-used 0 bit-vectors-total-length 0
bit-vector-storage 0 vectors-used 7882
vectors-total-length 118677 vector-storage 537764
compiled-functions-used 1336 compiled-functions-free 37
compiled-function-storage 44440 short-strings-used 28829
long-strings-used 2 strings-free 7722
short-strings-total-length 916657 short-string-storage 1179648
long-strings-total-length 32464 string-header-storage 441504
floats-used 3 floats-free 43 float-storage 2044 markers-used 5089
markers-free 5098 marker-storage 245280 events-used 103
events-free 835 event-storage 110656 extents-used 10519
extents-free 2718 extent-storage 372736
extent-auxiliarys-used 111 extent-auxiliarys-freed 3
extent-auxiliary-storage 4440 window-configurations-used 39
window-configurations-on-free-list 5
window-configurations-freed 10 window-configuration-storage 9492
popup-datas-used 3 popup-data-storage 72 toolbar-buttons-used 62
toolbar-button-storage 4960 toolbar-datas-used 12
toolbar-data-storage 240 symbol-value-buffer-locals-used 182
symbol-value-buffer-local-storage 5824
symbol-value-lisp-magics-used 22
symbol-value-lisp-magic-storage 1496
symbol-value-varaliases-used 43
symbol-value-varalias-storage 1032 opaque-lists-used 2
opaque-list-storage 48 color-instances-used 12
color-instance-storage 288 font-instances-used 5
font-instance-storage 180 opaques-used 11 opaque-storage 312
range-tables-used 1 range-table-storage 16 faces-used 34
face-storage 2584 glyphs-used 124 glyph-storage 4464
specifiers-used 775 specifier-storage 43869 weak-lists-used 786
weak-list-storage 18864 char-tables-used 40
char-table-storage 41920 buffers-used 25 buffer-storage 7000
extent-infos-used 457 extent-infos-freed 73
extent-info-storage 9140 keymaps-used 275 keymap-storage 12100
consoles-used 4 console-storage 384 command-builders-used 2
command-builder-storage 120 devices-used 2 device-storage 344
frames-used 3 frame-storage 624 image-instances-used 47
image-instance-storage 3008 windows-used 27 windows-freed 2
window-storage 9180 lcrecord-lists-used 15
lcrecord-list-storage 360 hash-tables-used 631
hash-table-storage 25240 streams-used 1 streams-on-free-list 3
streams-freed 12 stream-storage 91))

Here is a table explaining each element:

used-conses

The number of cons cells in use.

free-conses

The number of cons cells for which space has been obtained from the operating system, but that are not currently being used.

used-syms

The number of symbols in use.

free-syms

The number of symbols for which space has been obtained from the operating system, but that are not currently being used.

used-markers

The number of markers in use.

free-markers

The number of markers for which space has been obtained from the operating system, but that are not currently being used.

used-string-chars

The total size of all strings, in characters.

used-vector-slots

The total number of elements of existing vectors.

plist

A list of alternating keyword/value pairs providing more detailed information. (As you can see above, quite a lot of information is provided.)

User Option: gc-cons-threshold

The value of this variable is the number of bytes of storage that must be allocated for Lisp objects after one garbage collection in order to trigger another garbage collection.

A cons cell counts as eight bytes, a string as one byte per character plus a few bytes of overhead, and so on; space allocated to the contents of buffers does not count. Note that the subsequent garbage collection does not happen immediately when the threshold is exhausted, but only the next time the Lisp evaluator is called.

The initial threshold value is 500,000. If you specify a larger value, garbage collection will happen less often. This reduces the amount of time spent garbage collecting, but increases total memory use. You may want to do this when running a program that creates lots of Lisp data.

You can make collections more frequent by specifying a smaller value, down to 10,000. A value less than 10,000 will remain in effect only until the subsequent garbage collection, at which time garbage-collect will set the threshold back to 10,000.

Note: This does not apply if SXEmacs was configured with ‘--debug’. Therefore, be careful when setting gc-cons-threshold in that case!

Variable: pre-gc-hook

This is a normal hook to be run just before each garbage collection. Interrupts, garbage collection, and errors are inhibited while this hook runs, so be extremely careful in what you add here.

In particular, avoid consing, and do not interact with the user!

Variable: post-gc-hook

This is a normal hook to be run just after each garbage collection. Interrupts, garbage collection, and errors are inhibited while this hook runs, so be extremely careful in what you add here.

In particular, avoid consing, and do not interact with the user!

Variable: gc-message

This is a string to print to indicate that a garbage collection is in progress. This is printed in the echo area. If the selected frame is on a window system and gc-pointer-glyph specifies a value (i.e. a pointer image instance) in the domain of the selected frame, the mouse cursor will change instead of this message being printed.

Glyph: gc-pointer-glyph

This holds the pointer glyph used to indicate that a garbage collection is in progress. If the selected window is on a window system and this glyph specifies a value (i.e. a pointer image instance) in the domain of the selected window, the cursor will be changed as specified during garbage collection. Otherwise, a message will be printed in the echo area, as controlled by gc-message. See Glyphs.

If SXEmacs was configured with ‘--debug’, you can set the following two variables to get direct information about all the allocation that is happening in a segment of Lisp code.

Variable: debug-allocation

If non-zero, print out information to stderr about all objects allocated.

Variable: debug-allocation-backtrace

Length (in stack frames) of short backtrace printed out by debug-allocation.


Previous: , Up: Building SXEmacs and Object Allocation   [Contents][Index]