11. Lists

A list represents a sequence of zero or more elements (which may be any Lisp objects). The important difference between lists and vectors is that two or more lists can share part of their structure; in addition, you can insert or delete elements in a list without copying the whole list.


11.1 Lists and Cons Cells

Lists in Lisp are not a primitive data type; they are built up from cons cells. A cons cell is a data object that represents an ordered pair. It records two Lisp objects, one labeled as the CAR, and the other labeled as the CDR. These names are traditional; see Cons Cell and List Types. CDR is pronounced "could-er."

A list is a series of cons cells chained together, one cons cell per element of the list. By convention, the CARs of the cons cells are the elements of the list, and the CDRs are used to chain the list: the CDR of each cons cell is the following cons cell. The CDR of the last cons cell is nil. This asymmetry between the CAR and the CDR is entirely a matter of convention; at the level of cons cells, the CAR and CDR slots have the same characteristics.

Because most cons cells are used as part of lists, the phrase list structure has come to mean any structure made out of cons cells.

The symbol nil is considered a list as well as a symbol; it is the list with no elements. For convenience, the symbol nil is considered to have nil as its CDR (and also as its CAR).

The CDR of any nonempty list l is a list containing all the elements of l except the first.
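As an informal illustration of this structure, the following sketch walks a list by following CDRs until it reaches nil, counting one cons cell per element. The name demo-count-cells is hypothetical, chosen only for this example; the built-in function length does the same job for true lists.

```elisp
(defun demo-count-cells (list)
  "Return the number of cons cells in the chain starting at LIST."
  (let ((count 0))
    ;; Each cons cell holds one element in its CAR; its CDR is the
    ;; rest of the list, and the CDR of the last cell is nil.
    (while (consp list)
      (setq count (1+ count)
            list (cdr list)))
    count))

(demo-count-cells '(a b c))
     ;; ⇒ 3
(demo-count-cells nil)        ; The empty list has no cons cells.
     ;; ⇒ 0
(cdr '(a b c))                ; All the elements except the first.
     ;; ⇒ (b c)
```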


11.2 Lists as Linked Pairs of Boxes

A cons cell can be illustrated as a pair of boxes. The first box represents the CAR and the second box represents the CDR. Here is an illustration of the two-element list, (tulip lily), made from two cons cells:

 
 ---------------         ---------------
| car   | cdr   |       | car   | cdr   |
| tulip |   o---------->| lily  |  nil  |
|       |       |       |       |       |
 ---------------         ---------------

Each pair of boxes represents a cons cell. Each box "refers to", "points to" or "contains" a Lisp object. (These terms are synonymous.) The first box, which is the CAR of the first cons cell, contains the symbol tulip. The arrow from the CDR of the first cons cell to the second cons cell indicates that the CDR of the first cons cell points to the second cons cell.

The same list can be illustrated in a different sort of box notation like this:

 
    ___ ___      ___ ___
   |___|___|--> |___|___|--> nil
     |            |
     |            |
      --> tulip    --> lily

Here is a more complex illustration, showing the three-element list, ((pine needles) oak maple), the first element of which is a two-element list:

 
    ___ ___      ___ ___      ___ ___
   |___|___|--> |___|___|--> |___|___|--> nil
     |            |            |
     |            |            |
     |             --> oak      --> maple
     |
     |     ___ ___      ___ ___
      --> |___|___|--> |___|___|--> nil
            |            |
            |            |
             --> pine     --> needles

The same list represented in the first box notation looks like this:

 
 --------------       --------------       --------------
| car   | cdr  |     | car   | cdr  |     | car   | cdr  |
|   o   |   o------->| oak   |   o------->| maple |  nil |
|   |   |      |     |       |      |     |       |      |
 -- | ---------       --------------       --------------
    |
    |
    |        --------------       ----------------
    |       | car   | cdr  |     | car     | cdr  |
     ------>| pine  |   o------->| needles |  nil |
            |       |      |     |         |      |
             --------------       ----------------

See section Cons Cell and List Types, for the read and print syntax of cons cells and lists, and for more "box and arrow" illustrations of lists.


11.3 Predicates on Lists

The following predicates test whether a Lisp object is an atom, is a cons cell or is a list, or whether it is the distinguished object nil. (Many of these predicates can be defined in terms of the others, but they are used so often that it is worth having all of them.)

Function: consp object

This function returns t if object is a cons cell, nil otherwise. nil is not a cons cell, although it is a list.

Function: atom object

This function returns t if object is an atom, nil otherwise. All objects except cons cells are atoms. The symbol nil is an atom and is also a list; it is the only Lisp object that is both.

 
(atom object) ≡ (not (consp object))

Function: listp object

This function returns t if object is a cons cell or nil. Otherwise, it returns nil.

 
(listp '(1))
     ⇒ t
(listp '())
     ⇒ t

Function: nlistp object

This function is the opposite of listp: it returns t if object is not a list. Otherwise, it returns nil.

 
(listp object) ≡ (not (nlistp object))

Function: null object

This function returns t if object is nil, and returns nil otherwise. This function is identical to not, but as a matter of clarity we use null when object is considered a list and not when it is considered a truth value (see not in Constructs for Combining Conditions).

 
(null '(1))
     ⇒ nil
(null '())
     ⇒ t
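To compare these predicates side by side, here is a small sketch that applies four of them to the same object and collects the results. The helper name demo-classify is hypothetical and exists only for this illustration.

```elisp
(defun demo-classify (object)
  "Return a list of the results of consp, atom, listp and null on OBJECT."
  (list (consp object) (atom object) (listp object) (null object)))

;; Results are in the order: consp atom listp null
(demo-classify '(1 2))
     ;; ⇒ (t nil t nil)
(demo-classify '(1 . 2))      ; A dotted pair is a cons cell too.
     ;; ⇒ (t nil t nil)
(demo-classify nil)           ; nil is both an atom and a list.
     ;; ⇒ (nil t t t)
(demo-classify 'tulip)
     ;; ⇒ (nil t nil nil)
```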

11.4 Accessing Elements of Lists

Function: car cons-cell

This function returns the value pointed to by the first pointer of the cons cell cons-cell. Expressed another way, this function returns the CAR of cons-cell.

As a special case, if cons-cell is nil, then car is defined to return nil; therefore, any list is a valid argument for car. An error is signaled if the argument is not a cons cell or nil.

 
(car '(a b c))
     ⇒ a
(car '())
     ⇒ nil

Function: cdr cons-cell

This function returns the value pointed to by the second pointer of the cons cell cons-cell. Expressed another way, this function returns the CDR of cons-cell.

As a special case, if cons-cell is nil, then cdr is defined to return nil; therefore, any list is a valid argument for cdr. An error is signaled if the argument is not a cons cell or nil.

 
(cdr '(a b c))
     ⇒ (b c)
(cdr '())
     ⇒ nil

Function: car-safe object

This function lets you take the CAR of a cons cell while avoiding errors for other data types. It returns the CAR of object if object is a cons cell, nil otherwise. This is in contrast to car, which signals an error if object is not a list.

 
(car-safe object)
≡
(let ((x object))
  (if (consp x)
      (car x)
    nil))

Function: cdr-safe object

This function lets you take the CDR of a cons cell while avoiding errors for other data types. It returns the CDR of object if object is a cons cell, nil otherwise. This is in contrast to cdr, which signals an error if object is not a list.

 
(cdr-safe object)
≡
(let ((x object))
  (if (consp x)
      (cdr x)
    nil))

Function: nth n list

This function returns the nth element of list. Elements are numbered starting with zero, so the CAR of list is element number zero. If the length of list is n or less, the value is nil.

If n is negative, nth returns the first element of list.

 
(nth 2 '(1 2 3 4))
     ⇒ 3
(nth 10 '(1 2 3 4))
     ⇒ nil
(nth -3 '(1 2 3 4))
     ⇒ 1

(nth n x) ≡ (car (nthcdr n x))

Function: nthcdr n list

This function returns the nth CDR of list. In other words, it removes the first n links of list and returns what follows.

If n is zero or negative, nthcdr returns all of list. If the length of list is n or less, nthcdr returns nil.

 
(nthcdr 1 '(1 2 3 4))
     ⇒ (2 3 4)
(nthcdr 10 '(1 2 3 4))
     ⇒ nil
(nthcdr -3 '(1 2 3 4))
     ⇒ (1 2 3 4)
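The behavior described above can be sketched in Lisp with nothing but cdr and a loop. This is an illustration only, not the actual definition of nthcdr; the names demo-nthcdr and demo-nth are hypothetical.

```elisp
(defun demo-nthcdr (n list)
  "Return LIST with its first N links removed, in the manner of nthcdr."
  ;; Stop early when the list runs out, so the value is nil
  ;; whenever LIST has N or fewer elements.
  (while (and (> n 0) list)
    (setq n (1- n)
          list (cdr list)))
  list)

(defun demo-nth (n list)
  "Return element number N of LIST, counting from zero."
  (car (demo-nthcdr n list)))

(demo-nthcdr 1 '(1 2 3 4))
     ;; ⇒ (2 3 4)
(demo-nthcdr 10 '(1 2 3 4))
     ;; ⇒ nil
(demo-nthcdr -3 '(1 2 3 4))   ; A negative N removes nothing.
     ;; ⇒ (1 2 3 4)
(demo-nth 2 '(1 2 3 4))
     ;; ⇒ 3
```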

Many convenience functions are provided to make it easier for you to access particular elements in a nested list. All of these can be rewritten in terms of the functions just described.

Function: caar cons-cell
Function: cadr cons-cell
Function: cdar cons-cell
Function: cddr cons-cell
Function: caaar cons-cell
Function: caadr cons-cell
Function: cadar cons-cell
Function: caddr cons-cell
Function: cdaar cons-cell
Function: cdadr cons-cell
Function: cddar cons-cell
Function: cdddr cons-cell
Function: caaaar cons-cell
Function: caaadr cons-cell
Function: caadar cons-cell
Function: caaddr cons-cell
Function: cadaar cons-cell
Function: cadadr cons-cell
Function: caddar cons-cell
Function: cadddr cons-cell
Function: cdaaar cons-cell
Function: cdaadr cons-cell
Function: cdadar cons-cell
Function: cdaddr cons-cell
Function: cddaar cons-cell
Function: cddadr cons-cell
Function: cdddar cons-cell
Function: cddddr cons-cell

Each of these functions is equivalent to one or more applications of car and/or cdr. For example,

 
(cadr x)

is equivalent to

 
(car (cdr x))

and

 
(cdaddr x)

is equivalent to

 
(cdr (car (cdr (cdr x))))

That is to say, read the a's and d's from right to left and apply a car or cdr for each a or d found, respectively.
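Here is a short illustration of the two-letter forms applied to the nested list ((pine needles) oak maple) from the box diagrams earlier in this chapter. The variable name demo-tree is made up for this example, and the comment on each line gives the equivalent nesting of car and cdr.

```elisp
(setq demo-tree '((pine needles) oak maple))

(caar demo-tree)              ; (car (car demo-tree))
     ;; ⇒ pine
(cdar demo-tree)              ; (cdr (car demo-tree))
     ;; ⇒ (needles)
(cadr demo-tree)              ; (car (cdr demo-tree))
     ;; ⇒ oak
(cddr demo-tree)              ; (cdr (cdr demo-tree))
     ;; ⇒ (maple)

;; Reading the name caddr from right to left gives this expansion:
(car (cdr (cdr demo-tree)))
     ;; ⇒ maple
```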

Function: first list

This is equivalent to (nth 0 list), i.e. the first element of list. (Note that this is also equivalent to car.)

Function: second list

This is equivalent to (nth 1 list), i.e. the second element of list.

Function: third list
Function: fourth list
Function: fifth list
Function: sixth list
Function: seventh list
Function: eighth list
Function: ninth list
Function: tenth list

These are equivalent to (nth 2 list) through (nth 9 list) respectively, i.e. the third through tenth elements of list.


11.5 Building Cons Cells and Lists

Many functions build lists, as lists reside at the very heart of Lisp. cons is the fundamental list-building function; however, it is interesting to note that list is used more times in the source code for SXEmacs than cons.

Function: cons object1 object2

This function is the fundamental function used to build new list structure. It creates a new cons cell, making object1 the CAR, and object2 the CDR. It then returns the new cons cell. The arguments object1 and object2 may be any Lisp objects, but most often object2 is a list.

 
(cons 1 '(2))
     ⇒ (1 2)
(cons 1 '())
     ⇒ (1)
(cons 1 2)
     ⇒ (1 . 2)

cons is often used to add a single element to the front of a list. This is called consing the element onto the list. For example:

 
(setq list (cons newelt list))

Note that there is no conflict between the variable named list used in this example and the function named list described below; any symbol can serve both purposes.
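Consing onto the front is also a common way to build a list one element at a time. The following sketch (the function name demo-squares is made up for this example) counts downward so that the finished list comes out in ascending order:

```elisp
(defun demo-squares (n)
  "Return the list of the squares of the integers from 1 to N."
  (let ((result nil)
        (i n))
    ;; Start with the last element, because each new element
    ;; is consed onto the front of RESULT.
    (while (> i 0)
      (setq result (cons (* i i) result)
            i (1- i)))
    result))

(demo-squares 4)
     ;; ⇒ (1 4 9 16)
(demo-squares 0)
     ;; ⇒ nil
```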

Function: list &rest objects

This function creates a list with objects as its elements. The resulting list is always nil-terminated. If no objects are given, the empty list is returned.

 
(list 1 2 3 4 5)
     ⇒ (1 2 3 4 5)
(list 1 2 '(3 4 5) 'foo)
     ⇒ (1 2 (3 4 5) foo)
(list)
     ⇒ nil

Function: make-list length object

This function creates a list of length length, in which all the elements have the identical value object. Compare make-list with make-string (see section Creating Strings).

 
(make-list 3 'pigs)
     ⇒ (pigs pigs pigs)
(make-list 0 'pigs)
     ⇒ nil

Function: append &rest sequences

This function returns a list containing all the elements of sequences. The sequences may be lists, vectors, or strings, but the last one should be a list. All arguments except the last one are copied, so none of them are altered.

More generally, the final argument to append may be any Lisp object. The final argument is not copied or converted; it becomes the CDR of the last cons cell in the new list. If the final argument is itself a list, then its elements become in effect elements of the result list. If the final argument is not a list, the result is a "dotted list" since its final CDR is not nil as required in a true list.

See nconc in Functions that Rearrange Lists, for a way to join lists with no copying.

Here is an example of using append:

 
(setq trees '(pine oak))
     ⇒ (pine oak)
(setq more-trees (append '(maple birch) trees))
     ⇒ (maple birch pine oak)

trees
     ⇒ (pine oak)
more-trees
     ⇒ (maple birch pine oak)
(eq trees (cdr (cdr more-trees)))
     ⇒ t

You can see how append works by looking at a box diagram. The variable trees is set to the list (pine oak) and then the variable more-trees is set to the list (maple birch pine oak). However, the variable trees continues to refer to the original list:

 
more-trees                trees
|                           |
|     ___ ___      ___ ___   -> ___ ___      ___ ___
 --> |___|___|--> |___|___|--> |___|___|--> |___|___|--> nil
       |            |            |            |
       |            |            |            |
        --> maple    -->birch     --> pine     --> oak

An empty sequence contributes nothing to the value returned by append. As a consequence of this, a final nil argument forces a copy of the previous argument.

 
trees
     ⇒ (pine oak)
(setq wood (append trees ()))
     ⇒ (pine oak)
wood
     ⇒ (pine oak)
(eq wood trees)
     ⇒ nil

This once was the usual way to copy a list, before the function copy-sequence was invented. See section Sequences, Arrays, and Vectors.

With the help of apply, we can append all the lists in a list of lists:

 
(apply 'append '((a b c) nil (x y z) nil))
     ⇒ (a b c x y z)

If no sequences are given, nil is returned:

 
(append)
     ⇒ nil

Here are some examples where the final argument is not a list:

 
(append '(x y) 'z)
     ⇒ (x y . z)
(append '(x y) [z])
     ⇒ (x y . [z])

The second example shows that when the final argument is a sequence but not a list, the sequence's elements do not become elements of the resulting list. Instead, the sequence becomes the final CDR, like any other non-list final argument.

The append function also allows integers as arguments. It converts them to strings of digits, making up the decimal print representation of the integer, and then uses the strings instead of the original integers. Don't use this feature; we plan to eliminate it. If you already use this feature, change your programs now! The proper way to convert an integer to its decimal print representation is with format (see section Formatting Strings) or number-to-string (see section Conversion of Characters and Strings).

Function: reverse list

This function creates a new list whose elements are the elements of list, but in reverse order. The original argument list is not altered.

 
(setq x '(1 2 3 4))
     ⇒ (1 2 3 4)
(reverse x)
     ⇒ (4 3 2 1)
x
     ⇒ (1 2 3 4)

11.6 Modifying Existing List Structure

You can modify the CAR and CDR contents of a cons cell with the primitives setcar and setcdr.

Common Lisp note: Common Lisp uses functions rplaca and rplacd to alter list structure; they change structure the same way as setcar and setcdr, but the Common Lisp functions return the cons cell while setcar and setcdr return the new CAR or CDR.


11.6.1 Altering List Elements with setcar

Changing the CAR of a cons cell is done with setcar. When used on a list, setcar replaces one element of a list with a different element.

Function: setcar cons-cell object

This function stores object as the new CAR of cons-cell, replacing its previous CAR. It returns the value object. For example:

 
(setq x '(1 2))
     ⇒ (1 2)
(setcar x 4)
     ⇒ 4
x
     ⇒ (4 2)

When a cons cell is part of the shared structure of several lists, storing a new CAR into the cons changes one element of each of these lists. Here is an example:

 
;; Create two lists that are partly shared.
(setq x1 '(a b c))
     ⇒ (a b c)
(setq x2 (cons 'z (cdr x1)))
     ⇒ (z b c)

;; Replace the CAR of a shared link.
(setcar (cdr x1) 'foo)
     ⇒ foo
x1                           ; Both lists are changed.
     ⇒ (a foo c)
x2
     ⇒ (z foo c)

;; Replace the CAR of a link that is not shared.
(setcar x1 'baz)
     ⇒ baz
x1                           ; Only one list is changed.
     ⇒ (baz foo c)
x2
     ⇒ (z foo c)

Here is a graphical depiction of the shared structure of the two lists in the variables x1 and x2, showing why replacing b changes them both:

 
        ___ ___        ___ ___      ___ ___
x1---> |___|___|----> |___|___|--> |___|___|--> nil
         |        -->   |            |
         |       |      |            |
          --> a  |       --> b        --> c
                 |
       ___ ___   |
x2--> |___|___|--
        |
        |
         --> z

Here is an alternative form of box diagram, showing the same relationship:

 
x1:
 --------------       --------------       --------------
| car   | cdr  |     | car   | cdr  |     | car   | cdr  |
|   a   |   o------->|   b   |   o------->|   c   |  nil |
|       |      |  -->|       |      |     |       |      |
 --------------  |    --------------       --------------
                 |
x2:              |
 --------------  |
| car   | cdr  | |
|   z   |   o----
|       |      |
 --------------
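Combining setcar with nthcdr (see section Accessing Elements of Lists) gives one way to replace the element at a given position. Here is a minimal sketch; the name demo-set-nth is hypothetical, and the list is built with list rather than quoted so that no constant in the program is altered.

```elisp
(defun demo-set-nth (n list value)
  "Store VALUE as element number N of LIST, counting from zero.
Signal an error if LIST has N or fewer elements."
  (let ((cell (nthcdr n list)))
    (if (consp cell)
        (setcar cell value)
      (error "List too short: %S" list))))

(setq demo-x (list 'a 'b 'c))
(demo-set-nth 1 demo-x 'foo)  ; setcar returns the new CAR.
     ;; ⇒ foo
demo-x
     ;; ⇒ (a foo c)
```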

11.6.2 Altering the CDR of a List

The lowest-level primitive for modifying a CDR is setcdr:

Function: setcdr cons-cell object

This function stores object as the new CDR of cons-cell, replacing its previous CDR. It returns the value object.

Here is an example of replacing the CDR of a list with a different list. All but the first element of the list are removed in favor of a different sequence of elements. The first element is unchanged, because it resides in the CAR of the list, and is not reached via the CDR.

 
(setq x '(1 2 3))
     ⇒ (1 2 3)
(setcdr x '(4))
     ⇒ (4)
x
     ⇒ (1 4)

You can delete elements from the middle of a list by altering the CDRs of the cons cells in the list. For example, here we delete the second element, b, from the list (a b c), by changing the CDR of the first cell:

 
(setq x1 '(a b c))
     ⇒ (a b c)
(setcdr x1 (cdr (cdr x1)))
     ⇒ (c)
x1
     ⇒ (a c)

Here is the result in box notation:

 
                   --------------------
                  |                    |
 --------------   |   --------------   |    --------------
| car   | cdr  |  |  | car   | cdr  |   -->| car   | cdr  |
|   a   |   o-----   |   b   |   o-------->|   c   |  nil |
|       |      |     |       |      |      |       |      |
 --------------       --------------        --------------

The second cons cell, which previously held the element b, still exists and its CAR is still b, but it no longer forms part of this list.

It is equally easy to insert a new element by changing CDRs:

 
(setq x1 '(a b c))
     ⇒ (a b c)
(setcdr x1 (cons 'd (cdr x1)))
     ⇒ (d b c)
x1
     ⇒ (a d b c)

Here is this result in box notation:

 
 --------------        -------------       -------------
| car  | cdr   |      | car  | cdr  |     | car  | cdr  |
|   a  |   o   |   -->|   b  |   o------->|   c  |  nil |
|      |   |   |  |   |      |      |     |      |      |
 --------- | --   |    -------------       -------------
           |      |
     -----         --------
    |                      |
    |    ---------------   |
    |   | car   | cdr   |  |
     -->|   d   |   o------
        |       |       |
         ---------------
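The two setcdr idioms shown in this section can be wrapped up as small helpers. This is only a sketch; the names demo-insert-after and demo-delete-after are hypothetical, and the list is created with list so that the example does not alter a quoted constant.

```elisp
(defun demo-insert-after (cell object)
  "Insert OBJECT as a new element right after the cons cell CELL.
Return the newly created cons cell."
  (setcdr cell (cons object (cdr cell))))

(defun demo-delete-after (cell)
  "Unlink the cons cell that follows CELL, if there is one.
Return the new CDR of CELL."
  (setcdr cell (cdr (cdr cell))))

(setq demo-x1 (list 'a 'b 'c))
(demo-insert-after demo-x1 'd)
     ;; ⇒ (d b c)
demo-x1
     ;; ⇒ (a d b c)
(demo-delete-after demo-x1)
     ;; ⇒ (b c)
demo-x1
     ;; ⇒ (a b c)
```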

11.6.3 Functions that Rearrange Lists

Here are some functions that rearrange lists "destructively" by modifying the CDRs of their component cons cells. We call these functions "destructive" because they chew up the original lists passed to them as arguments, to produce a new list that is the returned value.

Function: nconc &rest lists

This function returns a list containing all the elements of lists. Unlike append (see section Building Cons Cells and Lists), the lists are not copied. Instead, the last CDR of each of the lists is changed to refer to the following list. The last of the lists is not altered. For example:

 
(setq x '(1 2 3))
     ⇒ (1 2 3)
(nconc x '(4 5))
     ⇒ (1 2 3 4 5)
x
     ⇒ (1 2 3 4 5)

Since the last argument of nconc is not itself modified, it is reasonable to use a constant list, such as '(4 5), as in the above example. For the same reason, the last argument need not be a list:

 
(setq x '(1 2 3))
     ⇒ (1 2 3)
(nconc x 'z)
     ⇒ (1 2 3 . z)
x
     ⇒ (1 2 3 . z)

A common pitfall is to use a quoted constant list as a non-last argument to nconc. If you do this, your program will change each time you run it! Here is what happens:

 
(defun add-foo (x)            ; We want this function to add
  (nconc '(foo) x))           ;   foo to the front of its arg.

(symbol-function 'add-foo)
     ⇒ (lambda (x) (nconc (quote (foo)) x))

(setq xx (add-foo '(1 2)))    ; It seems to work.
     ⇒ (foo 1 2)
(setq xy (add-foo '(3 4)))    ; What happened?
     ⇒ (foo 1 2 3 4)
(eq xx xy)
     ⇒ t

(symbol-function 'add-foo)
     ⇒ (lambda (x) (nconc (quote (foo 1 2 3 4)) x))
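One way to avoid this pitfall, sketched here under the assumption that the goal is simply to put foo at the front of the argument, is to create a fresh list on every call instead of altering a quoted constant. The function name add-foo-safely is hypothetical.

```elisp
(defun add-foo-safely (x)
  ;; (list 'foo) makes a new one-element list on each call, so nconc
  ;; alters that fresh list and never a constant in the function body.
  (nconc (list 'foo) x))

(setq xx (add-foo-safely (list 1 2)))
     ;; ⇒ (foo 1 2)
(setq xy (add-foo-safely (list 3 4)))
     ;; ⇒ (foo 3 4)
(eq xx xy)
     ;; ⇒ nil
```

In this particular case (cons 'foo x) builds the same structure and is the more usual way to write it.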

Function: nreverse list

This function reverses the order of the elements of list. Unlike reverse, nreverse alters its argument by reversing the CDRs in the cons cells forming the list. The cons cell that used to be the last one in list becomes the first cell of the value.

For example:

 
(setq x '(1 2 3 4))
     ⇒ (1 2 3 4)
x
     ⇒ (1 2 3 4)
(nreverse x)
     ⇒ (4 3 2 1)
;; The cell that was first is now last.
x
     ⇒ (1)

To avoid confusion, we usually store the result of nreverse back in the same variable which held the original list:

 
(setq x (nreverse x))
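A common use of nreverse is to finish a list that was accumulated by consing, since consing produces the elements in reverse order. Because the accumulated list is freshly made and nothing else refers to it, reversing it destructively is safe. Here is a sketch of the idiom; the function name demo-keep-symbols is hypothetical.

```elisp
(defun demo-keep-symbols (list)
  "Return a new list of the elements of LIST that are symbols, in order."
  (let ((result nil))
    (while list
      (if (symbolp (car list))
          ;; Each match is consed onto the front, so RESULT
          ;; is built in reverse order.
          (setq result (cons (car list) result)))
      (setq list (cdr list)))
    ;; Put the elements back into their original order.
    (nreverse result)))

(demo-keep-symbols '(a 1 b "two" c))
     ;; ⇒ (a b c)
```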

Here is the nreverse of our favorite example, (a b c), presented graphically:

 
Original list head:                       Reversed list:
 -------------        -------------        ------------
| car  | cdr  |      | car  | cdr  |      | car | cdr  |
|   a  |  nil |<--   |   b  |   o  |<--   |   c |   o  |
|      |      |   |  |      |   |  |   |  |     |   |  |
 -------------    |   --------- | -    |   -------- | -
                  |             |      |            |
                   -------------        ------------

Function: sort list predicate

This function sorts list stably, though destructively, and returns the sorted list. It compares elements using predicate. A stable sort is one in which elements with equal sort keys maintain their relative order before and after the sort. Stability is important when successive sorts are used to order elements according to different criteria.

The argument predicate must be a function that accepts two arguments. It is called with two elements of list. To get an increasing order sort, the predicate should return t if the first element is "less than" the second, or nil if not.

The destructive aspect of sort is that it rearranges the cons cells forming list by changing CDRs. A nondestructive sort function would create new cons cells to store the elements in their sorted order. If you wish to make a sorted copy without destroying the original, copy it first with copy-sequence and then sort.

Sorting does not change the CARs of the cons cells in list; the cons cell that originally contained the element a in list still has a in its CAR after sorting, but it now appears in a different position in the list due to the change of CDRs. For example:

 
(setq nums '(1 3 2 6 5 4 0))
     ⇒ (1 3 2 6 5 4 0)
(sort nums '<)
     ⇒ (0 1 2 3 4 5 6)
nums
     ⇒ (1 2 3 4 5 6)

Note that the list in nums no longer contains 0. The variable nums still refers to the same cons cell as before, the one whose CAR is 1, but that cell is no longer the first one in the list. Don't assume that a variable that formerly held the argument now holds the entire sorted list! Instead, save the result of sort and use that. Most often we store the result back into the variable that held the original list:

 
(setq nums (sort nums '<))
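Here is a sketch of two points made above: sorting a copy so that the original list survives, and relying on stability when the predicate looks at only part of each element. The variable names demo-nums and demo-sorted are made up for this example.

```elisp
(setq demo-nums (list 1 3 2 6 5 4 0))

;; Sort a copy; the cons cells of the original list are left alone.
(setq demo-sorted (sort (copy-sequence demo-nums) '<))
     ;; ⇒ (0 1 2 3 4 5 6)
demo-nums
     ;; ⇒ (1 3 2 6 5 4 0)

;; This predicate compares only the CDR of each element.  Because
;; the sort is stable, elements with equal CDRs keep their order.
(sort (list '(a . 2) '(b . 1) '(c . 2) '(d . 1))
      (lambda (x y) (< (cdr x) (cdr y))))
     ;; ⇒ ((b . 1) (d . 1) (a . 2) (c . 2))
```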

See section Sorting Text, for more functions that perform sorting. See documentation in Access to Documentation Strings, for a useful example of sort.


11.7 Using Lists as Sets

A list can represent an unordered mathematical set--simply consider a value an element of a set if it appears in the list, and ignore the order of the list. To form the union of two sets, use append (as long as you don't mind having duplicate elements). Other useful functions for sets include memq and delq, and their equal versions, member and delete.

Common Lisp note: Common Lisp has functions union (which avoids duplicate elements) and intersection for set operations, but SXEmacs Lisp does not have them. You can write them in Lisp if you wish.

Function: memq object list

This function tests to see whether object is a member of list. If it is, memq returns a list starting with the first occurrence of object. Otherwise, it returns nil. The letter `q' in memq says that it uses eq to compare object against the elements of the list. For example:

 
(memq 'b '(a b c b a))
     ⇒ (b c b a)
(memq '(2) '((1) (2)))    ; (2) and (2) are not eq.
     ⇒ nil

Function: delq object list

This function destructively removes all elements eq to object from list. The letter `q' in delq says that it uses eq to compare object against the elements of the list, like memq.

When delq deletes elements from the front of the list, it does so simply by advancing down the list and returning a sublist that starts after those elements:

 
(delq 'a '(a b c)) ≡ (cdr '(a b c))

When an element to be deleted appears in the middle of the list, removing it involves changing the CDRs (see section Altering the CDR of a List).

 
(setq sample-list '(a b c (4)))
     ⇒ (a b c (4))
(delq 'a sample-list)
     ⇒ (b c (4))
sample-list
     ⇒ (a b c (4))
(delq 'c sample-list)
     ⇒ (a b (4))
sample-list
     ⇒ (a b (4))

Note that (delq 'c sample-list) modifies sample-list to splice out the third element, but (delq 'a sample-list) does not splice anything--it just returns a shorter list. Don't assume that a variable which formerly held the argument list now has fewer elements, or that it still holds the original list! Instead, save the result of delq and use that. Most often we store the result back into the variable that held the original list:

 
(setq flowers (delq 'rose flowers))

In the following example, the (4) that delq attempts to match and the (4) in the sample-list are not eq:

 
(delq '(4) sample-list)
     ⇒ (a b (4))

The following two functions are like memq and delq but use equal rather than eq to compare elements. They are new in Emacs 19.

Function: member object list

The function member tests to see whether object is a member of list, comparing members with object using equal. If object is a member, member returns a list starting with its first occurrence in list. Otherwise, it returns nil.

Compare this with memq:

 
(member '(2) '((1) (2)))  ; (2) and (2) are equal.
     ⇒ ((2))
(memq '(2) '((1) (2)))    ; (2) and (2) are not eq.
     ⇒ nil
;; Two strings with the same contents are equal.
(member "foo" '("foo" "bar"))
     ⇒ ("foo" "bar")

Function: delete object list

This function destructively removes all elements equal to object from list. It is to delq as member is to memq: it uses equal to compare elements with object, like member; when it finds an element that matches, it removes the element just as delq would. For example:

 
(delete '(2) '((2) (1) (2)))
     ⇒ ((1))

Common Lisp note: The functions member and delete in SXEmacs Lisp are derived from Maclisp, not Common Lisp. The Common Lisp versions do not use equal to compare elements.

See also the function add-to-list, in How to Alter a Variable Value, for another way to add an element to a list stored in a variable.
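As the Common Lisp note at the start of this section says, union and intersection can be written in Lisp. Here is one possible sketch built on member. The names demo-union and demo-intersection are hypothetical, both functions compare elements with equal, and both assume that neither argument contains duplicates of its own.

```elisp
(defun demo-union (set1 set2)
  "Return a list of the elements that appear in SET1 or in SET2.
Neither argument is altered; the result shares structure with SET2."
  (let ((result set2)
        (tail (reverse set1)))
    (while tail
      ;; Add an element of SET1 only if it is not already present.
      (or (member (car tail) result)
          (setq result (cons (car tail) result)))
      (setq tail (cdr tail)))
    result))

(defun demo-intersection (set1 set2)
  "Return a new list of the elements that appear in both SET1 and SET2."
  (let ((result nil)
        (tail set1))
    (while tail
      (if (member (car tail) set2)
          (setq result (cons (car tail) result)))
      (setq tail (cdr tail)))
    (nreverse result)))

(demo-union '(a b c) '(b c d))
     ;; ⇒ (a b c d)
(demo-intersection '(a b c) '(b c d))
     ;; ⇒ (b c)
```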


11.8 Association Lists

An association list, or alist for short, records a mapping from keys to values. It is a list of cons cells called associations: the CAR of each cell is the key, and the CDR is the associated value.

Here is an example of an alist. The key pine is associated with the value cones; the key oak is associated with acorns; and the key maple is associated with seeds.

 
'((pine . cones)
  (oak . acorns)
  (maple . seeds))

The associated values in an alist may be any Lisp objects; so may the keys. For example, in the following alist, the symbol a is associated with the number 1, and the string "b" is associated with the list (2 3), which is the CDR of the alist element:

 
((a . 1) ("b" 2 3))

Sometimes it is better to design an alist to store the associated value in the CAR of the CDR of the element. Here is an example:

 
'((rose red) (lily white) (buttercup yellow))

Here we regard red as the value associated with rose. One advantage of this method is that you can store other related information--even a list of other items--in the CDR of the CDR. One disadvantage is that you cannot use rassq (see below) to find the element containing a given value. When neither of these considerations is important, the choice is a matter of taste, as long as you are consistent about it for any given alist.

Note that the same alist shown above could be regarded as having the associated value in the CDR of the element; the value associated with rose would be the list (red).

Association lists are often used to record information that you might otherwise keep on a stack, since new associations may be added easily to the front of the list. When searching an association list for an association with a given key, the first one found is returned, if there is more than one.
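Here is a short sketch of this stack-like use, with the function assq (described below) doing the lookups. The variable name demo-colors is made up for this example.

```elisp
(setq demo-colors '((rose . red) (lily . white)))

;; Add a newer association for rose at the front of the alist.
(setq demo-colors (cons '(rose . pink) demo-colors))
     ;; ⇒ ((rose . pink) (rose . red) (lily . white))

;; The search stops at the first match, so the newer one is found.
(cdr (assq 'rose demo-colors))
     ;; ⇒ pink

;; Dropping the front association brings the older one back.
(setq demo-colors (cdr demo-colors))
(cdr (assq 'rose demo-colors))
     ;; ⇒ red
```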

In SXEmacs Lisp, it is not an error if an element of an association list is not a cons cell. The alist search functions simply ignore such elements. Many other versions of Lisp signal errors in such cases.

Note that property lists are similar to association lists in several respects. A property list behaves like an association list in which each key can occur only once. See section Property Lists, for a comparison of property lists and association lists.

Function: assoc key alist

This function returns the first association for key in alist. It compares key against the alist elements using equal (see section Equality Predicates). It returns nil if no association in alist has a CAR equal to key. For example:

 
(setq trees '((pine . cones) (oak . acorns) (maple . seeds)))
     ⇒ ((pine . cones) (oak . acorns) (maple . seeds))
(assoc 'oak trees)
     ⇒ (oak . acorns)
(cdr (assoc 'oak trees))
     ⇒ acorns
(assoc 'birch trees)
     ⇒ nil

Here is another example, in which the keys and values are not symbols:

 
(setq needles-per-cluster
      '((2 "Austrian Pine" "Red Pine")
        (3 "Pitch Pine")
        (5 "White Pine")))

(cdr (assoc 3 needles-per-cluster))
     ⇒ ("Pitch Pine")
(cdr (assoc 2 needles-per-cluster))
     ⇒ ("Austrian Pine" "Red Pine")

Function: rassoc value alist

This function returns the first association with value value in alist. It returns nil if no association in alist has a CDR equal to value.

rassoc is like assoc except that it compares the CDR of each alist association instead of the CAR. You can think of this as "reverse assoc", finding the key for a given value.
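For example, here is a brief sketch in which the values are not symbols. The variable name demo-needles is made up for this illustration.

```elisp
(setq demo-needles '(("Pitch Pine" . 3) ("White Pine" . 5)))

(rassoc 5 demo-needles)
     ;; ⇒ ("White Pine" . 5)
(car (rassoc 3 demo-needles))
     ;; ⇒ "Pitch Pine"
(rassoc 4 demo-needles)
     ;; ⇒ nil

;; The CDR of the element (lily white) is the list (white),
;; which is equal to the list searched for here.
(rassoc '(white) '((rose red) (lily white)))
     ;; ⇒ (lily white)
```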

Function: assq key alist

This function is like assoc in that it returns the first association for key in alist, but it makes the comparison using eq instead of equal. assq returns nil if no association in alist has a CAR eq to key. This function is used more often than assoc, since eq is faster than equal and most alists use symbols as keys. See section Equality Predicates.

 
(setq trees '((pine . cones) (oak . acorns) (maple . seeds)))
     ⇒ ((pine . cones) (oak . acorns) (maple . seeds))
(assq 'pine trees)
     ⇒ (pine . cones)

On the other hand, assq is not usually useful in alists where the keys may not be symbols:

 
(setq leaves
      '(("simple leaves" . oak)
        ("compound leaves" . horsechestnut)))

(assq "simple leaves" leaves)
     ⇒ nil
(assoc "simple leaves" leaves)
     ⇒ ("simple leaves" . oak)

Function: rassq value alist

This function returns the first association with value value in alist. It returns nil if no association in alist has a CDR eq to value.

rassq is like assq except that it compares the CDR of each alist association instead of the CAR. You can think of this as "reverse assq", finding the key for a given value.

For example:

 
(setq trees '((pine . cones) (oak . acorns) (maple . seeds)))

(rassq 'acorns trees)
     ⇒ (oak . acorns)
(rassq 'spores trees)
     ⇒ nil

Note that rassq cannot search for a value stored in the CAR of the CDR of an element:

 
(setq colors '((rose red) (lily white) (buttercup yellow)))

(rassq 'white colors)
     ⇒ nil

In this case, the CDR of the association (lily white) is not the symbol white, but rather the list (white). This becomes clearer if the association is written in dotted pair notation:

 
(lily white) ≡ (lily . (white))

Function: remassoc key alist

This function deletes by side effect any associations with key key in alist--i.e. it removes any elements from alist whose car is equal to key. The modified alist is returned.

If the first member of alist has a car that is equal to key, there is no way to remove it by side effect; therefore, write (setq foo (remassoc key foo)) to be sure of changing the value of foo.
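
A minimal sketch of this idiom, assuming a hypothetical variable my-trees built with list and cons so that no quoted constant is modified destructively:

```elisp
;; Hypothetical alist, freshly consed so it is safe to modify.
(setq my-trees (list (cons 'pine 'cones)
                     (cons 'oak 'acorns)
                     (cons 'maple 'seeds)))

;; The association to delete is the first element, so only the
;; return value is guaranteed to reflect the change.
(setq my-trees (remassoc 'pine my-trees))

(assoc 'pine my-trees)
     ;; ⇒ nil
(assoc 'oak my-trees)
     ;; ⇒ (oak . acorns)
```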

Function: remassq key alist

This function deletes by side effect any associations with key key in alist--i.e. it removes any elements from alist whose car is eq to key. The modified alist is returned.

This function is exactly like remassoc, but comparisons between key and keys in alist are done using eq instead of equal.

Function: remrassoc value alist

This function deletes by side effect any associations with value value in alist--i.e. it removes any elements from alist whose cdr is equal to value. The modified alist is returned.

If the first member of alist has a cdr that is equal to value, there is no way to remove it by side effect; therefore, write (setq foo (remrassoc value foo)) to be sure of changing the value of foo.

remrassoc is like remassoc except that it compares the CDR of each alist association instead of the CAR. You can think of this as "reverse remassoc", removing an association based on its value instead of its key.

Function: remrassq value alist

This function deletes by side effect any associations with value value in alist--i.e. it removes any elements from alist whose cdr is eq to value. The modified alist is returned.

This function is exactly like remrassoc, but comparisons between value and values in alist are done using eq instead of equal.

Function: copy-alist alist

This function returns a two-level deep copy of alist: it creates a new copy of each association, so that you can alter the associations of the new alist without changing the old one.

 
(setq needles-per-cluster
      '((2 . ("Austrian Pine" "Red Pine"))
        (3 . ("Pitch Pine"))
        (5 . ("White Pine"))))
⇒
((2 "Austrian Pine" "Red Pine")
 (3 "Pitch Pine")
 (5 "White Pine"))

(setq copy (copy-alist needles-per-cluster))
⇒
((2 "Austrian Pine" "Red Pine")
 (3 "Pitch Pine")
 (5 "White Pine"))

(eq needles-per-cluster copy)
     ⇒ nil
(equal needles-per-cluster copy)
     ⇒ t
(eq (car needles-per-cluster) (car copy))
     ⇒ nil
(cdr (car (cdr needles-per-cluster)))
     ⇒ ("Pitch Pine")
(eq (cdr (car (cdr needles-per-cluster)))
    (cdr (car (cdr copy))))
     ⇒ t

This example shows how copy-alist makes it possible to change the associations of one copy without affecting the other:

 
(setcdr (assq 3 copy) '("Martian Vacuum Pine"))
(cdr (assq 3 needles-per-cluster))
     ⇒ ("Pitch Pine")

11.9 Property Lists

A property list (or plist) is another way of representing a mapping from keys to values. Instead of the list consisting of conses of a key and a value, the keys and values alternate as successive entries in the list. Thus, the association list

 
((a . 1) (b . 2) (c . 3))

has the equivalent property list form

 
(a 1 b 2 c 3)

Property lists are used to represent the properties associated with various sorts of objects, such as symbols, strings, frames, etc. By convention, property lists may be modified in place, while association lists generally are not.

Plists come in two varieties: normal plists, whose keys are compared with eq, and lax plists, whose keys are compared with equal.

Function: valid-plist-p plist

Given a plist, this function returns non-nil if its format is correct. If it returns nil, check-valid-plist will signal an error when given the plist; that means it's a malformed or circular plist or has non-symbols as keywords.

Function: check-valid-plist plist

Given a plist, this function signals an error if there is anything wrong with it. This means that it's a malformed or circular plist.
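
The following sketch shows how the predicate might be used as a guard. The variable my-props is hypothetical, and the results in the comments follow from the descriptions above (a plist with a trailing key and no value is assumed to count as malformed):

```elisp
;; Keys and values alternate, so a well-formed plist has even length.
(valid-plist-p '(a 1 b 2))      ; non-nil
;; A trailing key without a value makes the plist malformed.
(valid-plist-p '(a 1 b))        ; nil

;; Guard a hypothetical plist before reading from it.
(setq my-props '(color red size 3))
(if (valid-plist-p my-props)
    (plist-get my-props 'size)
  (error "Malformed plist: %S" my-props))
```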


11.9.1 Working With Normal Plists

Function: plist-get plist property &optional default

This function extracts a value from a property list. The function returns the value corresponding to the given property, or default if property is not one of the properties on the list.

Function: plist-put plist property value

This function changes the value in plist of property to value. If property is already a property on the list, its value is set to value; otherwise the new property/value pair is added. The new plist is returned; use (setq x (plist-put x property value)) to be sure to use the new value. The plist is modified by side effects.

Function: plist-remprop plist property

This function removes from plist the property property and its value. The new plist is returned; use (setq x (plist-remprop x property)) to be sure to use the new value. The plist is modified by side effects.

Function: plist-member plist property

This function returns t if property has a value specified in plist.
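
The four functions above can be sketched together as follows. The variable my-props and its contents are hypothetical. The plist is built with list so that it can be modified safely:

```elisp
;; Hypothetical plist, freshly consed so it is safe to modify.
(setq my-props (list 'color 'red 'size 3))

(plist-get my-props 'size)              ; ⇒ 3
(plist-get my-props 'weight)            ; ⇒ nil
(plist-get my-props 'weight 'unknown)   ; ⇒ unknown (the default)

;; Always store the return value of the modifying functions.
(setq my-props (plist-put my-props 'weight 12))
(plist-get my-props 'weight)            ; ⇒ 12
(plist-member my-props 'weight)         ; ⇒ t

(setq my-props (plist-remprop my-props 'color))
(plist-member my-props 'color)          ; ⇒ nil
```

The printed form of my-props is deliberately not shown, since the position at which a new property is added is not specified above.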

In the following functions, if optional arg nil-means-not-present is non-nil, then a property with a nil value is ignored or removed. This feature is a virus that has infected old Lisp implementations (and thus E-Lisp, due to RMS's enamorment with old Lisps), but should not be used except for backward compatibility.

Function: plists-eq a b &optional nil-means-not-present

This function returns non-nil if property lists A and B are eq (i.e. their values are eq).

Function: plists-equal a b &optional nil-means-not-present

This function returns non-nil if property lists A and B are equal (i.e. their values are equal; their keys are still compared using eq).

Function: canonicalize-plist plist &optional nil-means-not-present

This function destructively removes any duplicate entries from a plist. In such cases, the first entry applies.

The new plist is returned. If nil-means-not-present is given, the return value may not be eq to the passed-in value, so make sure to setq the value back into where it came from.
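
As a small sketch of canonicalization, assuming a hypothetical plist my-dups in which the key a occurs twice:

```elisp
;; Hypothetical plist with a duplicate key, freshly consed because
;; canonicalize-plist is destructive.
(setq my-dups (list 'a 1 'b 2 'a 3))

;; Store the return value back, as recommended above.
(setq my-dups (canonicalize-plist my-dups))

(plist-get my-dups 'a)    ; ⇒ 1  (the first entry applies)
(plist-get my-dups 'b)    ; ⇒ 2
(length my-dups)          ; ⇒ 4  (the duplicate pair is gone)
```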


11.9.2 Working With Lax Plists

Recall that a lax plist is a property list whose keys are compared using equal instead of eq.

Function: lax-plist-get lax-plist property &optional default

This function extracts a value from a lax property list. The function returns the value corresponding to the given property, or default if property is not one of the properties on the list.

Function: lax-plist-put lax-plist property value

This function changes the value in lax-plist of property to value.

Function: lax-plist-remprop lax-plist property

This function removes from lax-plist the property property and its value. The new plist is returned; use (setq x (lax-plist-remprop x property)) to be sure to use the new value. The lax-plist is modified by side effects.

Function: lax-plist-member lax-plist property

This function returns t if property has a value specified in lax-plist.
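
A minimal sketch with string keys, which is where comparison by equal matters. The variable my-lax is hypothetical, and storing the return value of lax-plist-put is an assumption made by analogy with plist-put:

```elisp
;; Hypothetical lax plist with string keys, freshly consed.
(setq my-lax (list "width" 80 "height" 24))

;; Keys are compared with `equal', so a fresh string matches.
(lax-plist-get my-lax "height")          ; ⇒ 24
(lax-plist-get my-lax "depth" 'none)     ; ⇒ none (the default)

;; Assumption: like plist-put, lax-plist-put returns the new plist.
(setq my-lax (lax-plist-put my-lax "depth" 1))
(lax-plist-get my-lax "depth")           ; ⇒ 1
(lax-plist-member my-lax "width")        ; ⇒ t
```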

In the following functions, if optional arg nil-means-not-present is non-nil, then a property with a nil value is ignored or removed. This feature is a virus that has infected old Lisp implementations (and thus E-Lisp, due to RMS's enamorment with old Lisps), but should not be used except for backward compatibility.

Function: lax-plists-eq a b &optional nil-means-not-present

This function returns non-nil if lax property lists A and B are eq (i.e. their values are eq; their keys are still compared using equal).

Function: lax-plists-equal a b &optional nil-means-not-present

This function returns non-nil if lax property lists A and B are equal (i.e. their values are equal).

Function: canonicalize-lax-plist lax-plist &optional nil-means-not-present

This function destructively removes any duplicate entries from a lax plist. In such cases, the first entry applies.

The new plist is returned. If nil-means-not-present is given, the return value may not be eq to the passed-in value, so make sure to setq the value back into where it came from.


11.9.3 Converting Plists To/From Alists

Function: alist-to-plist alist

This function converts association list alist into the equivalent property-list form. The plist is returned. This converts from

 
((a . 1) (b . 2) (c . 3))

into

 
(a 1 b 2 c 3)

The original alist is not modified.

Function: plist-to-alist plist

This function converts property list plist into the equivalent association-list form. The alist is returned. This converts from

 
(a 1 b 2 c 3)

into

 
((a . 1) (b . 2) (c . 3))

The original plist is not modified.

The following two functions are equivalent to the preceding two except that they destructively modify their arguments, using cons cells from the original list to form the new list rather than allocating new cons cells.

Function: destructive-alist-to-plist alist

This function destructively converts association list alist into the equivalent property-list form. The plist is returned.

Function: destructive-plist-to-alist plist

This function destructively converts property list plist into the equivalent association-list form. The alist is returned.


11.10 Skip Lists

Like association lists or property lists, a skip list is another dictionary data type. That is, skip lists can be used to represent a finite key-value mapping. See section Association Lists, and section Property Lists.

However, alists and plists are very inefficient at almost any operation on large key spaces. For instance, looking up the value of a given key is done by linear search. Searching for a key that is not in the key space is therefore the worst case: the list has to be traversed in its entirety, because it contains no higher-level structure or indicators that would help to identify keys not belonging to the key space.

On the other hand, this lack of structure means that many different alists or plists represent the same mapping, differing only in the order of their entries. In principle, this makes it possible to add new elements in constant time by simply prepending them to the list.

In practice, however, SXEmacs' association and property list functions do not add elements in constant time unless you prepend by hand. They try to avoid duplicate keys and therefore traverse the entire list to make sure the key to be added is not already there.

Function: make-skiplist

Return a new empty skiplist object.

Function: skiplistp object

Return non-nil if object is a skiplist, nil otherwise.

Function: skiplist-empty-p skiplist

Return non-nil if skiplist is empty, nil otherwise.

Function: put-skiplist skiplist key value

Add key to skiplist and associate it with value. The skiplist object is modified by side effect.

Function: get-skiplist skiplist key &optional default

Return the value of key in skiplist. If key is not an element, return default if it is specified, and nil otherwise.

Function: remove-skiplist skiplist key

Remove the element specified by key from skiplist. If key is not an element, this is a no-op.

Function: skiplist-owns-p skiplist key

Return non-nil if key is associated with a value in skiplist, nil otherwise.

Function: skiplist-size skiplist

Return the size of skiplist, that is the number of elements.

Function: copy-skiplist skiplist

Return a copy of skiplist skiplist. The elements of skiplist are not copied; they are shared with the original.

Function: skiplist-to-alist skiplist

Return the ordinary association list induced by skiplist.

Function: skiplist-to-plist skiplist

Return the ordinary property list induced by skiplist.

Function: alist-to-skiplist alist

Return a skiplist from alist with equal key space and image.

Function: plist-to-skiplist plist

Return a skiplist from plist with equal key space and image.

Function: skiplist-union &rest skiplists

Return the union skiplist of skiplists, that is a skiplist containing all key-value-pairs which are in at least one skiplist of skiplists.

Note: Key-value-pairs with equal keys and distinct values are processed from left to right, that is the final union for such pairs contains the value of the rightmost skiplist in skiplists.

Function: skiplist-intersection &rest skiplists

Return the intersection skiplist of skiplists, that is a skiplist containing all key-value-pairs which are in all skiplists of skiplists.

Note: Key-value-pairs with equal keys and distinct values are processed from right to left, that is the final intersection for such pairs contains the value of the leftmost skiplist in skiplists.

Function: map-skiplist function skiplist

Map function over entries in skiplist, calling it with two args, each key and value in skiplist.

function may not modify skiplist, with the one exception that function may remove or reput the entry currently being processed by function.
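
The basic skiplist operations can be sketched as follows. The variables my-sl and my-keys are hypothetical, symbols are assumed to be acceptable keys, and the results in the comments follow from the function descriptions above:

```elisp
(setq my-sl (make-skiplist))
(skiplist-empty-p my-sl)              ; non-nil

;; put-skiplist modifies my-sl by side effect.
(put-skiplist my-sl 'oak 'acorns)
(put-skiplist my-sl 'pine 'cones)
(skiplist-size my-sl)                 ; ⇒ 2

(get-skiplist my-sl 'oak)             ; ⇒ acorns
(get-skiplist my-sl 'birch)           ; ⇒ nil
(get-skiplist my-sl 'birch 'unknown)  ; ⇒ unknown (the default)
(skiplist-owns-p my-sl 'pine)         ; non-nil

;; Collect all keys; the mapped function receives key and value.
(setq my-keys nil)
(map-skiplist (lambda (key value)
                (setq my-keys (cons key my-keys)))
              my-sl)
(length my-keys)                      ; ⇒ 2

(remove-skiplist my-sl 'oak)
(skiplist-size my-sl)                 ; ⇒ 1
```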


11.11 Weak Lists

A weak list is a special sort of list whose members are not counted as references for the purpose of garbage collection. This means that, for any object in the list, if there are no references to the object anywhere outside of the list (or other weak list or weak hash table), that object will disappear the next time a garbage collection happens. Weak lists can be useful for keeping track of things such as unobtrusive lists of another function's buffers or markers. When that function is done with the elements, they will automatically disappear from the list.

Weak lists are used internally, for example, to manage the list holding the children of an extent. An extent that is unused but has a parent will still be reclaimed, and will automatically be removed from its parent's list of children.

Weak lists are similar to weak hash tables (see section Weak Hash Tables).

Function: weak-list-p object

This function returns non-nil if object is a weak list.

Weak lists come in one of four types:

simple

Objects in the list disappear if not referenced outside of the list.

assoc

Objects in the list disappear if they are conses and either the car or the cdr of the cons is not referenced outside of the list.

key-assoc

Objects in the list disappear if they are conses and the car is not referenced outside of the list.

value-assoc

Objects in the list disappear if they are conses and the cdr is not referenced outside of the list.

Function: make-weak-list &optional type

This function creates a new weak list of type type. type is a symbol (one of simple, assoc, key-assoc, or value-assoc, as described above) and defaults to simple.

Function: weak-list-type weak

This function returns the type of the given weak-list object.

Function: weak-list-list weak

This function returns the list contained in a weak-list object.

Function: set-weak-list-list weak new-list

This function changes the list contained in a weak-list object.
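
A short sketch of how these functions fit together. The variables my-weak and my-keep are hypothetical, and the remarks about garbage collection restate the behavior described above rather than a guaranteed outcome at any particular moment:

```elisp
(setq my-weak (make-weak-list 'simple))
(weak-list-p my-weak)        ; non-nil
(weak-list-type my-weak)     ; ⇒ simple

;; my-keep is referenced from a variable, so it stays alive.
(setq my-keep (list 'kept))

;; The second element is referenced only from the weak list.
(set-weak-list-list my-weak (list my-keep (list 'transient)))

;; Read the current contents.  After a later garbage collection the
;; element (transient) may have disappeared, while my-keep remains.
(weak-list-list my-weak)
```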


11.12 Doubly-Linked Lists

A doubly-linked list is an extension of the ordinary linked list. The ordinary list has an entry pointer to its first element (obtainable via car) and, for each element, a pointer to the next element (obtainable via cdr). The doubly-linked list (dl-list for short) has two entry pointers, one to the first element and one to the last element. These are obtainable via dllist-car and dllist-rac, respectively. Moreover, every element of the dl-list points to both the next and the previous element.

This well-known structure supports both appending and prepending elements in the same time complexity class.

Function: dllist &rest initial-elements

Return a doubly-linked list.

Optionally passed arguments are filled into the resulting dllist.

Function: dllistp object

Return non-nil if object is a dllist, nil otherwise.

Function: dllist-empty-p dllist

Return non-nil if dllist is empty, nil otherwise.

Function: dllist-size dllist

Return the size of dllist, that is the number of elements.

Function: dllist-car dllist

Return the front element of dllist.

Function: dllist-rac dllist

Return the back element of dllist.

Note: All of the following modifier functions work by side-effect.

Function: dllist-prepend dllist element

Add element to the front of dllist.

Function: dllist-append dllist element

Add element to the back of dllist.

Function: dllist-pop-car dllist

Remove the front element of dllist and return it.

Function: dllist-pop-rac dllist

Remove the back element of dllist and return it.

In box notation a dllist looks like

 
car         _______     _______     
|          |       |   |       |    --> nil
|     _____|_     _v___|_     _v___|_
 --> |_______|   |_______|   |_______| <--
       | | ^       | | ^       | |        |
nil <--  | |_______| | |_______| |        |
         v           v           v        rac
       data1       data2       data3

Let us look at some examples.

 
(setq d (dllist))
  ⇒ (dllist)
(dllistp d)
  ⇒ t

(dllist-append d 2)
  ⇒ (dllist 2)
(dllist-append d 4)
  ⇒ (dllist 2 4)
(dllist-append d 6)
  ⇒ (dllist 2 4 6)

(dllist-car d)
  ⇒ 2
(dllist-rac d)
  ⇒ 6

(dllist-prepend d (dllist-pop-rac d))
  ⇒ (dllist 6 2 4)
(dllist-append d (dllist-pop-car d))
  ⇒ (dllist 2 4 6)
(dllist-size d)
  ⇒ 3

Of course, dl-lists can be converted to ordinary Lisp lists.

Function: dllist-to-list dllist

Return the ordinary list induced by dllist, that is, start with the first element in dllist and traverse towards the back.

Function: dllist-to-list-reversed dllist

Return the ordinary list induced by dllist in reverse order, that is, start with the last element in dllist and traverse towards the front.

 
(dllist-to-list d)
  ⇒ (2 4 6)
(dllist-to-list-reversed d)
  ⇒ (6 4 2)

11.13 Bloom Filters

The concept of a Bloom filter was introduced by Burton H. Bloom in 1970. It is a probabilistic data structure of constant space complexity that is used to make so-called membership decisions on anonymous sets. That is, you can easily decide whether a given element is a member of a certain set, but you cannot select elements from it (or even traverse all of its elements). Moreover, as with hash-tables, the time complexity of adding and removing elements, as well as of the membership decision, is in O(1); more precisely, it is in O(k), where k is the degree of the Bloom filter.

Probabilistic, however, means that false positives are possible, but false negatives are not. The probability of false positives grows subexponentially with the number of added elements.

Another interesting property of Bloom filters of equal order and degree is the ability to perform set union and intersection operations within O(1) space and time complexity!

SXEmacs' Bloom filters have their own lisp-type, but they do not have a special input syntax.

Function: make-bloom &optional order degree

Return an empty bloom-filter.

Optional argument order (a positive integer) specifies the "length" of the internal filter vector, defaults to 256.

Optional argument degree (a positive integer) specifies the number of hash slices, defaults to 8.

For convenience, there is also a constructor for a complete Bloom filter, that is, a Bloom filter which owns every possible element.

Function: make-bloom-universe &optional order degree

Return a complete bloom-filter.

Optional argument order (a positive integer) specifies the "length" of the internal filter vector, defaults to 256.

Optional argument degree (a positive integer) specifies the number of hash slices, defaults to 8.

Function: bloomp object

Return non-nil if object is a bloom filter, nil otherwise.

There are two further auxiliary functions to determine the bloom filter parameters:

Function: bloom-order bloom

Return the order of the bloom-filter bloom.

Function: bloom-degree bloom

Return the degree of the bloom-filter bloom.
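
The constructors and the two accessors can be sketched together like this. The variables my-bloom and my-all are hypothetical, and the results in the comments follow from the descriptions above:

```elisp
;; A filter with explicit order and degree.
(setq my-bloom (make-bloom 512 4))
(bloom-order my-bloom)                ; ⇒ 512
(bloom-degree my-bloom)               ; ⇒ 4

;; A complete filter with the default parameters.
(setq my-all (make-bloom-universe))
(bloom-order my-all)                  ; ⇒ 256
(bloom-degree my-all)                 ; ⇒ 8

;; A universe owns every element, including ones never added.
(bloom-owns-p my-all 'never-added)    ; non-nil
```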

Adding/removing elements to/from a bloom filter always works by side-effect.

Function: bloom-add bloom element

Add element to the bloom-filter bloom.

Function: bloom-remove bloom element

Remove element from the bloom-filter bloom.

The membership decision is done with bloom-owns-p.

Function: bloom-owns-p bloom element

Return non-nil if element is in the bloom-filter bloom.

 
(setq bl (make-bloom))
  ⇒ #<bloom-filter :order 256 :degree 8 :size 0>
(bloomp bl)
  ⇒ t

Now we want to add three integers and test some memberships.

 
(bloom-add bl 12)
  ⇒ 12
(bloom-add bl 21)
  ⇒ 21
(bloom-add bl 0)
  ⇒ 0

(bloom-owns-p bl 12)
  ⇒ t
(bloom-owns-p bl 21)
  ⇒ t
(bloom-owns-p bl 0)
  ⇒ t
(bloom-owns-p bl 13)
  ⇒ nil
(bloom-owns-p bl 17)
  ⇒ nil

Now let us remove some elements

 
(bloom-remove bl 12)
  ⇒ 12
(bloom-remove bl 0)
  ⇒ 0

(bloom-owns-p bl 12)
  ⇒ nil
(bloom-owns-p bl 21)
  ⇒ t
(bloom-owns-p bl 0)
  ⇒ nil

Of course, Bloom filters work with any lisp type (not just integers). Internally, they use an object's hash value and scatter it over a fixed-length array.

 
(bloom-owns-p bl "horse")
  ⇒ nil
(bloom-add bl "horse")
  ⇒ "horse"
(bloom-owns-p bl "horse")
  ⇒ t
(bloom-owns-p bl "snake")
  ⇒ nil

As remarked above, union and intersection of Bloom-filters of equal order and degree is possible in constant time regardless of the size.

Function: bloom-union &rest blooms

Return the union Bloom filter of all arguments.

Function: bloom-intersection &rest blooms

Return the intersection Bloom filter of all arguments.

 
(setq bl1 (make-bloom))
  ⇒ #<bloom-filter :order 256 :degree 8 :size 0>
(bloom-add bl1 2)
  ⇒ 2
(bloom-add bl1 4)
  ⇒ 4
(bloom-add bl1 'foobar)
  ⇒ foobar

(setq bl2 (make-bloom))
  ⇒ #<bloom-filter :order 256 :degree 8 :size 0>
(bloom-add bl2 "horse")
  ⇒ "horse"
(bloom-add bl2 "snail")
  ⇒ "snail"
(bloom-add bl2 'foobar)
  ⇒ foobar

(setq blu (bloom-union bl1 bl2))
  ⇒ #<bloom-filter :order 256 :degree 8 :size 6>
(bloom-owns-p blu 2)
  ⇒ t
(bloom-owns-p blu 4)
  ⇒ t
(bloom-owns-p blu "horse")
  ⇒ t
(bloom-owns-p blu "snail")
  ⇒ t
(bloom-owns-p blu 'foobar)
  ⇒ t

(setq bli (bloom-intersection bl1 bl2))
  ⇒ #<bloom-filter :order 256 :degree 8 :size 0>
(bloom-owns-p bli 2)
  ⇒ nil
(bloom-owns-p bli 4)
  ⇒ nil
(bloom-owns-p bli "horse")
  ⇒ nil
(bloom-owns-p bli "snail")
  ⇒ nil
(bloom-owns-p bli 'foobar)
  ⇒ t

Now we want to illustrate the extreme performance gain over lists, and compare it to the performance of hash-tables. To this end, we use a setup routine which fills a data container with 10000 distinct elements. Then we do 50000 membership tests. For each of these two steps we measure the time. Let's start with ordinary Lisp lists.

 
(defun test-element ()
  "Return the next test element by incrementing the counter `cnt'."
  (incf cnt))
  ⇒ test-element
 
(setq tlist (list))
  ⇒ nil
(setq cnt 0)
  ⇒ 0
(let ((start (current-btime)))
  (dotimes (i 10000)
    (setq tlist (cons (test-element) tlist)))
  (- (current-btime) start))
  ⇒ 140733

;; 50000 membership tests on tlist … take a coffee break
(setq cnt 0)
  ⇒ 0
(let ((start (current-btime)))
  (dotimes (i 50000)
    (member (test-element) tlist))
  (- (current-btime) start))
  ⇒ 154817323

Let's consider hash-tables:

 
(setq thash (make-hash-table))
  ⇒ #<hash-table size 0/29 0x1619>
(setq cnt 0)
  ⇒ 0
(let ((start (current-btime)))
  (dotimes (i 10000)
    (puthash (test-element) 'some-value thash))
  (- (current-btime) start))
  ⇒ 173210

;; 50000 membership tests on thash
(setq cnt 0)
  ⇒ 0
(let ((start (current-btime)))
  (dotimes (i 50000)
    (gethash (test-element) thash))
  (- (current-btime) start))
  ⇒ 723030

Let's finish our considerations with Bloom filters:

 
(setq tbfil (make-bloom))
  ⇒ #<bloom-filter :order 256 :degree 8 :size 0>
(setq cnt 0)
  ⇒ 0
(let ((start (current-btime)))
  (dotimes (i 10000)
    (bloom-add tbfil (test-element)))
  (- (current-btime) start))
  ⇒ 128565

;; 50000 membership tests on tbfil
(setq cnt 0)
  ⇒ 0
(let ((start (current-btime)))
  (dotimes (i 50000)
    (bloom-owns-p tbfil (test-element)))
  (- (current-btime) start))
  ⇒ 757945

Evaluating all these results, we can state that hash-tables and Bloom-filters provide comparable performance. For extremely large sets, Bloom-filters may profit from never having to resize the underlying array.

 
;; 400000 insertions into a bloom-filter
(setq tbfil (make-bloom))
  ⇒ #<bloom-filter :order 256 :degree 8 :size 0>
(setq cnt 0)
  ⇒ 0
(let ((start (current-btime)))
  (dotimes (i 400000)
    (bloom-add tbfil (test-element)))
  (- (current-btime) start))
  ⇒ 5464420

;; 400000 insertions into a hash-table
(setq thash (make-hash-table))
  ⇒ #<hash-table size 0/29 0x1066>
(setq cnt 0)
  ⇒ 0
(let ((start (current-btime)))
  (dotimes (i 400000)
    (puthash (test-element) 'some-value thash))
  (- (current-btime) start))
  ⇒ 6392474

Furthermore, Bloom filters are able to track how often equal elements have been inserted. For hash-tables, the value field may be used to implement something similar.

However, bloom filters tend to report more and more false positives the more elements you add. False negatives are never possible. That means with a sufficiently small bloom filter and a sufficiently large set of elements it is just a matter of probability when this filter will turn into a universe.

 
(setq b4 (make-bloom 256 64))
  ⇒ #<bloom-filter :order 256 :degree 64 :size 0>

;; now add 800000 numbers to it
(dotimes (i 800000)
  (bloom-add b4 i))
  ⇒ nil

;; now check this
(bloom-owns-p b4 'a-symbol-I-never-added)
  ⇒ t
(bloom-owns-p b4 "a string I never added")
  ⇒ t
(bloom-owns-p b4 [a vector I never added])
  ⇒ t

We see that b4 has turned into a quasi-universe. There might be objects which are not yet in b4, but the probability of finding one is extremely small.

Given a Bloom filter of degree d and size s (the length of the internal filter vector, that is, its order), which already contains n elements, the probability of getting a false positive (an element which has not been added, but for which bloom-owns-p returns non-nil) is:

 
p = \left(1 - \left(1 - \frac{1}{s}\right)^{d n}\right)^{d}

So for our example (s = 256, d = 64, n = 800000) the probability is very close to one; to be precise, it is:

 
0.9999...99993386493554255105254275560...
  ^^^^^^^^^^^
  87027 times
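
The number of leading nines can be checked with a rough calculation. This is a sketch under an assumption: it uses the usual independent-hash estimate p = (1 - (1 - 1/s)^(d n))^d, with s taken to be the order of the filter. Since p itself is far too close to one for floating point, the hypothetical helpers my-log10 and my-bloom-log10-complement compute the base-10 logarithm of 1 - p instead:

```elisp
;; Hypothetical helper: base-10 logarithm via the natural logarithm.
(defun my-log10 (x)
  (/ (log x) (log 10.0)))

;; Hypothetical helper: approximate log10 of (1 - p), where
;;   p = (1 - q)^d  and  q = (1 - 1/s)^(d*n).
;; For tiny q, 1 - (1 - q)^d is approximately d*q, hence
;;   log10 (1 - p) ~ log10 (d) + d * n * log10 (1 - 1/s).
(defun my-bloom-log10-complement (s d n)
  (+ (my-log10 (float d))
     (* (float d) n (my-log10 (- 1.0 (/ 1.0 s))))))

(my-bloom-log10-complement 256 64 800000)
     ;; approximately -87027.18
```

A value of about -87027.18 means that 1 - p is roughly 6.6 times 10 to the power -87028, which corresponds to 87027 nines after the decimal point followed by the digits 3386 and agrees with the figure shown above.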

This document was generated by Steve Youngs on September, 23 2008 using texi2html 1.76.