CS 3723
 Programming Languages 
   Lisp Internal Representation  
  and Dotted Pairs  


Lisp S-expressions:
  • A lisp S-expression S is defined recursively:
      An S-expression S is either
      • an atom, or
      • a parenthesized list of zero of more S-expressions.


Internal Representation of S-expressions:
  • The internal representation of a Lisp S-expression is in linked list form, and uses a data structure called a lisp cell, which is either an atom (actually a pointer to an atom), or a pair of pointers to lisp cells. These pointers are called the car and cdr pointers. The internal representation is like a binary tree with no data at the internal nodes, but only data at the leaves.

  • Here are some diagrams of internal representations, using lisp cells:

    Lisp Cells: Internal Representation of S-expressions
    S-expression    Internal Representation
    
    
    a:         ---> a
    
    
    nil or (): ---> nil  (actually just a null pointer)
    
    
                  +---+---+
    (a):      --->| o | o-|---> nil
                  +---+---+
                    |
                    v
                    a
    
    
                  +---+---+    +---+---+
    (a b):    --->| o | o-|--->| o | o-|---> nil
                  +---+---+    +---+---+
                    |            |
                    v            v
                    a            b
    
    
                  +---+---+    +---+---+    +---+---+
    (a b c):  --->| o | o-|--->| o | o-|--->| o | o-|---> nil
                  +---+---+    +---+---+    +---+---+
                    |            |            |
                    v            v            v
                    a            b            c
    
    
                      +---+---+    +---+---+    +---+---+
    (a (b c ) d): --->| o | o-|--->| o | o-|--->| o | o-|---> nil
                      +---+---+    +---+---+    +---+---+
                        |            |            |
                        v            |            v
                        a            |            d
                                     |
                                     v
                                   +---+---+    +---+---+
                                   | o | o-|--->| o | o-|---> nil
                                   +---+---+    +---+---+
                                     |            |
                                     v            v
                                     b           c
    

  • Now look at the representation for (a b c) above.
    The main pointer points to the first (leftmost) lisp cell.
    The car pointer of this lisp cell points to a, while the cdr pointer points to (b c).
    Thus (car '(a b c)) is a, and (cdr '(a b c)) is (b c).

  • Now suppose we want to create the result of the expression: (cons 'a '(b c)).
    To do this, take the representations for a and for (b c).
    Create a new lisp cell, stick a pointer to the first representation above in as the car pointer,
    and stick a pointer to the second representation above in as the cdr pointer.
    The result is the representation for (a b c).

  • Above, where ---> nil is written, this really means that the cdr pointer is just an actual null pointer.


Dotted Pair Notation in Lisp:
  • Consider the Lisp expression (cons 'a 'b).
    This looks like it should be an error, but in fact, Lisp gives:

    > (cons 'a 'b)
    (A . B)
    > (car '(a . b))
    A
    > (cdr '(a . b))
    B
    

    This is the dotted pair notation in Lisp, something you haven't seen before.

  • Internally, this is represented as follows:

                  +---+---+
    (a . b):  --->| o | o-|---> b
                  +---+---+
                    |
                    v
                    a
    

  • The dotted pair notation shows the lisp cells more explicitly. Consider the following interactive session in Lisp:

    > '(a . b)
    (A . B)
    > '(a . nil)
    (A)
    > '(a . (b . nil))
    (A B)
    > '(a . (b . (c . nil)))
    (A B C)
    > '(a . ((b . (c . nil)) . (d . nil)))
    (A (B C) D)
    > (cons 'a 'nil)
    (A)
    > (cons 'a (cons 'b 'nil))
    (A B)
    > '(a . ((b . nil) . c))
    (A (B) . C)
    

  • Two points about the dotted notation:
    • There must always be a space between a dot and an atom.
      (Just as there must be a space between two atoms.)
    • In displaying S-expressions, Lisp uses as few dots as possible. (Illustrated especially with the last item above.)

Revision date: 2013-03-05. (Please use ISO 8601, the International Standard Date and Time Notation.)