Breadth-first numbering: Lazy lists

In my last post, I wrote a simple functional queue as a stepping stone towards a breadth-first-numbering solution. By using an ordered and reversed list to represent the front and back of the queue, it’s cheap and easy to enqueue and dequeue items. A simple queue is fine for this toy example, but it’s inefficient, since it requires reversing the tail list whenever the head list is empty. We can do better with two improvements: lazy lists and incremental reversal.

Clojure’s lazy seqs seemed powerful and mysterious until I read through chapter 3 of SICP. Building a lazy list is based on two simple operations, delay and force:

(define-syntax delay
  (syntax-rules ()
    ((delay form) (lambda () form))))

(define force
  (lambda (delayed)
    (delayed)))

(let ((add-ones (delay (+ 1 1))))
  (check-pred procedure? add-ones)
  (check-equal? (force add-ones) 2))

Delay wraps a form in an anonymous function of no arguments. It can be stored and passed around like any other list, but won’t perform the computation “stored” inside until it’s evaluated. Force is the opposite of delay, forcing a delayed form to evaluate. From these two basics, we can build a lazy versions of cons:

(define-syntax lcons
  (syntax-rules ()
    ((lcons item items) (cons item (delay items)))))

Racket’s macro system uses syntax-case macros, which are a little different from the comma-spliced defmacro beasts you know and love from Common Lisp and Clojure. In addition to enforcing good hygiene, the syntax-case macro system works by pattern matching against syntax objects. In lcons, any form that matches the pattern (lcons item items) is mapped to (cons item (delay items)). In the delay macro above, anything matching (delay form) maps to (lambda () form). We’re still defining the ways we want to change the syntax of our program, but the transformation is applied at a different level: to syntax objects instead of raw s-expressions.

With lcons finished, it’s easy to create lazy lists:

(define llist
  (lambda (items)
    (if (null? items)
        '()
        (lcons (car items) (llist (cdr items))))))

(define lazy (llist '(1 2 3 4 5)))

(check-pred pair? lazy
            "A lazy list is a pair: the head of the list and a delayed
            function.")

(check-equal? (car lazy) 1
               "A lazy list stores its head as the first item of a
               pair.")

(check-pred procedure? (cdr lazy)
            "A lazy list stores a delayed function as the second item
            of a pair.")

(check-equal? (car ((cdr lazy))) 2
               "Evaluating the delayed function returns the next item
               in the list.")

Just as eager lists are composed of nested pairs, lazy lists are composed of nested, delayed pairs:

(define eager-list (cons 1 (cons 2 (cons 3 (cons 4 '())))))
> eager-list
'(1 2 3 4)

(define lazy-list (lcons 1 (lcons 2 (lcons 3 (lcons 4 '())))))
> '(1 . #<procedure:...me/bfs/queue.scm:106:6>)

Like a normal list, the car of each pair is the list item, and the cdr represents the rest of the list. But it doesn’t return the next pair until it’s evaluated:

> (car lazy-list)
1
> (cdr lazy-list)
#<procedure:...queue.scm:106:6>
> ((cdr lazy-list))
'(2 . #<procedure:...queue.scm:106:6>)

With this behavior in mind, we can write lazy car and lazy cdr:

(define lcar
  (lambda (llist)
    (car llist)))

(check-equal? (lcar lazy) 1
              "Lazy-car is just like regular car!")

(define lcdr
  (lambda (llist)
    (force (cdr llist))))

(check-equal? (car (lcdr lazy)) 2
              "Lazy-cdr forces evaluation of the next list element.")

A take-n function is also handy for converting lazy lists back to eager ones:

(define take-n
  (lambda (n lazy-list)
    (if (= 0 n) '()
      (cons (lcar lazy-list)
            (take-n (- n 1) (lcdr lazy-list))))))

(check-equal? (take-n 4 (llist '(1 2 3 4 5 6 7))) '(1 2 3 4)
              "Take-n returns the first n elements of a lazy list.")

And that’s it! We’ve written all the basics necessary for lazy lists. (For a few more lazy-fied list operations, see section 3.5.1 of SICP).

Finally, we should make one important optimization. Over the course of list operations like lcdr, the same delayed form can be called many times. If the delayed computation is simple, this won’t be noticeably inefficient. In our case, we’re just storing values that are immediately returned (integers in these examples, and eventually some node representation in our numbering solution). But there’s no guarantee that delayed computations will be cheap! We could put functions in a lazy list just as easily:

> (lcons (+ 1 1) (lcons (+ 1 2) (lcons (+ 1 3) (lcons (+ 1 4) '()))))
'(2 . #<procedure:...queue.scm:106:6>)

And those functions could require a lot of work:

> (lcons (read-book "Finnegans Wake")
    (lcons (read-book "In Search of Lost Time")
      (lcons (read-book "Gravity's Rainbow"))))
'("riverrun, past Eve and Adam's..." . #<procedure:...read.scm:10:5>)

In practice, we should memoize lazy computations, so subsequent calls look up their previously computed values. It’s an easy fix:

(define memoize
  (lambda (func)
    (let ((already-run? #f) (result #f))
      (lambda ()
        (if (not already-run?)
            (begin (set! result (func))
                   (set! already-run? #t)
                   result)
             result)))))

(define-syntax delay
  (syntax-rules ()
    ((delay form) (memoize (lambda () form)))))

Now that we’ve written lazy lists, we can use them to build an efficient functional queue for the breadth-first numbering problem.

Connor Mendenhall

Breadth-first Numbering: Lazy Lists

Comments