My mentor Colin wrote one of the most popular posts on the 8th Light blog, on the somewhat complex process of requiring Clojure libs. It’s not a stretch to give it some of the credit for my interest in Clojure: I came close to giving up in frustration the first time I tried to import a library in a Clojure REPL. It was so easy in Python, and there I was, suddenly dealing with macros, keywords, quotes, and–heaven forfend!–all those parentheses.

For all the details on namespaces, require, and the ns macro, see Colin’s post, which is still the authoritative source. But if you’re a Python programmer looking for a quick reference, here’s the Python-Clojure Rosetta stone I went looking for the first time I stared blankly into my Clojure REPL.

Import a library

File

Python:

import library

library.dostuff()

Clojure:

(ns yourproject.core
  (:require library))

(library/dostuff)
REPL

Python:

In  [1]: import library

In  [2]: library.dostuff()
"did stuff!"

Clojure:

user=> (require 'library)
nil

user=> (library/dostuff)
"did stuff!"
nil

Import everything without qualification

(Note: This practice is usually as ill-advised in Clojure as it is in Python!)

File

Python:

from library import *

dostuff()

Clojure:

(ns yourproject.core
  (:require [library :refer :all]))

(dostuff)
REPL

Python:

In  [1]: from library import *

In  [2]: dostuff()
"did stuff!"

Clojure:

user=> (require '[library :refer :all])
nil

user=> (dostuff)
"did stuff!"
nil

Import specific functions

File

Python:

from library import dostuff, returnthings

dostuff()
returnthings()

Clojure:

(ns yourproject.core
  (:require [library :refer [dostuff returnthings]]))

(dostuff)
(returnthings)
REPL

Python:

In  [1]: from library import dostuff, returnthings

In  [2]: dostuff()
"did stuff"

In  [3]: returnthings()
Out [3]: { "foo": 1, "bar": 2 }

Clojure:

user=> (require '[library :refer [dostuff returnthings]])
nil

user=> (dostuff)
"did stuff"
nil

user=> (returnthings)
{ :foo 1 :bar 2 }

Import and rename a library

File

Python:

import library as lib

lib.dostuff()

Clojure:

(ns yourproject.core
  (:require [library :as lib]))

(lib/dostuff)
REPL

Python:

In  [1]: import library as lib

In  [2]: lib.dostuff()
"did stuff"

Clojure:

user=> (require '[library :as lib])
nil

user=> (lib/dostuff)
"did stuff"
nil

Import and rename a function

File

Python:

from library import dostuff as ds

ds()

Clojure:

(ns yourproject.core
  (:require [library :refer [dostuff] :rename {dostuff ds}]))

(ds)
REPL

Python:

In [1]: from library import dostuff as ds

In [2]: ds()
"did stuff"

Clojure:

user=> (require '[library :refer [dostuff] :rename {dostuff ds}])
nil

user=> (ds)
"did stuff"
nil

All together, now!

File

Python:

import onelibrary, anotherlibrary
import onemorelibrary as oml

from library import returnthings
from library import dostuff as ds

Clojure:

(ns yourproject.core
  (:require onelibrary
            anotherlibrary
            [onemorelibrary :as oml]
            [library :refer [dostuff returnthings]
                     :rename {dostuff ds}]))

Now that we’ve built an efficient functional queue, we can finally put it to work in a breadth-first numbering solution.

First, let’s define a few trees up front for testing. The first is the example given in the introduction of the paper. The other two are a little trickier, and the twelve-node tree is not binary.

(define example-tree '("A" ("B" "leaf"
                                ("C" "leaf"
                                     "leaf"))
                           ("D" "leaf"
                            "leaf")))

(define five-nodes '("A" ("B" ("D" "leaf"
                                   "leaf")
                              ("E" "leaf"
                                   "leaf"))
                         ("C" "leaf"
                              "leaf")))

(define twelve-nodes '("A" ("B" "leaf")
                           ("C" "leaf"
                                ("D" ("F" "leaf"
                                          "leaf")
                                     ("G" ("I" "leaf"
                                               "leaf")
                                          "leaf")
                                     ("H" ("J" "leaf"
                                               "leaf"
                                               "leaf")
                                          ("K" "leaf")
                                          ("L" "leaf")))
                                ("E" "leaf"))))

To calculate the visit order, we can perform a simple breadth-first traversal of the tree. Now that we have a queue, the solution is pretty close to the pseudocode: Start with the root node in the queue, and assign it a number. Then recur with three new parameters: A queue with this node’s children inserted, a list with this node’s number consed on, and an incremented node number. When the queue is empty, we’re done:

(define visit-order
  (lambda (tree)
    (define bfs-iter
      (lambda (queue visited n)
        (if (null? (rem queue)) visited
            (let ((node  (car (rem queue)))
                  (new-q (car (cdr (rem queue)))))
              (if (equal? "leaf" node)
                  (bfs-iter new-q visited n)
                  (bfs-iter (ins-items (cdr node) new-q)
                            (cons (cons (car node) n) visited)
                            (+ 1 n)))))))
    (bfs-iter (ins tree empty-q) '() 1)))

(check-equal?  (visit-order example-tree)
              '(("C" . 4) ("D" . 3) ("B". 2) ("A" . 1))
              "Visit-order searches a tree breadth-first and returns a
              list of nodes and their numbers.")

(check-equal?  (visit-order five-nodes)
              '(("E" . 5) ("D" . 4) ("C" . 3) ("B" . 2) ("A" . 1)))

(check-equal?   (visit-order twelve-nodes)
               '(("L" . 12) ("K" . 11) ("J" . 10) ("I" . 9) ("H" . 8) ("G" . 7) ("F" . 6)
                 ("E" . 5)  ("D" . 4)  ("C" . 3)  ("B" . 2) ("A" . 1))
               "Visit-order works on non-binary trees.")

It’s possible to map a visit queue back to a binary tree (check out Michael’s concise Haskell solution), but I wanted a solution that would work for all trees. In the end, I settled for performing a second depth-first traversal to label nodes. This walk-map function works like a recursive map, traversing a nested structure and applying a function to every element that’s not a list:

(define walk-map
  (lambda (func items)
    (define apply-or-map
      (lambda (item)
        (cond ((null? item) '())
              ((pair? item) (map apply-or-map item))
              (else (func item)))))
    (map apply-or-map items)))

(check-equal? (walk-map (lambda (i) (cond ((= i 1) "one")
                                          ((= i 2) "two")
                                          ((= i 3) "three")))
                        '(1 2 (1 1 2 (3 (1 (2) 2) 1 3))))
              '("one" "two" ("one" "one" "two" ("three" ("one" ("two") "two") "one" "three"))))

Here’s a convenience function to store node order in a hash table:

(define make-label-map
  (lambda (labels)
    (let ((label-map (make-hash)))
      (define add-labels
        (lambda (labels)
          (if (null? labels)
              label-map
              (let ((node   (car  (car labels)))
                    (number (cdr (car labels))))
                (hash-set! label-map node number)
                (add-labels (cdr labels))))))
      (add-labels labels))))

(let ((label-map (make-label-map (visit-order example-tree))))
  (check-equal? (hash-ref label-map "A") 1)
  (check-equal? (hash-ref label-map "B") 2)
  (check-equal? (hash-ref label-map "C") 4)
  (check-equal? (hash-ref label-map "D") 3))

And at long last, a solution for breadth-first numbering: calculate node order, then map over the tree to apply the labels:

(define number-tree
  (lambda (tree)
    (let ((label-map (make-label-map (visit-order tree))))
      (walk-map (lambda (node) (if (equal? "leaf" node)
                                   "leaf"
                               (hash-ref label-map node)))
                tree))))

(check-equal? (number-tree example-tree) '(1 (2 "leaf" (4 "leaf" "leaf")) (3 "leaf" "leaf")))
(check-equal? (number-tree five-nodes) '(1 (2 (4 "leaf" "leaf") (5 "leaf" "leaf")) (3 "leaf" "leaf")))
(check-equal? (number-tree twelve-nodes) '(1
                                           (2 "leaf")
                                           (3
                                              "leaf"
                                              (4
                                                (6 "leaf"
                                                   "leaf")
                                                (7 (9     "leaf"
                                                          "leaf")
                                                   "leaf")
                                                (8 (10    "leaf"
                                                          "leaf"
                                                          "leaf")
                                                   (11    "leaf")
                                                   (12    "leaf")))
                                              (5   "leaf"))))

On the plus side, this solution generalizes to non-binary trees, and is built almost entirely out of Scheme primitives. It’s not as concise or efficient as I’d like, but I’m happy with my lazy lists and functional queue, even if the implementation is a little long. You can find an edited version of my solution here, and all the code from these posts as a Gist here.

On the way to a breadth-first numbering solution, I’ve built a simple functional queue and created lazy lists from Scheme primitives (and a couple handy macros). In this post, I’ll bring the pieces together to create an improved queue. (And yes, I promise I’ll actually start numbering trees sometime soon).

Our simple queue consisted of two lists: one for the head of the queue, and a reversed one for the tail:

'((1 2 3) (5 4))

Our improved queue makes two changes: lazy lists and incremental reversal. It will look like this:

'(((1 . #<procedure:...me/bfs/queue.scm:106:6>) . 3)
  ((5 . #<procedure:...me/bfs/queue.scm:106:6>) . 2))

This looks a little complicated, but just like the simple queue, it’s also a list of a head and reversed tail, with each side storing the length of the associated list. This equivalent is a little simpler:

(list (cons (llist '(1 2 3)) 3) (cons (llist '(5 4)) 2))

To start implementing the improvements, we need to update the selectors to get the lists and lengths from both sides:

(define empty-q (list (cons '() 0) (cons '() 0)))

(define five-items (list (cons (llist '(1 2)) 2) (cons (llist '(3 4 5)) 3)))

(define lhs-len
  (lambda (queue)
    (cdr (left-side queue))))

(check-equal? (lhs-len five-items) 2
              "Our lazy queue stores the length of the lists on each side.")

(define rhs-len
  (lambda (queue)
    (cdr (right-side queue))))

(check-equal? (rhs-len five-items) 3
              "Our lazy queue stores the length of the lists on each side.")

(define lhs-list
  (lambda (queue)
    (car (left-side queue))))

(check-equal? (lcar (lhs-list five-items)) (lcar (llist '(1 2))))

(define rhs-list
  (lambda (queue)
    (car (right-side queue))))

(check-equal? (lcar (rhs-list five-items)) (lcar (llist '(3 4 5))))

Now we can write an updated insert function:

(define ins
  (lambda (item queue)
    (list (left-side queue)
          (cons (lcons item (rhs-list queue))
                (+ 1 (rhs-len queue))))))

(let ((three-items (ins 3 (ins 2 (ins 1 empty-q))))
      (six-items (ins 6 (ins 5 (ins 4 (ins 3 (ins 2 (ins 1 empty-q))))))))
      (check-equal? (take-n 3 (rhs-list three-items)) '(3 2 1))
      (check-equal? (take-n 3 (rhs-list six-items)) '(6 5 4)
                    "Ins adds elements to the right side."))

Remove is a little more complicated. In the simple queue, we swapped and reversed the whole right side as soon as the left ran out. We want our improved queue to avoid reversing a long right-side list all at once. The solution is incremental reversal: rebalance the queue a little at a time as elements are inserted and removed.

In Okasaki’s implementation, this is done with functions called make-queue and rotate. Below are my Scheme translations.

Rotate reverses the right side list and concatenates it to the left. It’s similar to the simple queue implementation, but it uses lazy list operators and it’s designed to work incrementally:

(define rotate
  (lambda (left right)
    (define rotate-recur
      (lambda (left right accumulator)
        (if (null? left)
            (lcons (lcar right) accumulator)
            (lcons (lcar left) (rotate-recur (lcdr left)
                                             (lcdr right)
                                             (lcons (lcar right) accumulator))))))
    (rotate-recur left right '())))

(let ((rotated (rotate (lhs-list five-items) (rhs-list five-items))))
     (check-equal? (take-n 5 rotated) '(1 2 5 4 3)
                    "Rotate reverses the right side list and concatenates it to the left."))

Make-queue implements the incremental reversal logic. Now, we no longer wait until the head list is empty to swap and reverse. Instead, we rotate the queue as soon as the tail list contains one more element than the head. This keeps the queue balanced, and ensures that we won’t run into an expensive reversal:

(define make-queue
  (lambda (left right)
    (if (<= (cdr right) (cdr left))
        (list left right)
        (list (cons (rotate (car left)
                            (car right))
                    (+ (cdr left) (cdr right)))
              (cons '() 0)))))

(let ((rebalanced (make-queue (left-side five-items)
                              (right-side five-items))))
 (check-equal? (take-n 5 (lhs-list rebalanced)) '(1 2 5 4 3))
 (check-equal? (rhs-list rebalanced) '()
               "Make-queue rebalances the queue when the right side is longer than the left."))

To maintain a balanced queue, we’ll want to call make-queue on insertion and removal. Here’s an improved insert, and a new remove:

(define ins
  (lambda (item queue)
    (make-queue (left-side queue)
                (cons (lcons item (rhs-list queue))
                      (+ 1 (rhs-len queue))))))

(let ((three-items (ins 3 (ins 2 (ins 1 empty-q))))
      (six-items (ins 6 (ins 5 (ins 4 (ins 3 (ins 2 (ins 1 empty-q))))))))
  (check-equal? (take-n 3 (lhs-list three-items)) '(1 2 3))
  (check-equal? (take-n 3 (lhs-list six-items)) '(1 2 3))
  (check-equal? (take-n 3 (rhs-list six-items)) '(6 5 4)
                "Ins adds elements to the right side and 
                 rebalances if it's longer than the left."))

(define rem
  (lambda (queue)
    (if (and (null? (lhs-list queue)) (null? (rhs-list queue)))
        '()
        (list (lcar (lhs-list queue))
              (make-queue (cons (lcdr (car (left-side queue)))
                                      (- (lhs-len queue) 1))
                          (right-side queue))))))

(let ((removed (rem (ins 4 (ins 3 (ins 2 (ins 1 empty-q)))))))
  (check-equal? (car removed) 1)
  (check-equal? (take-n 2 (lhs-list (cadr removed))) '(2 3))
  (check-equal? (take-n 1 (rhs-list (cadr removed))) '(4)
                "Rem returns a pair: the element removed
                 from the queue and the new queue."))

Finally, let’s add a couple convenience functions to insert and remove multiple items:

(define ins-items
  (lambda (items queue)
    (if (null? items)
        queue
        (ins-items (cdr items) (ins (car items) queue)))))

(let ((seven-items (ins-items '(1 2 3 4 5 6 7) empty-q)))
  (check-equal? (take-n 7 (lhs-list seven-items)) '(1 2 3 4 5 6 7)
                "Ins-items adds multiple items to the queue."))

(define rem-n
  (lambda (n queue)
    (define rem-n-iter
      (lambda (n queue items)
        (if (= 0 n)
            (cons (reverse items) queue)
            (rem-n-iter (- n 1)
                        (car  (cdr (rem queue)))
                        (cons (car (rem queue)) items)))))
    (rem-n-iter n queue '())))

(let ((remove-four (rem-n 4 (ins-items '(1 2 3 4 5 6 7) empty-q))))
  (check-equal? (car remove-four) '(1 2 3 4))
  (check-equal? (+ (lhs-len (cdr remove-four))
                   (rhs-len (cdr remove-four))) 3
                "Rem-n returns a list of removed items and the new queue."))

Next time, we’ll finally bring everything together to solve the breadth-first numbering problem.

In my last post, I wrote a simple functional queue as a stepping stone towards a breadth-first-numbering solution. By using an ordered and reversed list to represent the front and back of the queue, it’s cheap and easy to enqueue and dequeue items. A simple queue is fine for this toy example, but it’s inefficient, since it requires reversing the tail list whenever the head list is empty. We can do better with two improvements: lazy lists and incremental reversal.

Clojure’s lazy seqs seemed powerful and mysterious until I read through chapter 3 of SICP. Building a lazy list is based on two simple operations, delay and force:

(define-syntax delay
  (syntax-rules ()
    ((delay form) (lambda () form))))

(define force
  (lambda (delayed)
    (delayed)))

(let ((add-ones (delay (+ 1 1))))
  (check-pred procedure? add-ones)
  (check-equal? (force add-ones) 2))

Delay wraps a form in an anonymous function of no arguments. The resulting procedure can be stored and passed around like any other value, but it won’t perform the computation “stored” inside until it’s called. Force is the opposite of delay, forcing a delayed form to evaluate. From these two basics, we can build a lazy version of cons:

(define-syntax lcons
  (syntax-rules ()
    ((lcons item items) (cons item (delay items)))))

Racket’s macro system uses syntax-case macros, which are a little different from the comma-spliced defmacro beasts you know and love from Common Lisp and Clojure. In addition to enforcing good hygiene, the syntax-case macro system works by pattern matching against syntax objects. In lcons, any form that matches the pattern (lcons item items) is mapped to (cons item (delay items)). In the delay macro above, anything matching (delay form) maps to (lambda () form). We’re still defining the ways we want to change the syntax of our program, but the transformation is applied at a different level: to syntax objects instead of raw s-expressions.
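
To make the pattern matching a little more concrete, here’s a toy macro of my own (an illustration, not something from the post) with two patterns; the second uses the ... ellipsis to match any number of body forms:

(define-syntax my-unless
  (syntax-rules ()
    ;; Patterns are tried in order, and the matching template is expanded in place.
    ((my-unless test) #f)
    ((my-unless test body ...) (if test #f (begin body ...)))))

(check-equal? (my-unless #f 'ran) 'ran)
(check-equal? (my-unless #t 'ran) #f)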

With lcons finished, it’s easy to create lazy lists:

(define llist
  (lambda (items)
    (if (null? items)
        '()
        (lcons (car items) (llist (cdr items))))))

(define lazy (llist '(1 2 3 4 5)))

(check-pred pair? lazy
            "A lazy list is a pair: the head of the list and a delayed
            function.")

(check-equal? (car lazy) 1
               "A lazy list stores its head as the first item of a
               pair.")

(check-pred procedure? (cdr lazy)
            "A lazy list stores a delayed function as the second item
            of a pair.")

(check-equal? (car ((cdr lazy))) 2
               "Evaluating the delayed function returns the next item
               in the list.")

Just as eager lists are composed of nested pairs, lazy lists are composed of nested, delayed pairs:

(define eager-list (cons 1 (cons 2 (cons 3 (cons 4 '())))))
> eager-list
'(1 2 3 4)

(define lazy-list (lcons 1 (lcons 2 (lcons 3 (lcons 4 '())))))
> lazy-list
'(1 . #<procedure:...me/bfs/queue.scm:106:6>)

Like a normal list, the car of each pair is the list item, and the cdr represents the rest of the list. But it doesn’t return the next pair until it’s evaluated:

> (car lazy-list)
1
> (cdr lazy-list)
#<procedure:...queue.scm:106:6>
> ((cdr lazy-list))
'(2 . #<procedure:...queue.scm:106:6>)

With this behavior in mind, we can write lazy car and lazy cdr:

(define lcar
  (lambda (llist)
    (car llist)))

(check-equal? (lcar lazy) 1
              "Lazy-car is just like regular car!")

(define lcdr
  (lambda (llist)
    (force (cdr llist))))

(check-equal? (car (lcdr lazy)) 2
              "Lazy-cdr forces evaluation of the next list element.")

A take-n function is also handy for converting lazy lists back to eager ones:

(define take-n
  (lambda (n lazy-list)
    (if (= 0 n) '()
      (cons (lcar lazy-list)
            (take-n (- n 1) (lcdr lazy-list))))))

(check-equal? (take-n 4 (llist '(1 2 3 4 5 6 7))) '(1 2 3 4)
              "Take-n returns the first n elements of a lazy list.")

And that’s it! We’ve written all the basics necessary for lazy lists. (For a few more lazy-fied list operations, see section 3.5.1 of SICP).
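
For instance, here’s a quick sketch of a lazy map (my own example, not from SICP or the post) built on nothing but lcar, lcdr, and lcons:

(define lmap
  (lambda (func lazy-list)
    (if (null? lazy-list)
        '()
        (lcons (func (lcar lazy-list))           ; the head is computed eagerly...
               (lmap func (lcdr lazy-list))))))  ; ...and the rest stays delayed, since lcons delays its second argument

(check-equal? (take-n 3 (lmap (lambda (x) (* x x)) (llist '(1 2 3 4 5))))
              '(1 4 9)
              "Lmap squares only the elements that take-n actually forces.")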

Finally, we should make one important optimization. Over the course of list operations like lcdr, the same delayed form can be called many times. If the delayed computation is simple, this won’t be noticeably inefficient. In our case, we’re just storing values that are immediately returned (integers in these examples, and eventually some node representation in our numbering solution). But there’s no guarantee that delayed computations will be cheap! We could fill a lazy list with arbitrary computations just as easily:

> (lcons (+ 1 1) (lcons (+ 1 2) (lcons (+ 1 3) (lcons (+ 1 4) '()))))
'(2 . #<procedure:...queue.scm:106:6>)

And those computations could require a lot of work:

> (lcons (read-book "Finnegans Wake")
    (lcons (read-book "In Search of Lost Time")
      (lcons (read-book "Gravity's Rainbow") '())))
'("riverrun, past Eve and Adam's..." . #<procedure:...read.scm:10:5>)

In practice, we should memoize lazy computations, so subsequent calls look up their previously computed values. It’s an easy fix:

(define memoize
  (lambda (func)
    (let ((already-run? #f) (result #f))
      (lambda ()
        (if (not already-run?)
            (begin (set! result (func))
                   (set! already-run? #t)
                   result)
             result)))))

(define-syntax delay
  (syntax-rules ()
    ((delay form) (memoize (lambda () form)))))

Now that we’ve written lazy lists, we can use them to build an efficient functional queue for the breadth-first numbering problem.

I spent some time this weekend (okay fine, most of Saturday and Sunday afternoon) on an exercise Michael Baker shared on our “geeks” mailing list. The problem is a functional pearl from Chris Okasaki: given a binary tree, reproduce a structurally identical tree with its nodes numbered in breadth-first order.

For example, numbering this tree:

        a
       / \
      /   \
    b       d
   / \     / \
  .   c   .   .
     / \
    .   .

Should yield this tree:

        1
       / \
      /   \
    2       3
   / \     / \
  .   4   .   .
     / \
    .   .

If you’ve ever solved a search problem, this might sound stupid easy. But getting the details of a functional solution right can be a challenge. As Okasaki puts it in the paper:

…I presented the problem to many other functional programmers and was continually amazed at the baroque solutions I received in reply. With only a single exception, everyone who came near a workable answer went in a very different direction from my solution right from the very beginning of the design process. I gradually realized that I was witnessing some sort of mass mental block, a communal blind spot, that was steering programmers away from what seemed to be a very natural solution.

Before you read my baroque solution, you might want to try for yourself. I’ll wait.

Although I love Clojure, using built-in queues and lazy seqs felt like cheating. So I chose to use Racket with Rackunit, and tried to use as many primitives as possible.

Breadth-first traversal is easy with a queue, but an efficient functional queue can be tricky. Consing an element to the front of a Scheme list is cheap, but appending is expensive—it requires “cdring down” over all the elements. One solution (cribbed from Okasaki himself) is to represent a queue as a pair of lists. The list on the left is the head of the queue, so elements can be popped off in O(1) time. The right side represents the rest of the elements in reverse, so elements can be pushed onto the end in constant time.
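
To make that cost concrete, here’s a quick aside (a sketch of my own, not from the original post; my-append is just an illustrative name) showing why append is O(n): a hand-rolled append has to walk all the way down the first list, re-consing every element, before it can attach the second.

(define my-append
  (lambda (front back)
    (if (null? front)
        back                                     ; nothing left to walk: attach the second list
        (cons (car front)                        ; otherwise re-cons the current element...
              (my-append (cdr front) back)))))   ; ...and keep cdring down the first list

(check-equal? (my-append '(1 2 3) '(4)) '(1 2 3 4)
              "Append walks and rebuilds every element of the first list.")

Here are the first steps towards an implementation: an empty queue with left and right selectors.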

(define simple-q '((1 2 3) (6 5 4)))

(define empty-queue '(()()))

(check-equal? empty-queue '(() ())
              "An empty queue is a list containing two empty lists.")

(define right-side (lambda (queue) (car (cdr queue))))

(check-equal? (right-side simple-q) '(6 5 4)
              "The right side of a queue is the second list.")

(define left-side (lambda  (queue) (car queue)))

(check-equal? (left-side simple-q) '(1 2 3)
              "The left side of a queue is the first list.")

Inserting an item conses it on to the right-side list:

(define insert
  (lambda (item queue)
    (list (left-side queue) (cons item (right-side queue)))))

(check-equal? (insert 7 simple-q) '((1 2 3) (7 6 5 4))
              "Inserting an element adds it to the beginning of the
              right side list.")

To dequeue an item, “remove” it from the left side with car, and return a new queue, with the cdr of the left side list:

(define remove
 (lambda (queue)
  (list (car (left-side queue))
        (list (cdr (left-side queue)) (right-side queue)))))

(check-equal? (remove simple-q) '(1 ((2 3) (6 5 4)))
              "Removing an element returns a pair: the removed
               element and the new queue.")

When the left side is out of elements, reverse the right side list, and swap it with the left. Here’s the buildup to swap-and-reverse-car:

(define swap
  (lambda (queue) (list (right-side queue) (left-side queue))))

(check-equal? (swap simple-q) '((6 5 4) (1 2 3))
              "The right side and left side can be swapped.")

(define reverse
  (lambda (items)
    (if (null? (cdr items))
        items
        (append (reverse (cdr items)) (list (car items))))))

(check-equal? (reverse (right-side simple-q)) (list 4 5 6)
              "A list's elements can be reversed.")

(define reverse-car
  (lambda (items)
    (cons (reverse (car items)) (cdr items))))

(check-equal? (reverse-car '((1 2) (3 4))) '((2 1) (3 4))
              "The first item in a list can be reversed.")

(define swap-and-reverse-car
  (lambda (queue) (reverse-car (swap queue))))

(check-equal? (swap-and-reverse-car '(() (6 5 4))) '((4 5 6) ())
              "Swap and reverse-car can be composed to swap sides,
              then reverse the left.")

Now we can write a dequeue function that really works:

(define remove
   (lambda (queue)
     (if (null? (left-side queue))
         (remove (swap-and-reverse-car queue))
         (list (car (left-side queue))
               (list (cdr (left-side queue)) (right-side queue))))))

(check-equal? (remove '(() (6 5 4))) '(4 ((5 6) ()))
              "To remove an element when the left side is empty, swap
              and reverse, then try again.")

(check-equal? (remove simple-q) '(1 ((2 3) (6 5 4)))
              "Removing an element returns a pair: the removed element
              and the new queue.")

That’s all it takes to build a simple functional queue. Unfortunately, it’s not very efficient. Reversing a list is the kind of O(n) operation we built our queue to avoid in the first place, and if many more items are inserted than removed, the right side grows long and we eventually pay for one big reversal all at once.
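
Here’s a quick demonstration (a sketch of my own using the definitions above; insert-items is a throwaway helper, not part of the queue itself):

(define insert-items
  (lambda (items queue)
    (if (null? items)
        queue
        (insert-items (cdr items) (insert (car items) queue)))))

;; After ten inserts and no removes, the left side is empty, so the very
;; first remove has to swap and reverse the entire ten-element right side.
(let ((lopsided (insert-items '(1 2 3 4 5 6 7 8 9 10) empty-queue)))
  (check-equal? (left-side lopsided) '())
  (check-equal? (car (remove lopsided)) 1))

We can do better—and I’ll explain how in my next post.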

Tonight I attended a meetup on Cascalog, a Clojure DSL built on top of the Hadoop MapReduce framework. (Slides here and code here if you’d like to check it out yourself).

Running huge, complicated queries with a few lines of code was awesome, but my Hadoop installation made a lot of noise whenever I ran a query in the REPL using the ?<- query executor, printing lots of unwanted log output to stdout. (I was using Hadoop installed via Homebrew instead of the version recommended in the README). Fortunately, it’s easy to hush the logger a little by running queries inside cascalog.io/with-log-level. Here’s a quick two-line macro that wraps calls to ?<- in with-log-level to quiet down Hadoop:

  (require '[cascalog.io :refer [with-log-level]])
  (defmacro ?<-- [& forms] `(with-log-level :fatal (?<- ~@forms)))

For future reference, you can find a gist here.

TDD

I was puzzled. Reading raw POST requests across a network socket returned headers, but never body content. When I mentioned it to Colin and showed him the method I suspected was misbehaving, he asked me a question: “Why does that exist? Which test brought that line of code into being?”

In a perfect TDD world, this question always has a good answer. In my case, it didn’t: this was a complicated line that was twice removed from the test that “created” it. But the path forward was clear right away: write a test for this line, in isolation, the way I should have done from the start.

Good test-driven development requires a lot of restraint and self-discipline, and lines like my faulty byte reader are guaranteed to sneak in if I break one of the laws of TDD—even if it’s only a couple lines of support code that aren’t written just to pass a test, or a test that looks comprehensive but is really trying to cover too much at once. This question is a great way to hunt them down and fix them: ask every line of code you write to justify its existence.

As a young software engineer, I learned three variables by which to manage projects: speed, quality, and price. The sponsor gets to fix two of these variables and the team gets to estimate the third. If the plan is unacceptable, the negotiating starts. This model doesn’t work well in practice. Time and costs are generally set outside the project. That leaves quality as the only variable you can manipulate. Lowering the quality of your work doesn’t eliminate work, it just shifts it later so delays are not clearly your responsibility. You can create the illusion of progress this way, but you pay in reduced satisfaction and damaged relationships. Satisfaction comes from doing quality work. The variable left out of this model is scope. If we make scope explicit, then we have a safe way to adapt, we have a safe way to negotiate, we have a limit to ridiculous and unnecessary demands.

—Kent Beck, Extreme Programming Explained

I spent all last week working on my web server. It’s a fun project so far, filled with the joy of taking something apart and looking inside to see how it works, but it’s also been a challenge: I had a long checklist of features to implement and only a week to get them all working.

Even though I was still frantically coding on the train to work the day of my demo, I managed to check off all the boxes. Like the 100% test coverage my project reported, 100% box coverage felt great—like the satisfaction of crossing the last item off a long to-do list. But as any test-driven developer knows, even 100% test coverage can’t guarantee that a project will work. This week I learned that box coverage is the same: ticking off features is no guarantee of quality.

Sure, my server met the requirements, but much of the code wasn’t pretty, and I knew it. And though I was proud of the progress I made and looking forward to showing off my work, the demo went off the rails early on, when the server hung and crashed trying to handle concurrent connections. (If you’re thinking of using Thread.run(), you probably want Thread.start(), by the way). In an instant, all the little details I’d put effort into—nice-looking directory pages, support for extra headers and obscure content types, clean request parsing under the hood—were outweighed by one big defect.

The attitude towards quality at 8th Light is clear: quality is non-negotiable, we will never ship software with known defects, and when an unknown one slips by, we’ll fix it for free. That leaves scope as the only free variable in the planning and development process. Although the scope of my web server project was already explicit, I didn’t do a good job negotiating. In retrospect, it’s clear that showing a clean, stable server that only handles GET requests is a greater accomplishment than one with extra bells and whistles that’s prone to random catastrophic failure. But it sure felt good to check off all those boxes.

I’ve learned two lessons over the last week: first, quality and stability matter most. Never sacrifice quality, and never ever tolerate unstable code. Second, renegotiating and giving feedback is part of making scope explicit. Trading off quality for features is guaranteed to be a bad bargain.

Working on my HTTP server has required me to learn a little more about the technology that makes the web work. At the outset, I had only the vaguest notion of what a socket was, but once I saw a diagram of the TCP/IP layer model, it was clear—a socket is an abstraction. (More on the model here).

I was reminded right away of the diagram in Chapter 2.1 of SICP (the section on building a rational number arithmetic system). It’s not a coincidence: enforcing abstraction between layers is one reason the Internet and TCP/IP are such powerful tools. Any system that does anything interesting is necessarily composed of smaller parts. Whether they’re functions, objects, or sentences, the way they’re put together matters. But equally important is the way they interact, and how they’re separated. (Network protocols give good advice on this, too.)

“Very few writers really know what they are doing until they’ve done it. Nor do they go about their business feeling dewy and thrilled. They do not type a few stiff warm-up sentences and then find themselves bounding along like huskies across the snow. One writer I know tells me that he sits down every morning and says to himself nicely, “It’s not like you don’t have a choice, because you do – you can either type, or kill yourself.” We all often feel like we are pulling teeth, even those writers whose prose ends up being the most natural and fluid. The right words and sentences just do not come pouring out like ticker tape most of the time. […] For me and most of the other writers I know, writing is not rapturous. In fact, the only way I can get anything written at all is to write really, really shitty first drafts.”

–Anne Lamott on shitty first drafts, from Bird By Bird

This is supposed to be a professional blog, but I hope you’ll pardon this bit of language, which comes with some of the best and dearest advice on professionalism I’ve read.

At the end of last week, I moved on to the second project of my apprenticeship: writing a simple HTTP server from scratch. Starting the project was not rapturous. I knew the basics—model the filesystem, connect over a socket, and transfer specially-formatted text back and forth—but had no idea where to start or what test to write first. I read up on sockets, browsed through the HTTP spec, and stared dumbly into IntelliJ for a while, and at long last, I started typing. What Anne Lamott calls “shitty first drafts” the TDD world calls “spikes”—short experiments, sometimes without tests, to figure out what to work on next. A spike is like a “child’s draft,” allowed to run wild and break things on condition that it will be thrown out and replaced with something decent (and well-tested).

Of course, as soon as I had a few lines of code down, the ideas started to flow (I started by parsing headers, by the way), and within an hour I had a feature-complete HTCPCP server (HTTP is still a work in progress). Another reminder that programming is craft, and writing code really is a creative act.