For the most part, test-driven development has been a breeze so far.
In addition to
enforcing
good habits, watching red tests turn green provides an extremely satisfying
dopamine kick every few minutes, all day. Testing simple objects and actions in my Tic Tac Toe game—things
like the board, player behavior, and the Minimax algorithm—was
straightforward. But testing the view and controller classes that glue
them together required a little more thought.
Writing tests required interrupting the game loop to check on the behavior of the view and
controller objects. To start, I added optional flag arguments to many
methods that would break the game loop so I could make assertions
about game state. I quickly came to realize that this was a bad
solution, and I cringed the next day when a chapter of Clean Code
described boolean flags as one of the most rancid code smells around.
I came across the concept of test
doubles
and mock objects, and got the idea right away: create fake objects
with the same methods as real ones, override their behavior, and use
them as substitutes for their more complicated counterparts in unit tests.
Or at least, I thought I got the idea. With my tests passing and my game working, I felt pretty good about my
project. But wiring up a code coverage tool showed that my view and
controller classes were only half covered by my tests. What went
wrong? As it turned out, I was testing my mocks! Here’s the pattern I was following:
Create a subclass of the object you want to test.
Override or stub out methods to return predetermined output.
Write assertions against the behavior of the mock objects.
This meant, of course, that I was never actually testing the real
objects, only the fake ones, as the code coverage data revealed. Worse, all the tests
that I thought showed my code was working were essentially tautologies.
This probably appears obvious to experienced TDD practitioners, but it
was surprisingly easy to fall into this antipattern. For the record,
here’s how you should really use a mock object:
Create mocks of the objects that interact with the one you want to
test.
Override or stub out methods to return predetermined output.
Write assertions against the behavior of the real object
interacting with the test doubles.
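Sketched in Java with hypothetical Controller and View names (not my actual game classes), the corrected pattern looks like this:

```java
import java.util.ArrayList;
import java.util.List;

// the collaborator's interface
interface View {
    void showMessage(String message);
}

// a hand-rolled test double: it only records what it's told
class ViewDouble implements View {
    final List<String> messages = new ArrayList<>();
    public void showMessage(String message) { messages.add(message); }
}

// the real object under test
class Controller {
    private final View view;
    Controller(View view) { this.view = view; }
    void announceWinner(String player) { view.showMessage(player + " wins!"); }
}

public class MockDemo {
    public static void main(String[] args) {
        ViewDouble view = new ViewDouble();
        Controller controller = new Controller(view); // real Controller, fake View
        controller.announceWinner("X");
        // the assertion exercises the real object's behavior,
        // observed through the double
        if (!"X wins!".equals(view.messages.get(0))) {
            throw new AssertionError("unexpected message");
        }
        System.out.println("ok");
    }
}
```

The double never gets asserted on for its own sake; it just gives the real Controller something to talk to.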
In retrospect, this makes perfect sense. But I’ll be going over all my
tests with a careful eye tomorrow. Sometimes the green light isn’t
what it seems.
Of all the things John von
Neumann
invented (including the Minimax
theorem
behind my Tic Tac Toe AI, and the architecture of the computer it runs
on), Monte Carlo simulation is one of my favorites. Unlike some of his
other
breakthroughs,
the idea behind Monte Carlo simulation is simple: use probability
and computation to estimate solutions to hard problems.
Think of the craziest integral you’ve ever solved, and the sheaves of
notebook paper spent working out the area under that ridiculous curve.
The Monte Carlo approach is a clever hack: tack a graph of the
function to a dartboard, and start throwing darts at
random. As more and more darts hit the board, the ratio of darts that
land under the curve to total darts thrown will approximate the
proportion of the dartboard under the curve. Multiply this ratio by
the area of the dartboard, and you’ve computed the integral. As a student whose
favorite fourth grade problem solving strategy was “guess and check,”
and whose favorite tenth grade problem solving strategy was “plug it
into your TI-83,” I find the idea enormously appealing.
Wikipedia has a great example of estimating the value of
Pi
with a Monte Carlo simulation. I wrote a similar simulation in Python a couple
months ago, but wanted to check out Clojure’s Incanter library for
statistical computing and try my hand at a TDD solution.
To start, you’ll want the speclj
test framework for Clojure, which uses RSpec-like syntax and provides
a helpful autorunner. Create a new Leiningen project and add it as a dev
dependency and plugin in project.clj. Make sure you set up your spec
directory as described in the
documentation. You’ll also want to add Incanter to your project
dependencies. A minimal project.clj looks something like this:
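Something along these lines, with placeholder version numbers (check Clojars for current speclj and Incanter releases):

```clojure
(defproject montecarlopi "0.1.0-SNAPSHOT"
  :description "Monte Carlo estimate of Pi"
  :dependencies [[org.clojure/clojure "1.5.1"]
                 [incanter "1.5.4"]]
  ;; speclj goes in both the dev profile and the plugins vector
  :profiles {:dev {:dependencies [[speclj "2.9.1"]]}}
  :plugins [[speclj "2.9.1"]]
  ;; point the test runner at the spec directory
  :test-paths ["spec"])
```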
First, we’ll need a way to tell whether a given point is inside the
circle. Here’s a simple test:
core_spec.clj

(ns montecarlopi.core-spec
  (:require [speclj.core :refer :all]
            [montecarlopi.core :refer :all]))

(describe "in-circle?"
  (it "should return true for points inside the circle."
    (should (in-circle? [0 0.1])))

  (it "should return false for points outside the circle."
    (should-not (in-circle? [0.8 0.9]))))
Fire up the test runner with lein spec -a, and you’ll see an
exception, since the function doesn’t exist. Time to add it to
core.clj.
The let binding pulls x and y coordinates out of a vector
representing a point. If x squared plus y squared is at most
1, the point lies inside the circle. Save and watch the
autorunner turn green.
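Here’s a sketch that satisfies the spec (note the <=: the later points-in-circle spec counts boundary points like [0.0 1.0] as inside):

```clojure
(defn in-circle? [point]
  (let [[x y] point]                     ; pull coordinates out of the vector
    (<= (+ (* x x) (* y y)) 1)))        ; inside (or on) the unit circle?
```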
Next, we need a way to generate points at random, with x and y values
between 0 and 1. Here’s a test:
core_spec.clj

(describe "generate-random-point"
  (it "should return a point with x and y values between 0 and 1."
    (let [[x y] (generate-random-point)]
      (should (<= x 1))
      (should (>= x 0))
      (should (<= y 1))
      (should (>= y 0)))))
And a dead-simple implementation (clojure.core/rand conveniently
generates random floats between 0 and 1):
core.clj

(defn generate-random-point []
  [(rand) (rand)])
We need a lot of random points, so we’d better add a function to
generate them. Here’s a test and a function that passes:
core_spec.clj

(describe "generate-random-points"
  (it "should return the specified number of random points."
    (should= 100 (count (generate-random-points 100)))))
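A passing sketch can lean on repeatedly (generate-random-point repeated here so the snippet stands alone):

```clojure
(defn generate-random-point []           ; from earlier in core.clj
  [(rand) (rand)])

(defn generate-random-points [n]
  ;; a lazy seq of n fresh random points
  (repeatedly n generate-random-point))
```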
Once we’ve generated lots of points, we’ll want to know how many lie
inside the circle. This test is a little more complicated, since it
requires some mock points to test against:
core_spec.clj

(describe "points-in-circle"
  (it "should return only points inside the circle."
    (let [inside  [[0.0 0.0] [0.2 0.3] [0.1 0.8] [0.0 1.0] [1.0 0.0]]
          outside [[1.0 0.2] [0.8 0.8] [0.5 0.9] [0.8 0.7] [0.9 0.9]]
          points  (concat inside outside)]
      (should= 5 (count (points-in-circle points)))
      (should= inside (points-in-circle points))
      (should-not= outside (points-in-circle points)))))
But a passing solution is as easy as using in-circle? as a filter.
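As a sketch (in-circle? repeated for context):

```clojure
(defn in-circle? [[x y]]                 ; from earlier in core.clj
  (<= (+ (* x x) (* y y)) 1))

(defn points-in-circle [points]
  (filter in-circle? points))
```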
In order to plot the points, we’ll need to sort them into groups. A
map should do the trick. Let’s also take this chance to
pull inside, outside, and points out of the let
binding and define them as vars so we can reuse them.
Here’s the result after a quick refactor:
core_spec.clj

(def inside  [[0.0 0.0] [0.2 0.3] [0.1 0.8] [0.0 1.0] [1.0 0.0]])
(def outside [[1.0 0.2] [0.8 0.8] [0.5 0.9] [0.8 0.7] [0.9 0.9]])
(def points  (concat inside outside))

(describe "points-in-circle"
  (it "should return only points inside the circle."
    (should= 5 (count (points-in-circle points)))
    (should= inside (points-in-circle points))
    (should-not= outside (points-in-circle points))))

(describe "sort-points"
  (it "should return a map of correctly sorted points."
    (should= {:inside inside :outside outside}
             (sort-points points))))
This code passes, but it’s not as clear as it could be.
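A version in that spirit (sketched here with group-by and its awkward boolean-key destructuring; in-circle? repeated for context):

```clojure
(defn in-circle? [[x y]]                 ; from earlier in core.clj
  (<= (+ (* x x) (* y y)) 1))

(defn sort-points [points]
  ;; group-by keys the map by the predicate's result (true/false),
  ;; which then has to be destructured and re-labeled
  (let [{in true out false} (group-by in-circle? points)]
    {:inside in :outside out}))
```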
And now we’re just a step away from estimating Pi. We’re considering a
quarter circle inside a unit square. The area of the
square is 1, and the area of the quarter circle is 1/4 * Pi. So
multiplying the ratio of points inside the circle to total points by
four gives us our estimate:
core_spec.clj

(describe "estimate-pi"
  (it "should return four times the ratio of inside to outside points."
    (should= 2.0 (estimate-pi points)))

  (it "should return something close to pi when given a lot of points."
    (let [estimate (estimate-pi (generate-random-points 70000))]
      (should (> estimate 3.13))
      (should (< estimate 3.15)))))
The second part of this test is a little tricky. Our estimate is
probabilistic, but with enough points it should almost always generate
an estimate within range. Passing the test is easy.
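A sketch, with helpers repeated for context (note the 4.0, so the result is a double and should= 2.0 can pass):

```clojure
(defn in-circle? [[x y]]                 ; from earlier in core.clj
  (<= (+ (* x x) (* y y)) 1))

(defn points-in-circle [points]          ; from earlier in core.clj
  (filter in-circle? points))

(defn estimate-pi [points]
  ;; four times the fraction of points that landed inside the circle
  (* 4.0 (/ (count (points-in-circle points))
            (count points))))
```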
All that’s left is to create an Incanter
chart. We’ll start by plotting the circle with Incanter. First, we
need to write a function. Here’s a test for points on x² + y² = 1:
core_spec.clj

(describe "circle"
  (it "should equal 1 when x is 0."
    (should= 1.0 (circle 0)))
  (it "should equal 0 when x is 1."
    (should= 0.0 (circle 1)))
  (it "should equal 0.866 when x is 0.5"
    (should= "0.866" (format "%.3f" (circle 0.5)))))
To write the function, make sure you refer incanter.core/sqrt into
your namespace.
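The function itself is one line. This sketch uses the interchangeable Math/sqrt so it runs without Incanter on the classpath; in the project, incanter.core/sqrt is a drop-in:

```clojure
;; solving x² + y² = 1 for y gives the upper quarter-circle
(defn circle [x]
  (Math/sqrt (- 1.0 (* x x))))
```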
At this point, writing tests gets a little hairy. The JFreeChart
object on which Incanter plots are based doesn’t
offer much
in the way of public fields to test against. But we
can at least check that the function returns a chart:
core_spec.clj

(describe "draw-circle"
  (it "should be a JFreeChart object."
    (should= "class org.jfree.chart.JFreeChart"
             (str (.getClass (draw-circle))))))
Refer incanter.charts/function-plot to plot the function.
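A sketch, assuming function-plot is referred into the namespace and circle is defined as above (this one needs Incanter on the classpath):

```clojure
;; assumes (:require [incanter.charts :refer [function-plot]]) in the ns form
(defn draw-circle []
  (function-plot circle 0 1))            ; plot circle over x in [0, 1]
```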
“Experienced Lisp programmers divide up their programs differently.
As well as top-down design, they follow a principle which could be
called bottom-up design–changing the language to suit the problem.
In Lisp, you don’t just write your program down toward the language,
you also build the language up toward your program. As you’re writing
a program you may think ‘I wish Lisp had such-and-such an operator.’
So you go and write it. Afterward you realize that using the new
operator would simplify the design of another part of the program, and
so on. Language and program evolve together. Like the border between
two warring states, the boundary between language and program is drawn
and redrawn, until eventually it comes to rest along the mountains and
rivers, the natural frontiers of your problem. In the end your program
will look as if the language had been designed for it. And when
language and program fit one another well, you end up with code which
is clear, small, and efficient.”

(Paul Graham, On Lisp)
Lisp programmers have a reputation (earned or otherwise) for considering their language of
choice uniquely powerful, capable of
extending the limits of the
world
with its special
expressiveness.
I find writing Lisp a joy, and for a long time I bought into the
mythos. It still feels unique. But I’ve come to learn that good bottom-up design is possible in any language.
For the past two weeks, I’ve been working in Java, a
language many programmers consider
Blub incarnate. (You don’t have to take my word for
it).
But even without clever macros and first-class functions, following
the rules of test-driven
development
and the principles of clean
code
feels a lot like the process of natural, iterative evolution Paul
Graham describes.
A good language can encourage bottom-up, evolutionary design, and
this is one of Lisp’s great strengths. But
writing good tests—and writing them first—can go a step further and
actually enforce it.
Writing tests first requires describing abstractions before
they exist—writing the program you want to read from the very start. Using meaningful names transforms the language you have
into the one you want. Revising after every passing test makes
simplifying design second nature. And building up a program test by
tiny test is an evolutionary process that generates clean, efficient
code, whether you’re writing Common Lisp or COBOL.
Here’s a function that returns a given game board’s winner from my
first crack at Tic Tac Toe in Clojure:
tictactoe.core/get-win

(defn get-win
  "Takes a 3x3 game board. Returns a vector [winner start middle end]
  of the winning player, and (row, col) grid coordinates of the
  three-in-a-row elements."
  [board]
  (let [wins (check-for-wins board)
        winner (if (= 1 (first (remove nil? wins))) 1 0)
        [row col diag] (unflatten wins)]
    (cond (not-empty-row? row)
          [winner [(get-row-win row) 0]
                  [(get-row-win row) 1]
                  [(get-row-win row) 2]]

          (not-empty-row? col)
          [winner [0 (get-row-win col)]
                  [1 (get-row-win col)]
                  [2 (get-row-win col)]]

          (not-empty-row? diag)
          (if (= 0 (get-row-win diag))
            [winner [0 0] [1 1] [2 2]]
            [winner [0 2] [1 1] [2 0]]))))
Which excerpt reads more like a domain-specific language? Which
would you rather read a year from now? I don’t doubt that I could
clean up the Clojure into something just as simple and readable. But
merely using an elegant language is no guarantee of elegant design.
I saved the view components of my Tic-Tac-Toe game for last. The
functionality I need to implement for a command line application is
pretty simple: the view should be able to print game boards and
messages to the terminal, prompt for user input and pass it off to a
controller, and not much more. But doing this the TDD way was harder
than I expected. Here are the solutions I found to two testing problems.
Testing text output
Printing a board should be a one-liner, especially since I added a
toString() method to my Board objects. System.out.println(Board)
ought to work just fine. But wait—the tests come first.
Writing tests against printed terminal output isn’t tough, but it does
require some setup. System.setOut() redirects Java’s default output
to a byte stream of your choice.
First, save the default System.out (it’s a PrintStream) so you can switch back to stdout later.
Then initialize a ByteArrayOutputStream to receive the output.
Add a setUp() method with the JUnit @Before annotation—the usual
pattern to run some code before your tests. Pass System.setOut() a
UTF-8 PrintStream object constructed with the byte array. (You are
using UTF-8 everywhere, right?)
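Here’s a self-contained sketch of that setup (class and field names are mine, not necessarily the project’s; the main method stands in for a JUnit test):

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class OutputCaptureExample {
    // keep a handle on the real stdout so we can switch back
    private static final PrintStream stdout = System.out;
    private static final ByteArrayOutputStream output = new ByteArrayOutputStream();

    // in JUnit, annotate this with @Before
    static void setUp() throws Exception {
        // redirect System.out into the byte array, decoding as UTF-8
        System.setOut(new PrintStream(output, true, "UTF-8"));
    }

    // ...and this with @After, so later tests print to the console again
    static void tearDown() {
        System.setOut(stdout);
    }

    public static void main(String[] args) throws Exception {
        setUp();
        System.out.println("X|O");           // stands in for printing a board
        tearDown();
        String printed = output.toString("UTF-8");
        if (!printed.contains("X|O")) throw new AssertionError("capture failed");
        System.out.println("captured: " + printed.trim());
    }
}
```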
Now you’re ready to write that one-liner. You can find the whole test
class as a gist here.
Testing Unicode output with Infinitest
With my test in place, I wrote a quick print method and waited for the
green flash. And waited. And waited… I’ve been using
Infinitest, an excellent continuous testing
plugin for IntelliJ and Eclipse, but it threw an assertion error, even
though the tests passed when run directly from IntelliJ.
The problem? I wasn’t using Unicode
everywhere! (Don’t say I didn’t warn you.) My boards print with Unicode box
drawing characters that
threw off Infinitest. IntelliJ runs in a UTF-8 environment by default,
but character encoding must be passed as an option to the Infinitest
JVM. Fortunately, this is another one-line solution: add a text file
named infinitest.args in your project’s root directory with one
argument per line. In this case, just:
-Dfile.encoding=UTF-8
With tests back in the green, I was ready to start testing user
input—but that’s a yak shave for another day.
Like many young whippersnapper kids these days, I started using GitHub before I really understood git. The benefits of version control were obvious right away, but I stuck to the very simplest git commands for a long time, and many of them turned into habits.
This week, I’ve been working on making cleaner commits and writing better messages. Composing good commit messages is the best git habit of all, and I’ve found two simple substitutions that nudge me towards more thoughtful commits.
If you’re used to git add -u or git add -A, try:
git add -p
Instead of committing everything at once, this will walk you through the results of git diff, offering the option to stage each chunk of changed code:
@@ -34,5 +35,45 @@ public class GameTree {
+
+ public List<Node> getLeaves() {
+ List<Node> leaves = new ArrayList<Node>();
+
+ for (Node child : children) {
+ if (child.children.isEmpty()) {
+ leaves.add(child);
+ } else {
+ leaves.addAll(child.getLeaves());
+ }
+ }
+ return leaves;
+ }
Stage this hunk [y,n,q,a,d,/,K,g,e,?]?
(You can find the alphabet soup of staging options here). This is easier to read than a full diff, requires a careful review of each change in context, and makes composing small atomic commits much easier.
Then, if you’re used to git commit -m <message>, try:
git commit -F <filename>
Instead of pulling a commit message from the command line or opening an editor, this uses an existing file as your commit message. But the real trick here is to keep an editor open as you page through the previous command, and compose your commit message chunk by chunk.
There are many, many more git workflows, including ways to commit with reckless abandon and clean up later with git rebase, but these two simple changes have helped me make much better commits just by switching a few command line flags.
My first project obeying these new commandments is a rewrite of Tic-Tac-Toe in Java, a language completely new to me. Like all new projects, I began in a wilderness of error. IntelliJ was totally unfamiliar. Type declarations felt fussy and foreign. I just wanted to map over an array but had to settle for another for loop. And yet test by test, things slowly started to work.
After three days, I can’t claim to know Java. But I can claim that I know my Java works.
The clarity and certainty that come from good tests are feelings I’ve experienced before. Over the past few years, the practices and principles of Getting Things Done have transformed the way I work, think, and act. (Have I told you about our church? Would you like to read some inspirational literature?) Here is a formulation of the basic idea that might look familiar:
Don’t spend time and effort on anything but a project’s next incomplete action.
Choose the smallest next action sufficient to accomplish something useful.
Don’t spend more time or effort than necessary to finish an incomplete action.
To the uninitiated, writing tests and making lists might seem inflexible, robotic, and a little dorky. But TODO lists and test suites are really tools that externalize uncertainty. The vague sense that I really should do that thing becomes a clear action that will be in my inbox later. The nagging fear that a method might misbehave becomes proof that it does and will always do exactly what I expect.
I skated through school and college on cleverness and a keen balance of terror. And it worked! I wrote code that seemed to do what I told it long before I started writing tests. (Tests aren’t necessary for working code: they’re sufficient to prove that code works). But to claim that other ways work just fine is to miss the point. GTD and TDD are systems that radically, ruthlessly eliminate uncertainty—not because the rules are necessary in the strictest sense, but because they are sufficient to get things done, make things work, and do it all with a clear mind.