Clean Clojure: Small Functions
This is part 2 in a series on clean Clojure. Previously: meaningful names.
In Clean Code, Uncle Bob proposes two rules for good functions: “The first rule of functions is that they should be small. The second rule of functions is that they should be smaller than that.” Useful rules, but Clojure requires one more: they’re still not small enough.
Functional languages and immutable data make reasoning easy by making functions simpler. Functions take input, transform it, and return new output. Data passes through functions, flowing rather than mutating. Complicated functions make simple hard, and they can be dangerously easy to write.
Hyperbole aside, there really are two simple rules for functions: they should be small, and do one thing. In his presentation on functions, Uncle Bob describes a simple algorithm for cleaning up crufty functions:
- Pick a function.
- Extract functions until it does one thing.
- Recur on extracted functions.
Instead of a contrived wombat example, I’ll use one of my own disgusting old 4Clojure solutions as an example of atrocious code. (But cut me some slack! I was young and naive.)
Here’s the problem:
“Write a function which takes a collection of integers as an argument. Return the count of how many elements are smaller than the sum of their squared component digits. For example: 10 is larger than 1 squared plus 0 squared; whereas 15 is smaller than 1 squared plus 5 squared.”
And here’s my answer (hide the children):
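What follows is a hypothetical reconstruction, not the original solution: a sketch of the nested style this post goes on to take apart. It assumes an anonymous function bound to the illustrative name `solution`, a `let`-bound helper called `lt-sqd-components`, character-code subtraction to turn characters into digits, and non-negative integer input.

```clojure
;; Hypothetical reconstruction for illustration; not the original listing.
;; `solution` is an assumed name so the anonymous function can be called.
(def solution
  (fn [coll]
    (let [lt-sqd-components
          (fn [n]
            (let [digits  (map #(- (int %) 48) (str n)) ; chars -> digits (48 is the code of \0)
                  squares (map #(* % %) digits)         ; square each digit
                  sum     (reduce + squares)]           ; add the squares
              (< n sum)))]
      ;; map to booleans, keep the true ones, count them
      (count (filter true? (map lt-sqd-components coll))))))
```

Calling `(solution (range 10))` counts the numbers 2 through 9, since 0 and 1 equal the sum of their squared digits.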
Like nested blocks in other languages, code that sprawls rightward indicates a problem, and it can happen fast in Clojure.
To start, we’ll extract `lt-sqd-components` from the `let` binding. (This is a common, awful 4Clojure hack for defining a named function inside an anonymous one, though the discerning 4Clojurist uses `letfn`.)
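As a rough sketch of this extraction, the helper can move out of the `let` binding and become a top-level definition. The body of `lt-sqd-components` and the name `solution` are assumptions made for illustration, and the input is assumed to be non-negative integers.

```clojure
;; Assumed helper body: the predicate now lives at the top level.
(def lt-sqd-components
  (fn [n]
    (let [digits  (map #(- (int %) 48) (str n))
          squares (map #(* % %) digits)
          sum     (reduce + squares)]
      (< n sum))))

;; `solution` is a placeholder name; it no longer nests the helper.
(def solution
  (fn [coll]
    (count (filter true? (map lt-sqd-components coll)))))
```

The outer function drops two levels of indentation, and the helper can now be tried out on its own at the REPL.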
The original function is almost readable, but we can do better. It looks like I didn’t understand `filter` when I wrote this: the extra `map` is redundant since `lt-sqd-components` is already a predicate function that returns `true` or `false`.
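To illustrate why the `map` is redundant, here is a small comparison. The predicate body is a condensed stand-in, and `count-with-map` and `count-with-filter` are hypothetical names used only for this comparison.

```clojure
;; Condensed stand-in for the predicate (assumes non-negative integers).
(defn lt-sqd-components [n]
  (< n (reduce + (map #(* % %) (map #(- (int %) 48) (str n))))))

;; Before: map every element to a boolean, then keep the true ones.
(defn count-with-map [coll]
  (count (filter true? (map lt-sqd-components coll))))

;; After: filter calls the predicate itself, so the map is unnecessary.
(defn count-with-filter [coll]
  (count (filter lt-sqd-components coll)))
```

Both versions return the same count for any input; the second simply lets `filter` do the job it was designed for.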
This does one thing, so let’s clean it up and move on. It needs a name, and the function we’re filtering against needs a question mark.
And now the recursive step. Let’s look at the terribly named `lt-sqd-components`. Each line in its `let` binding does something different. One splits a number into a sequence of its digits:
One squares every element in a sequence:
And one takes the sum of the collection.
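A minimal sketch of these three extractions might look like the following. The name `square-all` appears later in this post; `digits` and `sum` are assumed names, and the function bodies are guesses at what each line of the `let` binding did.

```clojure
;; Splits a non-negative integer into a sequence of its digits.
(defn digits [n]
  (map #(- (int %) 48) (str n)))

;; Squares every element in a sequence.
(defn square-all [coll]
  (map #(* % %) coll))

;; Takes the sum of a collection.
(defn sum [coll]
  (reduce + coll))

;; The let binding now reads as three named steps.
(defn lt-sqd-components? [n]
  (let [ds      (digits n)
        squares (square-all ds)
        total   (sum squares)]
    (< n total)))
```

For 15, the steps produce `(1 5)`, then `(1 25)`, then 26, which is larger than 15.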
One more function to extract: the `let` binding should be its own function. One might argue that this function does one thing: all it does is check whether a number is less than the sum of its squared components! But it’s operating on several different levels of abstraction: digits, a sequence of digits, and their sum. A helpful guideline is limiting functions to one level of abstraction. In this case, the function should only know about the sum.
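One way to picture the single-level-of-abstraction guideline, assuming helpers named `digits`, `square-all`, and `sum` with the bodies shown (all illustrative):

```clojure
;; Assumed low-level helpers (non-negative integers only).
(defn digits [n] (map #(- (int %) 48) (str n)))
(defn square-all [coll] (map #(* % %) coll))
(defn sum [coll] (reduce + coll))

;; This function knows about digits, squares, and sums.
(defn sum-of-squared-digits [n]
  (let [ds      (digits n)
        squares (square-all ds)]
    (sum squares)))

;; The predicate only knows about the sum.
(def lt-sqd-components?
  (fn [n]
    (< n (sum-of-squared-digits n))))
```

The predicate now reads as a single comparison, and the details of how the sum is built live one level down.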
Despite its dumb name, `lt-sqd-components?` is doing one thing. Let’s clean it up. I prefer “digits” to “components”, and it should use `defn`.
On to `sum-of-squared-digits`. We can transform the `let` binding into a function using the threading macro (as suggested in the comments on my last post).
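A sketch of the threading-macro version, assuming the same illustrative helpers (`digits`, `square-all`, `sum`) as the steps so far:

```clojure
;; Assumed helpers (non-negative integers only).
(defn digits [n] (map #(- (int %) 48) (str n)))
(defn square-all [coll] (map #(* % %) coll))
(defn sum [coll] (reduce + coll))

;; -> threads n through each function in order,
;; equivalent to (sum (square-all (digits n))).
(defn sum-of-squared-digits [n]
  (-> n digits square-all sum))
```

The thread-first macro `->` removes the intermediate names from the `let` binding and leaves a pipeline that reads top to bottom, or here, left to right.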
We can do better. I don’t like the intermediate `square-all` step, which should be hidden in `sum-of-squares`:
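Under the assumption that `sum-of-squares` simply wraps `square-all`, the step could be sketched like this (the `digits` body is also assumed):

```clojure
;; Assumed helpers (non-negative integers only).
(defn digits [n] (map #(- (int %) 48) (str n)))
(defn square-all [coll] (map #(* % %) coll))

;; Squaring is now an internal detail of summing squares.
(defn sum-of-squares [coll]
  (reduce + (square-all coll)))

;; The pipeline has one step fewer.
(defn sum-of-squared-digits [n]
  (-> n digits sum-of-squares))
```

Callers of `sum-of-squared-digits` no longer see the squaring step at all.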
Extract the function literal in `square-all`. I’ve got a great name for it:
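Presumably the great name is `square`; that is a guess, and the sketch below assumes it.

```clojure
;; The function literal #(* % %) gets a name (assumed to be `square`).
(defn square [n]
  (* n n))

;; square-all now reads as plain English.
(defn square-all [coll]
  (map square coll))
```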
And there’s only one function left: splitting a number into a sequence of digits. Let’s extract and name the function literal:
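As a sketch of this extraction, assuming the function literal subtracted the character code of `\0` and using the hypothetical name `char->digit`:

```clojure
;; Hypothetical name for the extracted literal.
;; 48 is the character code of \0, so \5 (code 53) becomes 5.
(defn char->digit [c]
  (- (int c) 48))

;; Splits a non-negative integer into a sequence of its digits.
(defn digits [n]
  (map char->digit (str n)))
```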
And finally, clean it up by using `Integer/parseInt` instead of hacky subtraction:
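Putting the steps together, the finished program might look something like the sketch below. Only `square-all`, `sum-of-squares`, `sum-of-squared-digits`, and `Integer/parseInt` are named in this post; every other name is an assumption, and the code targets Clojure on the JVM with non-negative integer input.

```clojure
;; Integer/parseInt takes a string, so the character is converted first.
(defn char->digit [c]
  (Integer/parseInt (str c)))

(defn digits [n]
  (map char->digit (str n)))

(defn square [n]
  (* n n))

(defn square-all [coll]
  (map square coll))

(defn sum-of-squares [coll]
  (reduce + (square-all coll)))

(defn sum-of-squared-digits [n]
  (-> n digits sum-of-squares))

;; Hypothetical cleaned-up name for lt-sqd-components?.
(defn lt-sum-of-squared-digits? [n]
  (< n (sum-of-squared-digits n)))

;; Hypothetical name for the top-level solution.
(defn count-lt-sum-of-squared-digits [coll]
  (count (filter lt-sum-of-squared-digits? coll)))
```

For example, `(count-lt-sum-of-squared-digits (range 10))` counts the numbers 2 through 9.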
And there it is—clean, readable functions at all levels of abstraction, minimal nesting, and nothing longer than three lines. Starting from the top, low-level functions build into bigger abstractions through combination and composition. Each step is easy to read and comprehend.
As Uncle Bob puts it in Clean Code:
Master programmers think of systems as stories to be told rather than programs to be written. They use the facilities of their chosen programming language to construct a much richer and more expressive language that can be used to tell that story. Part of that domain-specific language is the hierarchy of functions that describe all the actions that take place within that system. In an artful act of recursion, those actions are written to use the very domain-specific language they define to tell their own small part of the story.
Extract. Simplify. Recur. Take the time to consider each line, and clean code comes naturally.