Solving Project Euler problem 2 in Clojure

Here’s a Clojure solution to problem 2 from Project Euler (use tag euler to see solutions to some of the other problems):

; The following problem is taken from Project Euler.
; http://projecteuler.net/index.php?section=problems&id=2

; Problem:
; Each new term in the Fibonacci sequence is generated by adding the previous two terms. 
; By starting with 1 and 2, the first 10 terms will be:
;
; 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...
;
; By considering the terms in the Fibonacci sequence whose values do not exceed four million, 
; find the sum of the even-valued terms.

(ns com.icyrock.clojure.euler.problem2
  (:use clojure.test))

(defn fibs 
  ([] (fibs 1 2))
  ([a b] (lazy-seq (cons a (fibs b (+ a b))))))

(deftest fibs-test
  (is (= (take 10 (fibs)) '(1 2 3 5 8 13 21 34 55 89))))

(run-tests)

(defn solution []
  (println (reduce + (filter #(= 0 (mod % 2)) (take-while (partial > 4e6) (fibs))))))
(solution)

Let’s dissect the main points there. First, the test

(deftest fibs-test
  (is (= (take 10 (fibs)) '(1 2 3 5 8 13 21 34 55 89))))

(run-tests)

This is the test as outlined in the problem. The above just takes first 10 elements of sequence generated by fibs function and compares to the expected result.

fibs function looks like this:

(defn fibs 
  ([] (fibs 1 2))
  ([a b] (lazy-seq (cons a (fibs b (+ a b))))))

There are two very important features of Clojure in play here:

  • Destructuring, a very good explanation here
  • Lazy sequences, excellent answer from mikera

Destructuring in the above case is used in a way you would use default parameters in some languages. The above just says:

  • When no arguments are passed, run (fibs 1 2), essentially defaulting the parameters to 1 and 2
  ([] (fibs 1 2))
  • For all other cases (including the one from above – the default parameters case), expect two parameters a and b and run this:
  ([a b] (lazy-seq (cons a (fibs b (+ a b))))))
  • Note that you are allowed to call fibs only in the above two cases – running it in any different way – e.g. (fibs 1 2 3) will result in an exception such as:
Wrong number of args (3) passed to: problem2$fibs
  [Thrown class java.lang.IllegalArgumentException]

Lazy sequences are a powerful abstraction. This is not new to Clojure – e.g. Haskell is a programming language that is lazy by default. However, Clojure being a JVM language offers something new in that area.

The idea behind the lazy sequence is to do two things – evaluate only when the result is actually needed and cache the result for future invocations. Caching is enabled by Clojure’s immutability by default – in general, Clojure data structures are immutable.

In our case:

  ([a b] (lazy-seq (cons a (fibs b (+ a b))))))

What happens is that when fibs is called with two arguments, a lazy sequence is returned and no evaluation of it’s inner parts – i.e. the following part:

(cons a (fibs b (+ a b)))

occurs at this time. The evaluation is triggered whenever fibs is called. For example, we can do something like this in REPL:

user> (ns com.icyrock.clojure.euler.problem2)
com.icyrock.clojure.euler.problem2> (take 5 (fibs))
(1 2 3 5 8)

What happens here is that take function actually asks fibs to produce results. At this point, lazy-seq’s inner part is called and, replacing the starting parameters, we get this:

(cons 1 (fibs 2 (+ 1 2)))

What we get is a cons. When you say:

(cons a bs)

you basically have a two-cell data structure. The first cell contains a and the second cell contains bs. Note that bs is a sequence – in other Lisps, this was usually denoted by adding an “s” to the name, to make it look like a plural form of “b”. This is the basis for a list – a list is a list of cons. As an example:

(list 1 2 3 4)

is logically much the same as:

(cons 1 (cons 2 (cons 3 (cons 4 nil))))

taking into account that nil is the same as the empty list or ‘().

To return to our example:

(cons 1 (fibs 2 (+ 1 2)))

gives us a cons of 1 and (fibs 2 (+ 1 2)). Given the previous discussion, you can look at this as a list which has elements 1 and then all elements of (fibs 2 (+ 1 2)), whatever they are. Note, however, one important thing here – fibs will again return a lazy sequence. So, if you only wanted the first element of the list – e.g.

com.icyrock.clojure.euler.problem2> (take 1 (fibs))
(1)

this is all that would be evaluated. The next call to fibs – (fibs 2 (+ 1 2)) would not occur at this time.

Now, we did take of 5, so we are going to need more elements. And what are they? Well, the call is (fibs 2 (+ 1 2)) which is the same as (fibs 2 3) after the sum is calculated. Expanding that in the same way we get:

(cons 2 (fibs 3 (+ 2 3)))

So our list is basically 1, 2 and then whatever (fibs 3 (+ 2 3)) returns. And so on until we figure out the first 5 elements. After that, we would be left with a lazy sequence as the second part of our last cons which would not be evaluated.

There are two huge benefits here. First, it’s obvious that we evaluate only what we need, so we don’t waste either time or memory (and possibly other resources) evaluating unnecessary parts of the sequence.

Even more important is the fact that we have an infinite list. fibs can go on and on and will never finish, but being lazy allows us to handle that.

At the end, we do what we really need to arrive at the result we are looking for:

(defn solution []
  (println (reduce + (filter #(= 0 (mod % 2)) (take-while (partial > 4e6) (fibs))))))
(solution)

A lot of nesting here – let’s break this one down inside-out:

  • Calling:
(fibs)

will lazily generate all Fibonacci numbers

  • Let’s first examine:
(partial > 4e6)

What this does is make a function that when passed an argument returns whether 4e6 is greater then the argument passed in. This is the same as:

#(> 4e6 %)
  • This part:
(take-while (partial > 4e6) (fibs))

is a simple and powerful concept for filtering lists.

  • What it does is following:
    • Take the input sequence (let’s call it I) – whatever (fibs) returns, and it’s Fibonacci list in this case
    • Start generating a new sequence (let’s call it S) from the beginning of the sequence
    • S is generated by adding the members of I one by one while the given function is true
    • Function in this case is (partial > 4e6), so what we essentially achieve here is generate a sequence of numbers not greater then 4 million, as per the problem requirement. Let’s call this list L4 going forward
  • In the next step we have
(filter #(= 0 (mod % 2)) L4)
  • What filter does is:
    • Go through the given list – L4 in this case and return a list of the elements that satisfy the given function
    • The function in this case is #(= 0 (mod % 2)), which basically gives us elements that are evenly (with no remainder) divisible by 2
    • Let’s call this list L2
  • The next part:
(reduce + L2) 

is again a nice function

  • What it does:
    • Let’s assume we have a list l = (a1, a2, a3, a4, a5, …, aN)
    • What (reduce f l) basically does is this:
      • r1 = (f a1 a2)
      • r2 = (f r1 a3)
      • r3 = (f r2 a4)
      • rN-1 = (f rN-2 aN)
      • rN-1 is the result returned by (reduce f l)
    • In our case, what it would do is basically apply a sum (+) function to the list, effectively giving us a sum of the numbers in the list

Finally, we print the result and that’s what we use to solve the problem.