Saturday, January 3, 2015

Computing Pascal's Triangle


One of the coolest things about the Structure and Interpretation of Computer Programs has to be the exercises, in my view. Some of them are surprisingly deep explorations of mathematical or computational ideas, and even the ones that aren't still end up being quite interesting.

As the first real post for this blog, I figured I'd start off with one of the exercises in that latter category: not as deep, but still pretty cool. This exercise involves computing the numbers of Pascal's Triangle.

[For those that are interested, in the book we're dealing with Exercise 1.12 here. My worked solution in Scheme is available in my GitHub repository. However, in this post I'll be going wild with this exercise and looking at the problem from all sorts of weird angles (and all of the code will be in JVM languages, in keeping with the blog's theme.) So the Scheme solutions are still worth a look if you've read this post, and vice versa.]


So pretty much everyone has heard of Pascal's Triangle at one point or another, generally when they were first learning about binomial coefficients.

The triangle looks like this:

           1
          1 1
         1 2 1
        1 3 3 1
       1 4 6 4 1
     ...       ...


We can easily see that the numbers along the side edges of the triangle (and at the tip) are all 1's, and every other number is the sum of the two numbers that appear directly above it in the triangle.

Suppose we want to write some code to generate this triangle. The first difficulty is that the description of how the triangle is laid out in the previous paragraph is vague and imprecise. For instance, what does it mean for two numbers to be “directly above” another number? From the picture it's obvious, but when it comes to writing a program we're going to have to decide what we actually mean here.

Perhaps the most straightforward way is to instead visualise the triangle slanted over to the left, like this:

     1
     1 1
     1 2 1
     1 3 3 1
     1 4 6 4 1
     ...     ...

Then if we ignore the whitespace over on the right side for the moment, we can define our notion of “above” as the following:

Given the element in the ith row, jth column of the triangle, the two elements above it are:
  • the element in the (i-1)th row, jth column
  • the element in the (i-1)th row, (j-1)th column.

So for instance, the elements above the leftmost '4' in the diagram are the '3' directly on top of it, and the '1' up and over to the left.

Of course, we have to consider that this definition doesn't quite work for the first column of the triangle, which we'll call column 0. In this case, we don't have an element in row (i-1), column (j-1); that would put the element in column -1, which doesn't make sense!

The leading diagonal doesn't work either, for a slightly different reason. As you can see from the triangle diagram, there isn't an element on top of any of the leading diagonal ones, so the element in row (i-1), column j doesn't exist.

There are different ways to handle this. One easy way that we'll use here is to treat the thing as if it were a square (or an nxn matrix) instead, where the elements above the leading diagonal are always 0:

     1 0 0 0 0
     1 1 0 0 0
     1 2 1 0 0
     1 3 3 1 0
     1 4 6 4 1

If we then also rule that any element in column 0 is always a 1, which is true, then we're pretty much covered.


With all that settled, let's have some fun actually computing this thing! We'll explore three different alternatives for constructing the triangle:
  • using simple iteration (nested for loops, array access – sounds like a job for Java!)
  • using a clever data structure transformation to generate each row (this is what Scala is awesome for!)
  • using recursion to elegantly generate each element (which Clojure will handle beautifully.)


Iteration (the Java way)


So to start with, let's look at the iterative approach. In essence we're just using the above matrix idea in its most straightforward implementation, representing the matrix as a 2-dimensional integer array. We set all of the values to 1 in the 0th column (which conveniently completes the top row for us at the same time), and then work our way across and down, summing up to get the values in each position.

In Java, we could write this up as a simple object, PascalsTriangle, that takes the number of rows to use as a parameter and has the constructor do all of the setup work. Adding a simple printing method allows us to output the triangle in a neat form.

Sketched out, the class might look something like this:
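    // A rough sketch of such a class (names chosen to match the description above).
    public class PascalsTriangle {

        private final int[][] triangle;

        public PascalsTriangle(int rows) {
            triangle = new int[rows][rows];   // entries default to 0

            // Column 0 is all 1's (this also completes the top row for us).
            for (int i = 0; i < rows; i++) {
                triangle[i][0] = 1;
            }

            // Every other entry is the sum of the two entries above it.
            for (int i = 1; i < rows; i++) {
                for (int j = 1; j <= i; j++) {
                    triangle[i][j] = triangle[i - 1][j] + triangle[i - 1][j - 1];
                }
            }
        }

        public void print() {
            for (int[] row : triangle) {
                StringBuilder line = new StringBuilder();
                for (int value : row) {
                    line.append(value).append(' ');
                }
                System.out.println(line.toString().trim());
            }
        }

        public static void main(String[] args) {
            new PascalsTriangle(5).print();
        }
    }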


Running the code creates a 5x5 Pascal's triangle and prints it as output:

     1 0 0 0 0
     1 1 0 0 0
     1 2 1 0 0
     1 3 3 1 0
     1 4 6 4 1

So we see that the output we obtain from the code is exactly what we'd expect the triangle to look like given the way we've specified it.

Note in particular that this representation is quite simple and efficient due to the array access. We have a nested for loop to construct the 2-dimensional integer array, so for a given value of n (where n is how many rows we want), this process has a quadratic order of growth, i.e. O(n²) time-complexity and space-complexity.


Data Structure Transform (the Scala way)


One particularly cool way to look at constructing Pascal's triangle is to expand on the matrix metaphor by considering each row at a time. It would be awesome if we could find a way to compute any row of the matrix given only the previous row. This would mean that we could start with the first row (which always looks like [1,0,0, …, 0]), and then by successively applying the transform, we could generate every other row in turn.

It turns out that there is a way to do this! The trick lies in our use of the 0's as placeholders on the right-hand side of the matrix. This means that we can take a given row, shift everything in the row over by 1 to the right (filling in with a 0 on the left), and then add this back to the original row. The result will be the next row in the triangle:

Here's a mathematical representation of what's happening. If the current row is [x1, x2, x3, ..., xn] (trailing 0's included), then shifting right and adding looks like this:

       [x1, x2,    x3,    ..., xn     ]
     + [ 0, x1,    x2,    ..., xn-1   ]
     = [x1, x1+x2, x2+x3, ..., xn-1+xn]

Since x1 is always going to be 1, and every other entry in the result is the sum of the two entries above it in the triangle, this gives us exactly what we want!

So we can write a short Scala function to do exactly that. The function takes in a 'row', which we'll represent as a List of integers, and returns the new row, which is the element-wise sum of the old row and the old row shifted to the right. We can even write it as a gorgeous one-liner, along these lines:
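    // A sketch of such a one-liner. Prepending a 0 shifts the row to the right;
    // zipping stops at the shorter list, so the element that falls off the right
    // edge is simply dropped, and we add the pairs together.
    def nextRow(row: List[Int]): List[Int] = (row, 0 :: row).zipped.map(_ + _)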



So let's see it in action! We loop with this nextRow() function to create a list of all of the rows of Pascal's triangle, and then loop again to print the triangle out. A small driver along these lines would do the job:
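    // A rough sketch of a driver (the object and value names are arbitrary).
    object PascalsTriangle extends App {
      // The nextRow one-liner from above.
      def nextRow(row: List[Int]): List[Int] = (row, 0 :: row).zipped.map(_ + _)

      val n = 5
      var row = 1 :: List.fill(n - 1)(0)   // the first row: [1, 0, 0, ..., 0]
      var triangle = List(row)

      // Loop, applying the transform once for each remaining row.
      for (i <- 1 until n) {
        row = nextRow(row)
        triangle = triangle :+ row
      }

      // Loop again to print the triangle out.
      for (r <- triangle)
        println(r.mkString(" "))
    }

This gives us the same output as the Java version: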

     1 0 0 0 0
     1 1 0 0 0
     1 2 1 0 0
     1 3 3 1 0
     1 4 6 4 1

This way of constructing the triangle wasn't as obvious as the Java method, but actually turned out to be a cool way of looking at the problem. Here we're actually only keeping track of one row at a time (we just happen to be storing each of the computed rows in a list), so we don't need as much space this time around – if we were only interested in computing the nth row we'd have O(n) space-complexity, which is nice.

In terms of the order of growth for time, we can note that Scala's zip operation takes O(n) time, and the map operation takes O(n) time. But in our case these operations are applied sequentially (i.e. not nested), so the whole zipped.map operation is pretty much O(n) itself, which means that since it appears within that for loop that runs from 1 to n, we have O(n²) time-complexity.


Recursion (the Clojure way)


Of course, if you've looked at Exercise 1.12 in SICP, you'll know that we're asked to generate the elements of Pascal's triangle using a recursive process. This is also a nice way to do it; the recursion handles the summation, and our base cases handle the invariants of the matrix (the 1's in the first column and the 0's above the leading diagonal.)

Looking at the problem this way, the code almost writes itself. The most important part will be our pascal function, which computes the number in a given row and column. We then simply use this function to fill in all the values in the matrix in our make-pascal-triangle function.

(This code was easily translated to Clojure from my Scheme code for the exercise here. They're both Lisp dialects, so the two languages are very similar.)
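Here's a sketch of what those two functions might look like in Clojure:

    ;; A sketch of the two functions described above.
    (defn pascal [row col]
      (cond (zero? col) 1          ; first column: always a 1
            (= col row) 1          ; leading diagonal: always a 1
            (> col row) 0          ; to the right of the diagonal: always a 0
            :else (+ (pascal (dec row) col)            ; the element directly on top
                     (pascal (dec row) (dec col)))))   ; and the one up and to the left

    (defn make-pascal-triangle [n]
      (for [row (range n)]
        (for [col (range n)]
          (pascal row col))))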




The pascal function works exactly as the problem describes – it first checks whether we're at an invariant (Are we in the first column? We have the value 1, guaranteed. Are we on the leading diagonal? We have the value 1. Are we to the right of the leading diagonal? We always have a 0.).

If we don't have such a case, the function recursively calls itself to sum together the elements in the two positions above the current one. It'll then work out the values for those elements using this exact same process.

The make-pascal-triangle function then applies this procedure to calculate every single element.
(Note that this isn't particularly efficient. However if we were concerned at all, we could save time with memoization; that is, storing the results of earlier calculations and retrieving them when they're needed later by other recursive branches.)
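In Clojure, one quick way to get that (assuming the pascal function sketched above) is simply to rebind the var to a memoized wrapper:

    ;; Rebind pascal so that repeated (row, col) arguments come out of a cache
    ;; instead of being recomputed.
    (def pascal (memoize pascal))

Because the recursive calls inside pascal go through the pascal var, they pick up the memoized version as well.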

We can then construct the triangle and print it to the output with something like:
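    ;; A sketch, assuming the functions above.
    (doseq [row (make-pascal-triangle 5)]
      (apply println row))

The result matches up with all of the other output we've obtained so far: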

     1 0 0 0 0
     1 1 0 0 0
     1 2 1 0 0
     1 3 3 1 0
     1 4 6 4 1

Here the recursive process is a particularly beautiful way of calculating isolated elements of the triangle, but trying to use this to then construct the triangle by computing each individual element is kind of inefficient.

In a hypothetical worst-case scenario, the recursive pascal function calls itself twice with each new invocation, and we can simplify things by noting that it doesn't wrap up until it reaches the left edge of the triangle, which we assume can be up to n calls away. So computing a single element can take up to O(2ⁿ) time. (The space used is more modest: those recursive calls are delayed operations, but only one chain of them – at most n deep – is pending at any one time, so the space-complexity is O(n).) Combine this with the n² elements we have to compute to fill in the whole triangle, and you can see that we almost have a truly inefficient O(n² × 2ⁿ) order of growth in time!

Of course the reality is not quite that dire. In general, though, recursion (in this isolated, element-by-element way) isn't what you'd use if you wanted the whole triangle, even though it is the method that most closely matches the mathematical definition of the triangle's construction.


So as you can see, there are always many different ways to do anything, each with its own trade-offs in terms of time and space usage, as well as aesthetic appeal. Computing the elements of Pascal's triangle is a relatively simple exercise, and yet we still managed to come up with a lot of clever approaches!

I'll definitely be exploring more of the SICP exercises in future blog posts, so stay tuned.
