One of the coolest things about Structure and Interpretation of Computer Programs, in my view, has to be the exercises. Some of them are surprisingly deep explorations of mathematical or computational ideas, and even the ones that aren't still end up being quite interesting.
As the first real post for this blog, I
figured I'd start off with one of the exercises in that latter
category: not as deep, but still pretty cool. This exercise involves
computing the numbers of Pascal's Triangle.
[For those who are interested, we're dealing with Exercise 1.12 from the
book here. My worked solution in Scheme is available in my GitHub
repository. However, in this post I'll be going wild with this exercise,
looking at the problem from all sorts of weird angles (and all of the
code will be in JVM languages, to keep with the blog's theme.) So the
Scheme solutions are still worth a look if you've read this post, and
vice versa.]
So pretty much everyone has heard of
Pascal's Triangle at one point or another, generally when they were
first learning about binomial coefficients.
The triangle looks like this:
    1
   1 1
  1 2 1
 1 3 3 1
1 4 6 4 1
  ... ...
We can easily see that the numbers
along the side edges of the triangle (and at the tip) are all 1's,
and every other number is the sum of the two numbers that appear
directly above it in the triangle.
Suppose we want to write some code to
generate this triangle. Now the first difficulty we have is the
vague, imprecise description of how the triangle is laid out that we
have in the previous paragraph. For instance, what does it mean for
the two numbers to be “directly above” the other number? From the
picture it's obvious. When it comes to writing a program though,
we're going to have to decide what we actually mean here.
Perhaps the most straightforward way
is to instead visualise the triangle slanted over to the left, like
this:
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
... ...
Then if we ignore the whitespace over
on the right side for the moment, we can define our notion of “above”
as the following:
Given the element in the ith row, jth
column of the triangle, the two elements above it are:
- the element in the (i-1)th row, jth column
- the element in the (i-1)th row, (j-1)th column.
So for instance, the elements above the
leftmost '4' in the diagram are the '3' directly on top of it, and
the '1' up and over to the left.
Of course, we have to consider that
this definition doesn't quite work for the first column of the
triangle, which we'll call column 0. In this case, we don't have an
element in row (i-1), column (j-1); that would put the element in
column -1, which doesn't make sense!
The leading diagonal doesn't work
either, for a slightly different reason. As you can see from the
triangle diagram, there isn't
an element on top of any of the leading diagonal ones, so the element
in row (i-1), column j doesn't exist.
There are different ways to handle this. One easy way, which we'll use
here, is to treat the whole thing as if it were a square (an n×n
matrix) instead, where the elements above the leading diagonal are
always 0:
1 0 0 0 0
1 1 0 0 0
1 2 1 0 0
1 3 3 1 0
1 4 6 4 1
If we then also rule that any element
in column 0 is always a 1, which is true, then we're pretty much
covered.
With all that settled, let's have some
fun actually computing this thing! We'll explore three different
alternatives for constructing the triangle:
- using simple iteration (nested for loops, array access – sounds like a job for Java!)
- using a clever data structure transformation to generate each row (this is what Scala is awesome for!)
- using recursion to elegantly generate each element (which Clojure will handle beautifully.)
Iteration (the Java way)
So to start with, let's look at the
iteration way. In essence we're just using the above matrix idea in
its most straightforward implementation, representing the matrix as
a 2-dimensional integer array. We set all of the values to 1 in the
0th column (which conveniently completes the top row for us at the
same time), and then work our way across and down, summing up to get
the values in each position.
In Java, we could write this up as a
simple object, PascalsTriangle, that takes the number of rows to use
as a parameter and has the constructor do all of the setup work.
Adding a simple printing method allows us to output the triangle in a
neat form.
The class code is as follows:
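Here's a sketch of what that class might look like. (I've also added a small get method for looking up individual elements, which isn't strictly needed for the printing but is handy to have.)

```java
public class PascalsTriangle {
    private final int[][] triangle;

    // The constructor does all of the setup work: set every element in
    // column 0 to 1 (which also completes the top row, since the rest
    // of that row stays 0), then work across and down, summing the two
    // elements above each remaining position.
    public PascalsTriangle(int rows) {
        triangle = new int[rows][rows];
        for (int i = 0; i < rows; i++) {
            triangle[i][0] = 1;
            for (int j = 1; j <= i; j++) {
                triangle[i][j] = triangle[i - 1][j] + triangle[i - 1][j - 1];
            }
        }
    }

    // Look up the element in row i, column j.
    public int get(int i, int j) {
        return triangle[i][j];
    }

    // Print the triangle (0 placeholders included), one row per line.
    public void print() {
        for (int[] row : triangle) {
            StringBuilder line = new StringBuilder();
            for (int value : row) {
                if (line.length() > 0) {
                    line.append(' ');
                }
                line.append(value);
            }
            System.out.println(line);
        }
    }

    public static void main(String[] args) {
        new PascalsTriangle(5).print();
    }
}
```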
Running the code creates a 5×5 Pascal's
triangle and prints it as output:
1 0 0 0 0
1 1 0 0 0
1 2 1 0 0
1 3 3 1 0
1 4 6 4 1
So we see that the output we obtain
from the code is exactly what we'd expect the triangle to look like
given the way we've specified it.
Note in particular that this
representation is quite simple and efficient due to the array access.
We have a nested for loop to construct the 2-dimensional integer
array, so for a given value of n (where n is how many rows we want),
this process has a quadratic order of growth, i.e. O(n²)
time-complexity and space-complexity.
Data Structure Transform (the Scala way)
One particularly
cool way to look at constructing Pascal's triangle is to expand on
the matrix metaphor by considering each row at a time. It would be
awesome if we could find a way to compute any row of the matrix
given only the previous row. This would mean that we could start with
the first row (which always looks like [1,0,0, …, 0]), and then by
successively applying the transform, we could generate every other
row in turn.
It turns out that there is a way to do
this! The trick lies in our use of the 0's as placeholders on the
right-hand side of the matrix. This means that we can take a given
row, shift everything in the row over by 1 to the right (filling in
with a 0 on the left), and then add this back to the original row.
The result will be the next row in the triangle:
Here's a mathematical representation of what's happening:

  [x1,     x2,    x3, ..., xn]
+ [ 0,     x1,    x2, ..., x(n-1)]
= [x1,  x1+x2, x2+x3, ..., x(n-1)+xn]

Since
x1 is always going to be 1, this
gives us exactly what we want!
So we can write a short Scala function
to do exactly that. The function takes in a 'row', which we'll
represent as a List of integers, and returns the new row which is the
addition of the old row and the old row shifted to the right. We can
even write it as a gorgeous one-liner:
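Here's a sketch of how that might look in Scala: nextRow is the one-liner, and the few lines after it build and print the 5×5 triangle.

```scala
// Given a row of Pascal's triangle (with trailing 0 placeholders),
// compute the next row: shift everything one place to the right
// (filling in a 0 on the left, dropping the last element), then add
// the shifted row back to the original element-wise.
def nextRow(row: List[Int]): List[Int] =
  row.zip(0 :: row.init).map { case (a, b) => a + b }

// Start from the first row [1, 0, 0, 0, 0] and generate each
// successive row by applying the transform again.
val firstRow = 1 :: List.fill(4)(0)
val rows = List.iterate(firstRow, 5)(nextRow)

// Print the triangle, one row per line.
rows.foreach(row => println(row.mkString(" ")))
```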
So let's see it in action! We loop with
this nextRow() function to create a list of all of the rows of
Pascal's triangle, and then loop again to print the triangle out. This gives us the same output as the
Java version:
1 0 0 0 0
1 1 0 0 0
1 2 1 0 0
1 3 3 1 0
1 4 6 4 1
This way of constructing the triangle
wasn't as obvious as the Java method, but actually turned out to be a
cool way of looking at the problem. Here we're actually only keeping
track of one row at a time (we just happen to be storing each of the
computed rows in a list), so we don't need as much space this time
around – if we were only interested in computing the nth row we'd
have O(n) space-complexity, which is nice.
In terms of the order of growth for
time, we can note that Scala's zip operation takes O(n) time, and the
map operation takes O(n) time. But in our case these operations are applied sequentially (i.e. not nested),
so the whole zip-then-map operation is pretty much O(n) itself, which
means that since it appears within a loop that runs from 1 to
n, we have O(n²) time-complexity.
Recursion (the Clojure way)
Of course, if you've looked at Exercise
1.12 in SICP, we want to generate the elements of Pascal's triangle
using a recursive process. This is also a nice way to do it; the
recursion handles the summation, and our base cases handle the
invariants of the matrix (the 1's in the first column and the 0's
above the leading diagonal.)
Looking at the problem this way, the
code almost writes itself. The most important part will be our pascal
function, which computes the number in a given row and column. We
then simply use this function to fill in all the values in the matrix
in our make-pascal-triangle function.
(This code was easily translated to
Clojure from my Scheme code for the exercise here. They're both Lisp
dialects, so the two languages are very similar.)
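Here's a sketch of the Clojure version, with the base cases following the matrix rules laid out earlier:

```clojure
;; Compute the element in row i, column j of Pascal's triangle.
(defn pascal [i j]
  (cond (zero? j) 1   ; first column: always 1
        (= i j)   1   ; leading diagonal: always 1
        (> j i)   0   ; right of the leading diagonal: always 0
        :else (+ (pascal (dec i) j)          ; otherwise, sum the two
                 (pascal (dec i) (dec j))))) ; elements above this one

;; Build the full n-by-n matrix by applying pascal to every position.
(defn make-pascal-triangle [n]
  (for [i (range n)]
    (for [j (range n)]
      (pascal i j))))

;; Print the triangle, one row per line.
(doseq [row (make-pascal-triangle 5)]
  (apply println row))
```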
The pascal function works exactly as
the problem would have it work – it first checks for whether we're
at an invariant (Are we in the first column? We have the value 1,
guaranteed. Are we on the leading diagonal? We have the value 1. Are
we to the right of the leading diagonal? We always have a 0.).
If we don't have such a case, the
function recursively calls itself to sum together the elements in the
two positions above the current one. It'll then work out the values
for those elements using this exact same process.
The make-pascal-triangle function then applies
this procedure to calculate every single element.
(Note that this isn't particularly
efficient. However if we were concerned at all, we could save time
with memoization; that is, storing the results of earlier
calculations and retrieving them when they're needed later by other
recursive branches.)
We can then construct the triangle and
print it to the output, which matches up with all of the
other output we've obtained so far:
1 0 0 0 0
1 1 0 0 0
1 2 1 0 0
1 3 3 1 0
1 4 6 4 1
Here the recursive process is a
particularly beautiful way of calculating isolated elements of the
triangle, but trying to use this to then construct the triangle by
computing each individual element is kind of inefficient.
In a hypothetical worst-case scenario, the recursive pascal function calls itself twice with each new invocation, and the recursion doesn't bottom out until it reaches the left edge of the triangle, which can be up to n calls away. So computing a single element can take up to O(2ⁿ) time. (Space isn't quite as bad: the recursive calls are delayed operations, but at any moment we're only holding one path from the current element back to the edge, so the space needed is proportional to the depth of the recursion, i.e. O(n).) Combine that O(2ⁿ) with the O(n²) elements we compute to fill in the whole triangle, and you can see that we almost have a truly inefficient O(n² × 2ⁿ) order of growth in time!
Of course, the reality is not quite that
dire. In general, recursion (in this isolated, element-by-element way)
isn't what you'd use if you wanted the whole triangle, even though it
is the method that is most mathematically appealing in how it matches
up with the definition of the triangle's construction.
So as you can see, there are always many
different ways to do anything, each with its own trade-offs in
terms of time and space usage, as well as aesthetic appeal. Computing the elements of Pascal's triangle is a relatively simple exercise, and yet we still managed to come up with a lot of clever approaches!
I'll definitely be exploring more of the SICP exercises in future blog posts, so stay tuned.