On first glance, prime numbers seem pretty trivial; and yet, most of their more interesting properties still elude us. These numbers have been bugging mathematicians for as long as we've cared about them. All sorts of questions are still completely open:
- Every now and then we find a pair of primes that are separated by a single even number – does that keep happening forever?
(That
last one is literally a million-dollar question, by the way.)
From a computer science standpoint though, the fact that we know so little about
prime numbers is actually kind of useful. If I don't even know how
the primes are distributed, I can't easily factorise a massive
composite number into its prime constituents; not even if I used a
computer. However the reverse problem – taking the primes and
multiplying them together to get to the composite number – is much
easier. Such 'one-way functions' make for awesome cryptographic
algorithms, which are absolutely vital for digital security these
days.
We're
not going to worry too much about that end of things though, and
instead consider the problem of determining whether a given integer
is prime. (Inspired by the SICP
exercises as always: the relevant Scheme solutions are 1.22, 1.24,
1.27 and 1.28.)
We'll
be using good old Java for our programming language today as we
explore three different ways to determine the primality of a number:
- an exhaustive search of all the possible candidates,
- the Fermat test, which uses Fermat's Little Theorem in a probabilistic primality test,
- the Miller-Rabin test, a variant of the Fermat test that fixes one particular deficiency.
Our Common Test Interface
Now if we're using Java then we have ready access to all of the awesome techniques of the object-oriented paradigm. So first, we should decide how we want our classes and objects to look.
Given that we'll be looking at three methods that all work to
determine primality of a given integer, there ought to be some common
behaviours that we can abstract away.
So let's construct a base class PrimalityTest. There are three main
things our class will need to take into account:
- We don't want to be too restrictive in regards to how big our primes are allowed to get. So we'll use Java's BigInteger class and the arbitrary-precision integers that it affords us, and write everything in the interface to work with this representation.
- Each of the tests should have a method, isPrime(), that signals whether a given number was found to be prime or not according to the test. The implementation of the method will of course be dependent on the test that is run, but we can declare the method signature here.
- And given that method, we can have a method timedPrimeTest() in this base class that times the test for primality and outputs the results in a pretty way. (This is a small example of the Template Method design pattern. The base class implements this method in its entirety, but the details of the testing procedure are handled by the derived classes.)
This base class for the tests is declared as follows:
The Exhaustive Prime Search
The simplest algorithm to solve a given problem is usually the one
that enumerates the entire solution domain, compares each of them,
and returns the best answer. This is the exhaustive search
algorithm. The upside is that we will have 100% certainty that we've
got the result we want (since we checked them all.) The
downside is the efficiency of such an approach – often the solution
domain is prohibitively huge, and the algorithm can find itself
churning away for truly ridiculous amounts of time.
That said, it's as good a place to start as any. In fact even with
the slightly-awkward semantics of the BigInteger class, the code
pretty much writes itself here. We inherit from our PrimalityTest
class, so we've got the timedPrimeTest() method already. All we need
to do is provide an implementation for the isPrime() method.
In our case, to check whether a given number is prime, we simply
check all of the potential factors (or divisors) one by one. Is 2 a
factor? No.. then is 3 a factor? No.. then is 4 a factor? … and so
on. Note that we only have to search up to the square root of the
given number for factors (since mathematically, if there was a factor
greater than the square root, then there would have to be another
factor less than the square root which multiplies with it.)
The code for our ExhaustivePrimeSearch class is then complete:
We can now use this class to run some timed tests for primality:
java ExhaustivePrimeSearch 1234514132413241
//> Running test: ExhaustivePrimeSearch
//> n: 1234514132413241
//> Prime?: true
//> Time taken: 11129 milliseconds.
java ExhaustivePrimeSearch 1234567654321
//> Running test: ExhaustivePrimeSearch
//> n: 1234567654321
//> Prime?: false
//> Time taken: 5 milliseconds.
As you can see, it takes a lot longer to determine that a number is prime, rather than that it isn't. This makes sense: if a number isn't prime then it has a factor, and we stop any further checking once we've found it.
So our class here will always report whether a given number is prime
or not correctly (for numbers greater than 1, at least.) However as
the size of the number increases, the time taken to make this
determination increases in general. We have O(√n)
time here; to determine whether n is prime, we need to check at most √n
– 1 numbers and see whether they are factors. This sounds good, but
remember that prime numbers can be as large as we want; and the more
interesting ones definitely will be.
The space-complexity is assumed to be O(1) – at each point we only
need room for one more variable. (Note though that if we were to be
completely rigorous, our use of arbitrary-precision integers would
affect both of our complexity calculations here.)
The Fermat Test
To save time, we can use probabilistic tests for primality instead.
This time around, instead of being certain about determining a prime
number, we are almost certain.
Our tests will give us a certainty that a number is not prime
(that is, if a number ever fails the test, we are 100% sure that it
isn't prime), but we may have false positives: numbers that aren't
actually prime that were nonetheless reported to be prime. To
increase our certainty we can run more iterations of the test; thus
trading running-time for accuracy (in general, this technique of
iterative improvement is used in all sorts of algorithms.)
The algorithm for the Fermat test works using Fermat's Little
Theorem:
All that notation means is that an is congruent to a modulo n – in other words, n divides evenly into an-a.
Given this, notice that when n is not prime, we should be able to find an a such that this congruence does not hold. So our algorithm does exactly that; we randomly pick a number a between 1 and n, and test the congruence. If it doesn't hold then we definitely don't have a prime number in n. If it does hold though, then we have some evidence that n might be prime.
Given this, notice that when n is not prime, we should be able to find an a such that this congruence does not hold. So our algorithm does exactly that; we randomly pick a number a between 1 and n, and test the congruence. If it doesn't hold then we definitely don't have a prime number in n. If it does hold though, then we have some evidence that n might be prime.
We can repeat this test as much as we want. If n passes the test for
all a < n, then we're pretty much guaranteed that n is prime (*
with a caveat! More on that later...)
In terms of our code, this will be another class derived from the
PrimalityTest interface. As before, we provide the implementation for
the isPrime() method. But since we also need to know how many tests
to run, we have a constructor that takes this amount as an input when
the object is created. Then our isPrime() method simply repeats the
Fermat test the given amount of times. The code for the Fermat test
is encapsulated in the separate passesFermatTest() method.
Now since we might want more deterministic behaviour (ie. for testing
purposes), we add in a special passesAllFermatTests() method that
runs the Fermat test with all possible choices of a (where 1 < a <
n).
(As the passesFermatTest() and passesAllFermatTests() methods are
useful on their own and don't need the number of tests to do their
job, I've made them static here, too.)
So here's the code for our FermatTest class:
So here's the code for our FermatTest class:
We can then use this class to run some timed primality tests in
precisely the same way as before with the exhaustive search code.
We'll use 20 tests for each given number (this won't give us too much
certainty, but keep in mind that the more tests we run, the longer it
will take!)
java FermatTest 1234514132413241
//> Running test: FermatTest
//> n: 1234514132413241
//> Prime?: true
//> Time taken: 8 milliseconds.
java FermatTest 1234567654321
//> Running test: FermatTest
//> n: 1234567654321
//> Prime?: false
//> Time taken: 1 milliseconds.
We can handle larger numbers too in basically no time at all (the prime number was hand-picked from the list here)
//> Running test: FermatTest
//> n: 1234514132413241
//> Prime?: true
//> Time taken: 8 milliseconds.
java FermatTest 1234567654321
//> Running test: FermatTest
//> n: 1234567654321
//> Prime?: false
//> Time taken: 1 milliseconds.
We can handle larger numbers too in basically no time at all (the prime number was hand-picked from the list here)
java FermatTest 6075380529345458860144577398704761614649
//> Running test: FermatTest
//> n: 6075380529345458860144577398704761614649
//> Prime?: true
//> Time taken: 41 milliseconds.
java FermatTest 2738264293843432410989878979887787232331
//> Running test: FermatTest
//> n: 2738264293843432410989878979887787232331
//> Prime?: false
//> Time taken: 2 milliseconds.
//> Running test: FermatTest
//> n: 6075380529345458860144577398704761614649
//> Prime?: true
//> Time taken: 41 milliseconds.
java FermatTest 2738264293843432410989878979887787232331
//> Running test: FermatTest
//> n: 2738264293843432410989878979887787232331
//> Prime?: false
//> Time taken: 2 milliseconds.
The running time of these tests is tricky to calculate since it will be based directly on the time it takes the modPow() function to do its job. Given that algorithms exist that can perform the task in O(log n) time though, we will definitely have an improvement upon the exhaustive search method at any rate.
For those instances where we want to be exactly certain, we can use
the passesAllFermatTests() method to run all of the tests for us. We
can see that here in the following output:
// Not prime: 50688 = 123 * 456
FermatTest.passesAllFermatTests(BigInteger.valueOf(50688));
//> false
// is prime:
FermatTest.passesAllFermatTests(BigInteger.valueOf(13451));
//> true
// is prime:
FermatTest.passesAllFermatTests(BigInteger.valueOf(34217));
//> true
// isn't prime... 29341 = 13 * 2257
FermatTest.passesAllFermatTests(BigInteger.valueOf(29341));
//> true
FermatTest.passesAllFermatTests(BigInteger.valueOf(50688));
//> false
// is prime:
FermatTest.passesAllFermatTests(BigInteger.valueOf(13451));
//> true
// is prime:
FermatTest.passesAllFermatTests(BigInteger.valueOf(34217));
//> true
// isn't prime... 29341 = 13 * 2257
FermatTest.passesAllFermatTests(BigInteger.valueOf(29341));
//> true
Hold on, 29341 isn't a prime number...
// isn't prime... 41041 = 7 * 5863
FermatTest.passesAllFermatTests(BigInteger.valueOf(41041));
//> true
// isn't prime... 52633 = 73 * 721
FermatTest.passesAllFermatTests(BigInteger.valueOf(52633));
//> true
FermatTest.passesAllFermatTests(BigInteger.valueOf(41041));
//> true
// isn't prime... 52633 = 73 * 721
FermatTest.passesAllFermatTests(BigInteger.valueOf(52633));
//> true
41041 and 52633 are not prime numbers either! What's going on?
Remember that caveat I mentioned? This is it. Those numbers are
called Carmichael numbers. They aren't prime, but their unique
property allows them to fool the Fermat test completely.
In reality, this is hardly an issue worth concerning ourselves with.
Carmichael numbers, particularly large ones on the scale of numbers
used in cryptography, are incredibly rare in occurrence. (One of the
footnotes in SICP humorously puts
this in perspective for us: if ever you come across a large enough
number that fools the Fermat test, you've actually got a higher
chance that cosmic
radiation interfered with your computer's processor and threw off the
calculations, than you
do of having randomly landed on a Carmichael number.)
Of course, if we really want to be sure, there is a variant of the
Fermat test that can't be fooled in this way...
The Miller-Rabin Test
Another
probabilistic test that isn't fooled by the Carmichael numbers is the
Miller-Rabin test. Other than that fact, it works very similarly to
the Fermat test. In fact, we start by stating Fermat's Little Theorem
in a different, but equivalent way:
The
trick here* is to rewrite the exponent n-1 in terms of a power of two
(2s)
multiplied by an odd integer (d). With that done, we can then pick a
random number a between 1 and n, and calculate ad
mod n.
Then
we iterate through squaring this result (to make it up to an-1
mod n). However, at each point we check whether we had a nontrivial
square root of 1 mod n. If such a nontrivial root exists, then n
cannot be prime.
[*
Those who've seen the SICP solutions will notice that I've used a
different algorithm for the Miller-Rabin test here. The procedure
used in SICP turns out to be heavily dependent on the fact that its
expmod procedure uses delayed operations under recursion, and so
rewriting it as an iterative process is quite complicated. On the
other hand, we can quickly find iterative algorithms for the Miller-Rabin test that are easy enough to both understand and
translate into working Java code.]
In
terms of our code, we've got a bit more work to do with this test.
The structure of the code is very similar to the Fermat test one,
except that for a given n, we need to first determine the representation
n-1 = 2s
* d. Since that's a relatively expensive calculation to be doing in
every single test we want to run, we can move it out into a separate
private method, findSDRepresentation(), that is invoked whenever we
call the isPrime() or passesAllMRTests() methods.
A
consequence of this though is that our object has state that it needs
to keep track of, so we either have to pass s and d around as
parameters (and the method signatures are already pretty long as it
is), or store them as class fields (which I've done here). The
downside of this is that it no longer allows us to make any of the
methods static, so we have to instantiate a new class even to run the
“all values” test. In that regard it differs radically from the
FermatTest class we had earlier.
Then
the main brunt of the work occurs in the passesMRTest() method. Here
we work through the algorithm detailed above and return true or false
depending on whether the test passes.
The
class definition for our MillerRabinTest class is quite long, and not as readable as the others,
but here it is:
We can then use it to run tests in the usual way:
java MillerRabinTest 1234514132413241
//> Running test: MillerRabinTest
//> n: 1234514132413241
//> Prime?: true
//> Time taken: 9 milliseconds.
java MillerRabinTest 6075380529345458860144577398704761614649
//> Running test: MillerRabinTest
//> n: 6075380529345458860144577398704761614649
//> Prime?: true
//> Time taken: 40 milliseconds.
java MillerRabinTest 1234567654321
//> Running test: MillerRabinTest
//> n: 1234567654321
//> Prime?: false
//> Time taken: 1 milliseconds.
//> Running test: MillerRabinTest
//> n: 1234514132413241
//> Prime?: true
//> Time taken: 9 milliseconds.
java MillerRabinTest 6075380529345458860144577398704761614649
//> Running test: MillerRabinTest
//> n: 6075380529345458860144577398704761614649
//> Prime?: true
//> Time taken: 40 milliseconds.
java MillerRabinTest 1234567654321
//> Running test: MillerRabinTest
//> n: 1234567654321
//> Prime?: false
//> Time taken: 1 milliseconds.
And more importantly, we can show that it doesn't get tricked by the Carmichael numbers:
MillerRabinTest test = new MillerRabinTest(1);
// Not prime: 50688 = 123 * 456
test.passesAllMRTests(BigInteger.valueOf(50688));
//> false
// is prime:
test.passesAllMRTests(BigInteger.valueOf(13451));
//> true
// is prime:
test.passesAllMRTests(BigInteger.valueOf(34217));
//> true
// isn't prime... 29341 = 13 * 2257
test.passesAllMRTests(BigInteger.valueOf(29341));
//> false
// isn't prime... 41041 = 7 * 5863
test.passesAllMRTests(BigInteger.valueOf(41041));
//> false
// isn't prime... 52633 = 73 * 721
test.passesAllMRTests(BigInteger.valueOf(52633));
//> false
// Not prime: 50688 = 123 * 456
test.passesAllMRTests(BigInteger.valueOf(50688));
//> false
// is prime:
test.passesAllMRTests(BigInteger.valueOf(13451));
//> true
// is prime:
test.passesAllMRTests(BigInteger.valueOf(34217));
//> true
// isn't prime... 29341 = 13 * 2257
test.passesAllMRTests(BigInteger.valueOf(29341));
//> false
// isn't prime... 41041 = 7 * 5863
test.passesAllMRTests(BigInteger.valueOf(41041));
//> false
// isn't prime... 52633 = 73 * 721
test.passesAllMRTests(BigInteger.valueOf(52633));
//> false
(The
running time here is also tricky to determine, particularly since we have that extra work in factoring the powers of two from n-1. Other than that it will also depend directly on the speed of the modPow() function, as the Fermat test did. So we won't really worry about this.)
So we've seen some different ways to test for prime numbers here today. In fact, these are in active use! That BigInteger class we used actually has a method called probablePrime() that will tell us whether the integer is prime to a given degree of certainty. If you have a quick read of the documentation for the method in that link, you can probably guess what it's doing! Sure enough, when we check the source code we find that uses the Miller-Rabin test that we just saw.
No comments:
Post a Comment