The Euclidean Algorithm is one of the first algorithms ever written down. It is a darling of computer science. The algorithm cleverly reveals the greatest common divisor of two natural numbers. Despite its ancient origins, it is still well-used today (and not just for teaching). It can determine when two numbers are relatively prime – ie, when their greatest common divisor is one – which has important modern applications in cryptography. And it does so efficiently.
Here is a cute implementation:
return a if b == 0 else gcd(b,a % b)
(NB. Error checking omitted for clarity.)
Look ma, recursion! To spell it out, when b is zero, then gcd(a,b) = a. When b is non-zero, then gcd(a,b) = gcd(b, r), where r is the remainder of a divided by b. Let’s do a short example: gcd(12, 8). We have:
Ok, so what the heck is going on here? The gcd algorithm is deceptively simple looking. However, it leverages some rather deep properties of the natural numbers.
First, a little background. The titular Euclid, that crafty fellow, once proved something called the Fundamental Theorem of Arithmetic, which says that any number can be expressed (uniquely!) as the product of primes. For example,
12 = 2 x 2 x 3 = 4 x 3
8 = 2 x 2 x 2 = 4 x 2
You can see by inspecting the prime factors that the gcd is 4. But factoring numbers is computationally difficult, especially when they get larger. The gcd algorithm is so efficient (and remarkable) in part because it doesn’t have to do any factoring!
One useful fact is that when a number divides x and y, that number also divides x + y and x – y. You can see this by playing around with prime factors again. In the example above:
12 – 8 = 4 x 3 – 4 x 2 = 4 x (3 – 2)
12 + 8 = 4 x 3 + 4 x 2 = 4 x (3 + 2)
You can see that 4 is a factor not only of 12 and 8, but also 12 + 8, and 12 – 8. Still with me?
Ok, one other thing to get under your belt is the division theorem. It says we can come up with a unique whole number quotient q and remainder r for any natural number division a / b. That is, a = qb + r for some unique whole numbers q and r, where 0 <= r < b.
Finally, we can write out step by step what is happening in the Euclidean algorithm. We start with the arguments a and b, and put a row in the table which shows the current arguments along with what is computed (ie, the remainder a % b). We add a row for each recursive invocation.
|step||a||b||a % b|
|… etc …|
At each step, we pass in as arguments to the gcd function the values in columns a and b, and we compute the remainder a % b of their division and, if we want, the quotient, but we can ignore the quotient for now. Wash, rinse, repeat.
Let’s check out gcd(12, 8):
|step||a||b||a % b|
In the nth iteration, the final remainder, r[n], is zero. Why is this? It’s because the remainder r of a division such as a = q b + r is strictly less than b, so the arguments shrink inexorably throughout the computation, and eventually you get down to zero.
In row n, where r[n] = 0, the dividend and divisor are r[n-2] and r[n-1], so we know that r[n-1] divides r[n-2] evenly (since there is no remainder to the division!) And clearly r[n-1] divides itself. So that means r[n-1] divides r[n-3] too, since it divides both terms of the sum which defines it (see row n-1). Likewise, we can work all the way back up the call stack in order to say that r[n-1] divides both the original arguments a and b as well.
Wait! We still don’t know whether that final non-zero remainder r[n-1] is the greatest among all possible common divisors. We do know that divisor r[n-1] is <= gcd, by definition of gcd. But we need to prove one more statement: ANY common divisor of a and b divides ALL the remainders r[i] of the algorithm.
ORLY? Well, just look at the procedure above and notice how the first remainder can be written r = a – q b. So anything that divides a and b must divide r. Likewise, r = b – q r, so anything that divides b and r must divide r. Following the turtles all the way down the rabbit hole: any common divisor of a and b divides ALL remainders r[i].
One common divisor in particular is the gcd. Since it divides r[n-1] in particular, we can say that gcd <= r[n-1].
But if gcd <= r[n-1] and r[n-1] <= gcd, doesn’t that mean they are equal? Oh yes indeedy-do it does. Quod erat demonstratum, baby!
PS: for some visual intuition of the algorithm, check out the line-segment drawings in the top right hand corner of the Wikipedia entry.