Here comes my take on one of the most ubiquitous algorithms you can find by googling…

The least common multiple of two numbers, lcm(a,b), is the smallest number that is divisible by both a and b. The greatest common divisor of two numbers, gcd(a,b), is the largest number that divides both a and b. But just how do you go about calculating them? A most important property is that one can be calculated from the other via the relation lcm(a,b)·gcd(a,b) = a·b, which in practice means that the gcd is always calculated first. But let's not get ahead of ourselves…
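As an aside, this relation is easy to check directly with the standard library's `math.gcd` (a quick illustration only; we will build gcd from scratch below):

```python
import math

a, b = 12, 18
g = math.gcd(a, b)   # greatest common divisor: 6
l = a * b // g       # least common multiple via the relation: 36

# the defining relation holds
assert g * l == a * b
```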

Brute Force

I will disregard the dumber brute-force approaches, such as “to get gcd(a,b), try all numbers down from min(a,b) until you find one that divides both a and b,” but a very resource-wasteful implementation could use a sieving approach for the lcm. See, the lcm can be no larger than a·b, which is definitely a multiple of both a and b. So if we set up an array of zeros of size a·b, we could use sieving to first set all multiples of a to 1, then do a search over all multiples of b to find the first common multiple. This approach will definitely break down for even moderately large a or b, but it is probably as good as most of us would get on our own without wikipedia…

```python
from time import perf_counter

def lcmBruteForce(a, b, verbose=False):
    t = perf_counter()
    sieve = [0] * (a * b)
    a, b = max(a, b), min(a, b)
    for j in range(a - 1, a * b, a):
        sieve[j] += 1
    ret = 0
    for j in range(b - 1, a * b, b):
        if sieve[j] == 1:
            ret = j + 1
            break
    t = perf_counter() - t
    if verbose:
        print("Calculated the lcm of", a, "and", b, "in", t, "sec.")
    return ret
```

Integer Factorization

In my case I believe it was Ernesto, back in 1985 or 1986, who let me in on the secret of calculating the lcm and the gcd from the prime factorizations of both numbers:

- to find the gcd, keep only the factors common to both a and b, raised to the smaller exponent, and
- for the lcm, keep all factors, common or not, raised to the largest exponent.

This approach will prove to be very inefficient, but let's give it a go. We will reuse the `factorAndSieve` function from the post on integer factorization. I have also written an auxiliary function, `list_to_dict`, that turns the list of factors with repeating entries into a dictionary, where keys are factors and corresponding entries are exponents:

```python
from time import perf_counter

def list_to_dict(seq):
    # turn a factor list with repeats into {factor: exponent}
    ret = dict()
    for j in seq:
        if j in ret:
            ret[j] += 1
        else:
            ret[j] = 1
    return ret

def gcd_factor(a, b, verbose=False):
    t = perf_counter()
    aa = list_to_dict(factorAndSieve(a))
    bb = list_to_dict(factorAndSieve(b))
    rr = dict()
    for j in bb:
        if j in aa:
            rr[j] = min(aa[j], bb[j])
    ret = 1
    for j in rr:
        ret *= j**rr[j]
    t = perf_counter() - t
    if verbose:
        print("Calculated the gcd of", a, "and", b, "in", t, "sec.")
    return ret
```
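The lcm counterpart is not shown above, but it follows the second rule the same way: keep all factors, common or not, at the largest exponent. A sketch, using a naive trial-division factorizer as a stand-in for `factorAndSieve` so the example is self-contained:

```python
def factor_naive(n):
    # naive trial-division stand-in for factorAndSieve
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def lcm_factor(a, b):
    aa = {}
    for p in factor_naive(a):
        aa[p] = aa.get(p, 0) + 1
    bb = {}
    for p in factor_naive(b):
        bb[p] = bb.get(p, 0) + 1
    # keep all factors, common or not, at the largest exponent
    rr = dict(aa)
    for p, e in bb.items():
        rr[p] = max(rr.get(p, 0), e)
    ret = 1
    for p, e in rr.items():
        ret *= p**e
    return ret
```

For example, `lcm_factor(12, 18)` combines 12 = 2²·3 and 18 = 2·3² into 2²·3² = 36.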

The Euclidean Algorithm

When we looked at the calculation of the modular multiplicative inverse, we already took a look at the extended Euclidean algorithm, and pointed out that it gives the gcd of the two numbers passed in as a by-product. Not that it was of much interest then, since when calculating multiplicative inverses we are normally dealing with relatively prime pairs, i.e. gcd(a,b) = 1. But if the adjective “extended” is dropped from the name, we are left with the Euclidean algorithm, which is the standard way to calculate the gcd.

The basic idea behind the algorithm is the observation that if a > b, and g is the gcd of a and b, then g is also the gcd of a − b and b. To make it more efficient, rather than subtracting, we can cut corners by using the equivalent fact that if g is the gcd of a and b, then it is also the gcd of b and a mod b. If we apply that same operation repeatedly, after a finite number of iterations the remainder will eventually reach zero, and the gcd is then the last non-zero remainder. Or, in code:

```python
from time import perf_counter

def gcd_euclid(a, b, verbose=False):
    t = perf_counter()
    aa, bb = a, b
    while b != 0:
        a, b = b, a % b
    t = perf_counter() - t
    if verbose:
        print("Calculated the gcd of", aa, "and", bb, "in", t, "sec.")
    return a
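With the gcd in hand, the relation from the top of the post gives the lcm in one line. A minimal sketch (stripped of the timing code):

```python
def gcd(a, b):
    # Euclidean algorithm: the gcd is the last non-zero remainder
    while b != 0:
        a, b = b, a % b
    return a

def lcm(a, b):
    # lcm(a,b) · gcd(a,b) = a · b
    return a * b // gcd(a, b)
```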

The Binary GCD Algorithm

Euclid’s algorithm is so fast that there really is not much point in trying to improve it. If we were using C rather than python, we could have taken advantage of the fact that a computer does all its arithmetic in binary, and used the binary GCD algorithm. In python the resulting code is less clear, and the run times are slightly slower, probably owing to the fact that bitwise operations “don’t get optimized down to single CPU instructions (…) like in C,” or so says a commenter here… The truth is I haven’t been able to implement the algorithm so that it runs faster than the standard Euclidean one, so I’ll spare you my code…
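For reference only, a textbook rendering of the binary GCD (Stein’s algorithm) in Python might look like the following; it replaces division with shifts and subtraction, which is where the C-level advantage would come from:

```python
def gcd_binary(a, b):
    # Stein's algorithm: use only shifts, parity tests and subtraction
    if a == 0:
        return b
    if b == 0:
        return a
    # factor out the common powers of two
    shift = 0
    while (a | b) & 1 == 0:
        a >>= 1
        b >>= 1
        shift += 1
    # make a odd
    while a & 1 == 0:
        a >>= 1
    while b != 0:
        while b & 1 == 0:
            b >>= 1
        if a > b:
            a, b = b, a
        b -= a
    # restore the common factors of two
    return a << shift
```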

The google link at the top of the article now has this article at number 9!

I don’t like the way the timing code is embedded in each function – I think it makes the functions less clear. The timing code could be taken outside each function, for example by using a function decorator.

I’m not keen on the ‘verbose’ code for similar reasons.
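A minimal sketch of such a decorator (the names here are hypothetical, not from the post):

```python
import functools
from time import perf_counter

def timed(fn):
    # wrap fn so that every call reports its run time
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        t0 = perf_counter()
        result = fn(*args, **kwargs)
        dt = perf_counter() - t0
        print("Calculated", fn.__name__, "of", args, "in", dt, "sec.")
        return result
    return wrapper

@timed
def gcd(a, b):
    while b != 0:
        a, b = b, a % b
    return a
```

This keeps the timing and reporting concerns out of the function bodies entirely.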

Python 2.6 introduced the ‘fractions’ module which includes a gcd function (implemented in Python in an identical/equivalent way to your function). I think placing gcd in with fractions was a poor choice – gcd is useful for fractions, but has many other uses too and is a general purpose mathematical function.

Up to 7 by now! Google mysteries: a post written for completeness of another post, without too much attention, since the same thing has been told and retold a zillion times, and which ends up not even being necessary, turns me into a one-hit wonder…

As for the timing in the functions… I’d say that my python knowledge is probably on par with Mickey’s magic in the Sorcerer’s Apprentice, so my first reaction to “function decorators” was something like “Uhhh!?” But a few minutes of googling show a promising way of improving code readability; I will definitely look into it!