After three posts (1, 2, 3) on calculating prime numbers, it is probably worth putting that knowledge to a more useful task. As we will see in the near future, integer factorization, i.e. breaking down a (composite) number into its prime factors, is one such task. Strictly speaking, factoring a number n is simply decomposing it as the product of two non-trivial divisors, i.e. divisors different from 1 and n itself. But by repeatedly factoring the divisors one will eventually come up with a unique set of primes which, when multiplied together, render the original number, or so says the fundamental theorem of arithmetic: 360 = 4·90 = 2·2·9·10 = 2·2·3·3·2·5, no matter how you go about splitting it. The point is, we will consider factorization a synonym of prime decomposition, be it formally correct or not.

There are some very sophisticated methods to factor very large numbers, but they use a lot of extremely complex math, so I doubt they will ever find their way onto this blog. That leaves us with the naive, straightforward approach as our only option, although I will try to give it an efficiency boost. What is this naive approach? Trial division, of course: given a number n, we know that its smallest prime factor will be no larger than the square root of n, so we can simply try and see if any of those numbers divide it. No, I will not try to code that yet… If you have read the entries on determining prime numbers, it should come as no surprise that we really do not need to do trial division by all numbers up to the square root of n, but only by the primes among them. This is a consequence of the fact that, if a composite number divides n, then each of the prime factors of that composite number also divides n. According to the prime number theorem, the number of primes below x is asymptotic to x / log x. So by limiting our trials to prime numbers we can reduce the number of tests from about n^{1/2} to something around 2 n^{1/2} / log n. If we rescue the `primeListSofE`

function from the post on the sieve of Eratosthenes, a Python implementation of naive factorization could look something like this:

```python
from time import clock

def factor(n, verbose=False):
    """
    Returns all prime factors of n, using trial division by prime
    numbers only. Returns a list of (possibly repeating) prime factors
    """
    t = clock()
    ret = []
    nn = n
    maxFactor = int(n**0.5)
    primes = primeListSofE(maxFactor, verbose)
    for p in primes:
        while nn % p == 0:
            nn //= p
            ret += [p]
        if nn == 1:
            break
    if nn != 1:
        ret += [nn]
    t = clock() - t
    if verbose:
        print "Calculated factors of", n, "in", t, "sec."
    return ret
```
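The function above relies on `primeListSofE` from the earlier sieve post. In case you don't have it handy, a minimal stand-in could look like the sketch below (my own, written for modern Python; the `verbose` argument is accepted but ignored here):

```python
def primeListSofE(limit, verbose=False):
    """Return a list of all primes <= limit via the sieve of Eratosthenes."""
    if limit < 2:
        return []
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(limit**0.5) + 1):
        if sieve[p]:
            # cross out multiples of p, starting at p*p
            for k in range(p * p, limit + 1, p):
                sieve[k] = False
    return [p for p in range(2, limit + 1) if sieve[p]]
```

Any sieve returning the primes in increasing order will do; the factoring code only iterates over the list.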

While this function will be about as good as we can make it for numbers that are the product of two large prime factors, it will be terribly inefficient for most numbers. Consider, as an extreme example, that we are trying to factor 2^{55} ~ 3.6·10^{16}. We would first calculate all primes up to 1.9·10^{8}, a challenging feat in itself with our available tools, only to find out that we needed just the first of those primes, i.e. 2. Taking into account that 50% of all numbers are divisible by 2, 33% are divisible by 3, 20% are divisible by 5… it doesn't seem wise to disregard the potential time savings. What we can do to profit from this is to perform the trial division checks at the same time as we determine the prime numbers, updating the largest prime to test on the fly. This has to be done in two stages: the first while we sieve up to n^{1/4}, the second while we scan the rest of the sieve up to n^{1/2} looking for more primes. The following Franken-code has been written mostly by cut-and-paste from `primeListSofE`

and `factor`

, which hopefully hasn’t affected its readability much:

```python
from time import clock

def factorAndSieve(n, verbose=False):
    """
    Returns all prime factors of n, using trial division while sieving
    for primes. Returns a list of (possibly repeating) prime factors
    """
    t = clock()
    ret = []
    nn = n
    while nn % 2 == 0:  # remove 2's first, as 2 is not in sieve
        nn //= 2
        ret += [2]
    maxFactor = int(nn**0.5)
    maxI = (maxFactor - 3) // 2
    maxP = int(maxFactor**0.5)
    sieve = [True for j in xrange(maxI + 1)]
    i = 0
    for p in xrange(3, maxP + 1, 2):  # we then sieve as far as needed
        if p > maxP:
            break
        i = (p - 3) // 2
        if sieve[i]:
            while nn % p == 0:
                nn //= p
                ret += [p]
                maxFactor = int(nn**0.5)
                maxI = (maxFactor - 3) // 2
                maxP = int(maxFactor**0.5)
            if nn == 1:
                break
            else:
                i2 = (p * p - 3) // 2
                for k in xrange(i2, maxI + 1, p):
                    sieve[k] = False
    index = i
    for i in xrange(index, maxI + 1):  # and inspect the rest of the sieve
        if i > maxI:
            break
        if sieve[i]:
            p = 2 * i + 3
            while nn % p == 0:
                nn //= p
                ret += [p]
                maxFactor = int(nn**0.5)
                maxI = (maxFactor - 3) // 2
                maxP = int(maxFactor**0.5)
            if nn == 1:
                break
    if nn != 1:
        ret += [nn]
    t = clock() - t
    if verbose:
        print "Calculated factors of", n, "in", t, "sec."
        print "Stopped trial division at", 2 * i + 3, "instead of", int(n**0.5)
    return ret
```
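Whichever version you use, it is easy to sanity-check the output: the factors must multiply back to n, and each of them must itself be prime. A small checker along those lines (my own snippet, not part of the original code, written for Python 3.8+) could be:

```python
from math import prod

def is_prime(n):
    """Deterministic trial-division primality test (fine for small n)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def check_factorization(n, factors):
    """True iff `factors` is a valid prime factorization of n."""
    return prod(factors) == n and all(is_prime(p) for p in factors)
```

For instance, `check_factorization(360, [2, 2, 2, 3, 3, 5])` holds, while a list containing a composite such as `[4, 90]` fails even though it multiplies back to 360.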

This new code will very often be much faster than the previous one, but at times it will be about as slow, or even slower, since mixing both codes introduces some inefficiencies. The most extreme examples of such cases are a prime number or the square of a prime on one side, and a power of 2 on the other.

The graph above plots times to calculate the factors of numbers between 10^{6} and 10^{6} + 100. Prime numbers in this interval stick out as the red dots among the blue ones: 10^{6} +3, +33, the twin primes +37 and +39, +81 and +99. And numbers with many small prime factors populate the bottom of the red cloud.
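The list of primes in that window is easy to double-check with a few lines of Python (this snippet is mine, not part of the original post):

```python
def is_prime(n):
    """Trial division by 2 and then odd numbers up to sqrt(n)."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

# offsets of the primes in [10**6, 10**6 + 100]
offsets = [k - 10**6 for k in range(10**6, 10**6 + 101) if is_prime(k)]
print(offsets)  # -> [3, 33, 37, 39, 81, 99]
```

Six primes in a window of one hundred numbers, right in line with the prime number theorem's density of 1/log(10^6) ≈ 7%.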

If the above graph is not enough to convince you of the benefits of the second approach, maybe these timings for very large numbers will:

```
>>> factor(10**15+37,True)
Calculated primes to 31622776 in 6.760 sec.
Calculated factors of 1000000000000037 in 8.466 sec.
[1000000000000037L]
>>> factorAndSieve(10**15+37,True)
Calculated factors of 1000000000000037 in 8.666 sec.
Stopped trial division at 31622775 instead of 31622776
[1000000000000037L]
```


```
>>> factor(2**55,True)
Calculated primes to 189812531 in 42.811 sec.
Calculated factors of 36028797018963968 in 43.261 sec.
[2, ..., 2]
>>> factorAndSieve(2**55,True)
Calculated factors of 36028797018963968 in 8.632e-05 sec.
Stopped trial division at 3 instead of 189812531
[2, ..., 2]
```

[…] By the way, the factorization loop I’m using is the “factorAndSieve” code from Numerical Recipes, slightly modified to return a list of tuples, instead of just a (possibly repeating) list of […]

Your factorAndSieve solution does not appear to be any better than a properly implemented trial-division approach, such as Sundar’s (as detailed on http://thetaoishere.blogspot.com/2009/04/finding-largest-prime-factor.html ). Indeed, your solution fails for very large numbers because it attempts to allocate memory for its sieve, and either runs out of memory or runs into a type conversion problem (I tested it with python 2.6 and a composite number equal to the product of all primes less than or equal to 659). A simpler +2+4 implementation similar to Sundar’s I wrote on the spot had no trouble with this number, and factored it in “0.0 sec” even though it was 276 digits long.

The trivial example of how much faster your factorAndSieve implementation is than an extremely naive trial-division implementation in the case of a large power of 2 is irrelevant because the speed advantage in this case comes not from the sieving but from dividing out the discovered factor and recalculating the upper bound to which trial-divisions must be made. A non-sieving algorithm can do this just as well as yours, and mine does a pretty good job.

I’d be delighted if you could produce an example of a number on which this factor-and-sieve algorithm actually outperforms a more ordinary trial-division algorithm.

The number you propose, though very large, has no large prime factor, hence there is little benefit in doing anything other than trial division. Actually, most numbers have lots of small divisors: 50% of all numbers are divisible by 2, 33.3% are divisible by 3, 20% are divisible by 5, 14.3% are divisible by 7… So most of the time trial division is a good choice to get the job done.
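Those densities are trivial to confirm empirically: for any cutoff N, exactly ⌊N/p⌋ of the numbers 1..N are divisible by p. A quick sketch (mine, just to illustrate the point):

```python
N = 10**5
for p in (2, 3, 5, 7):
    # count multiples of p among 1..N by brute force
    count = sum(1 for k in range(1, N + 1) if k % p == 0)
    assert count == N // p
    print("%.1f%% of numbers are divisible by %d" % (100.0 * count / N, p))
```

Running it prints 50.0%, 33.3%, 20.0% and 14.3%, matching the figures above.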

But sometimes it is not: try any number with a large prime factor and you are in for miserably slow code. With my first example, 10^15+37, which is a prime, I seriously doubt you will get an answer within your lifetime! And Sundar’s tricks (which are great optimizations for everyday numbers, don’t get me wrong) are useless: there isn’t much practical use in dividing a billion years of computing time by 4…

It is true, though, that what I propose is very inefficient in memory use, and it will not work with very large easy numbers that are a piece of cake for trial division. Even if it meant sacrificing some efficiency, it could be a good idea to compute primes as fast as when sieving, but allocating memory on the go. This is a somewhat contradictory statement, but there apparently are ways of doing it. Or at least so says this paper sitting in my “to read in detail” folder:

http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf

Now that you’ve asked, I may even try to figure out the details, turn that Haskell code into python and revive this blog…
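I have not digested the paper yet, but the dict-based incremental sieve usually discussed alongside it can be sketched in a few lines of Python; take this as my rough rendering of the idea, not the paper's algorithm verbatim:

```python
from itertools import islice

def primes():
    """Generate primes lazily: memory grows with the primes actually
    yielded, not with a preset upper bound, because composites are
    recorded only as they are needed."""
    composites = {}  # maps next known composite -> primes that divide it
    q = 2
    while True:
        if q not in composites:
            yield q                      # no recorded divisor: q is prime
            composites[q * q] = [q]      # first composite needing q is q*q
        else:
            for p in composites.pop(q):  # slide each divisor forward
                composites.setdefault(q + p, []).append(p)
        q += 1

print(list(islice(primes(), 10)))  # -> [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

The trick is that each prime p is first recorded at p², exactly as in the array-based sieve, but the "array" is a dictionary that only ever holds one pending composite per known prime.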

Actually, my implementation factors 10^15+37 in 1.69 seconds. It can do so because it stops at the square root of the number — a much easier target to hit. And the target can be lowered every time we find a factor.

Here is the code, if you’re interested:

```python
from time import clock

def factor(n, verbose=False):
    """
    Returns all prime factors of n, using simple trial division.
    Returns a list of (possibly repeating) prime factors
    """
    t = clock()
    ret = []
    nn = n
    # Remove 2's and 3's first
    while nn % 2 == 0:
        nn //= 2
        ret += [2]
    while nn % 3 == 0:
        nn //= 3
        ret += [3]
    maxFactor = int(nn**0.5)
    # Prime factors after 2 and 3 will be of the form 6x+-1.
    # This is equivalent to alternately adding 2 and 4 to each
    # successive factor-candidate.
    oldnn = nn
    i = 5
    step = 2
    # (WordPress swallowed the comparison here; the loop below is
    # reconstructed from the comments above: test 6x+-1 candidates,
    # lowering the sqrt bound every time a factor is divided out)
    while i <= maxFactor:
        while nn % i == 0:
            nn //= i
            ret += [i]
        if nn != oldnn:
            maxFactor = int(nn**0.5)  # the target shrinks with nn
            oldnn = nn
        i += step
        step = 6 - step  # alternate +2, +4
    if nn != 1:
        ret += [nn]
    t = clock() - t
    if verbose:
        print "Calculated factors of", n, "in", t, "sec."
    return ret
```

And here is how it compares against factorAndSieve:

```
>>> from factorAndSieve import *
>>> factor(10**15+37,True)
Calculated factors of 1000000000000037 in 1.69 sec.
[1000000000000037]
>>> factorAndSieve(10**15+37,True)
Calculated factors of 1000000000000037 in 5.61 sec.
Stopped trial division at 31622775 instead of 31622776
[1000000000000037]
>>>
```

Perhaps a number that has a few large prime factors would produce different results?

But to me it appears that bailing out at the square root of the current number is at least as efficient as sieving as you go. Here’s why I think that is: The sieve-as-you-go strategy is to reduce the number of trial-divisions you must make. But finding primes for this purpose is every bit as expensive (perhaps more so) as those same trial-divisions you hoped to avoid. You must still iterate over your sieve looking for true values that indicate the presence of a prime, and every time you find a prime, you must go through the remainder of the sieve marking multiples false. This is a lot of work. The only thing that would really make your approach perform better would be if a trial-division was a much more expensive operation than the combination of several comparison branches, reads, and writes. The mod operation /is/ pretty expensive on most modern processors, but my guess is it’s not expensive enough to give your approach a serious competitive advantage.
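One way to put numbers on that trade-off without timing anything is to count candidate divisors. For n around 10^6 the square-root bound is about 1000, so plain odd trial division tests roughly 500 candidates while the primes-only approach tests π(1000) = 168, about three times fewer mod operations, bought at the cost of maintaining the sieve. A quick count (my sketch, not either commenter's code):

```python
n = 10**6 + 3
limit = int(n**0.5)  # 1000

# odd trial-division candidates: 2, plus every odd number in 3..limit
odd_candidates = 1 + len(range(3, limit + 1, 2))

# primes up to limit, via a small sieve of Eratosthenes
sieve = [True] * (limit + 1)
sieve[0] = sieve[1] = False
for p in range(2, int(limit**0.5) + 1):
    if sieve[p]:
        for k in range(p * p, limit + 1, p):
            sieve[k] = False
prime_candidates = sum(sieve)

print(odd_candidates, prime_candidates)  # -> 500 168
```

Whether the threefold reduction in divisions pays for the sieve bookkeeping is exactly the question the timings in the post are probing.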

WordPress seems to have rendered my code’s formatting ugly and deleted some important lines. I’ll see if there’s a place I can host it or something.