say that I’ve truly enjoyed browsing your blog posts. After all, I’ll be subscribing to your feed and I hope you write again very soon! ]]>

Very good tutorial.

I would like to know how to do this for a PDB (Protein Data Bank) file.

Thanks and regards,

]]>Here is the code, if you’re interested:

# Assumed import: the listing does not show where clock() comes from,
# and time.clock() no longer exists in Python 3.8+.
from time import perf_counter as clock

def factor(n, verbose=False):
    """
    Returns all prime factors of n, using simple trial division.

    Returns a list of (possibly repeating) prime factors.
    """
    t = clock()
    ret = []
    nn = n
    # Remove 2's and 3's first
    while nn % 2 == 0:
        nn //= 2
        ret += [2]
    while nn % 3 == 0:
        nn //= 3
        ret += [3]
    maxFactor = int(nn**0.5)
    # Prime factors after 2 and 3 will be of the form 6x+-1.
    # This is equivalent to alternately adding 2 and 4 to each
    # successive factor-candidate.
    oldnn = nn
    i = 5
    while i ...

>>> from factorAndSieve import *

>>> factor(10**15+37, True)
Calculated factors of 1000000000000037 in 1.69 sec.
[1000000000000037]
>>> factorAndSieve(10**15+37, True)
Calculated factors of 1000000000000037 in 5.61 sec.
Stopped trial division at 31622775 instead of 31622776
[1000000000000037]
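The factor listing above breaks off inside its main loop. For readers who want something runnable, here is a minimal sketch of how a 6x+-1 trial division with a shrinking square-root bound might be completed. It is not the original code: the name `factor_sketch` is hypothetical, the timing and verbose output are left out, and `math.isqrt` is swapped in for `int(nn**0.5)` to avoid floating-point rounding on large inputs.

```python
from math import isqrt


def factor_sketch(n):
    """Return the prime factors of n (with repetition) by trial division.

    Hypothetical reconstruction, not the commenter's original function.
    """
    if n < 2:
        return []
    factors = []
    # Remove 2's and 3's first.
    for p in (2, 3):
        while n % p == 0:
            n //= p
            factors.append(p)
    # Remaining candidates have the form 6x+-1: 5, 7, 11, 13, ...
    # which is the same as alternately adding 2 and 4.
    i, step = 5, 2
    limit = isqrt(n)
    while i <= limit:
        if n % i == 0:
            while n % i == 0:
                n //= i
                factors.append(i)
            # Bail-out bound: square root of the *current* number.
            limit = isqrt(n)
        i += step
        step = 6 - step
    # Whatever is left above 1 is itself prime.
    if n > 1:
        factors.append(n)
    return factors
```

Recomputing the bound only after a factor is found keeps the loop cheap: when the remaining number is prime, the loop ends as soon as the candidate passes its square root.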

Perhaps a number that has a few large prime factors would produce different results?

But to me it appears that bailing out at the square root of the current number is at least as efficient as sieving as you go. Here’s why: the point of sieving as you go is to reduce the number of trial divisions you must make, but finding the primes for this purpose is every bit as expensive (perhaps more so) as the trial divisions you hoped to avoid. You must still iterate over your sieve looking for true values that indicate a prime, and every time you find one, you must go through the remainder of the sieve marking its multiples false. That is a lot of work. Your approach would only come out ahead if a trial division were much more expensive than the combination of several comparison branches, reads, and writes that replaces it. The mod operation /is/ pretty expensive on most modern processors, but my guess is that it’s not expensive enough to give your approach a serious competitive advantage.
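To make that cost argument a little more concrete, here is a rough sketch that counts the work on each side for a given bound. It rests on assumptions: it uses a plain Sieve of Eratosthenes rather than the sieve-as-you-go code from the post, the function names `wheel_candidates` and `sieve_work` are made up for illustration, and it counts operations rather than measuring time.

```python
def wheel_candidates(limit):
    """Count the 6x+-1 trial divisors in the range [5, limit]."""
    count, i, step = 0, 5, 2
    while i <= limit:
        count += 1
        i += step
        step = 6 - step
    return count


def sieve_work(limit):
    """Run a plain Sieve of Eratosthenes up to limit (limit >= 1).

    Returns (primes found, sieve reads while scanning, writes while marking).
    """
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    primes = reads = writes = 0
    for p in range(2, limit + 1):
        reads += 1  # one read per scanned cell
        if is_prime[p]:
            primes += 1
            # Mark the multiples of p, starting at p*p, as composite.
            for m in range(p * p, limit + 1, p):
                is_prime[m] = False
                writes += 1
    return primes, reads, writes
```

For a bound of 100, `wheel_candidates(100)` gives 32 trial divisors, while `sieve_work(100)` finds 25 primes at the price of 99 reads and 104 writes. The sieve saves only a handful of mod operations and pays for them with many more memory operations, which is the trade-off described above. How that plays out in wall-clock time depends on the machine and the interpreter.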

]]>