1,000,000 primes in 6.9 seconds!
BlitzMax Forums/BlitzMax Programming/1,000,000 primes in 6.9 seconds!
| ||
so I was reading some wiki articles on number theory and prime numbers and I had an idea for a faster prime number program. This one finds the first 1,000,000 prime numbers in edit: new faster method Global num:Int = 15485864 Global sqrnum:Int = Sqr(num)+1 Global PrimeFlags:Byte[num] Global Primes:Int[num] Local tim:Int Local mill:Int = MilliSecs() For Local x:Int = 2 Until sqrnum If primeflags[x] = 0 Then tim = x Repeat tim:+x primeflags[tim] = True Until x+tim => num EndIf Next Local count:Int For Local y:Int = 2 Until num If primeflags[y] = False Then primes[count] = y count:+1 EndIf Next mill = MilliSecs()-mill Print mill/1000.0 +" seconds" Rem For Local i:Int = 0 Until count Print primes[i] Next EndRem Print count + " primes found!" heres the old code: Local PR:Int = 3 Local primelist:Int[1000001] Local prcnt:Int = 1 primelist[0] = 2 Local n:Int = 0 Local flag:Byte = False Local c:Int = MilliSecs() Repeat n = 0 flag = False While primelist[n]*primelist[n] <= PR If ((pr Mod primelist[n])=0) Then flag = True Exit EndIf n:+1 Wend If Not flag primelist[prcnt] = pr prcnt:+1 EndIf pr:+2 Until prcnt = 1000000 c = MilliSecs()-c 'For i:Int = 0 Until 1000000 ' Print primelist[i] 'Next Print "First 1,000,000 primes in "+c/1000.0+" seconds!" UPDATED if you want to see the numbers without risking freezing your computer, you could just display the first 1000 or the last 1000 be sure NOT to run in DEBUG MODE Last edited 2010 Last edited 2010 Last edited 2010 Last edited 2010 Last edited 2010 |
| ||
Just out of curiosity, what processor do you have in your PC? It takes 12.91 seconds on my computer (Core 2 Duo E6600 @ 2.4GHz, windows 7 x64) |
| ||
core 2 duo t8300 @ 2.4 ghz & 2.4 ghz this is what my control panel says... although I thought it was 2.6 ghz until now... and its win vista 32 bit make sure you are not in debug and you have no other programs running... i consistently get 6.9 something, never more or less than that for the first 2 digits ok just did it in debug and got 14.5 seconds.. maybe thats youre problem? youre processor http://ark.intel.com/Product.aspx?id=27250 mine http://ark.intel.com/Product.aspx?id=33099 they look almost identicle to my untrained eye edit: looks like the time it takes to do this is more affected by read/write speeds for your ram considering it reads a ton of prime numbers off of your ram for every number it tests. Last edited 2010 |
| ||
First 1,000,000 primes in 5.90299988 seconds! On bootcamped XP! Intel i3 CPU 550@... Dabz |
| ||
I got 15.5 seconds by running this code on my laptop. (non-debug mode) With debug mode on, I got 23.97 seconds. Win Vista 32 bit, processor: Core 2 Duo T7200 @ 2GHz. Last edited 2010 |
| ||
Just for comparison, I ran this on my iMac (2.66 GHz Intel Core 2 Duo) with OSX 10.5.8 I got 6.339 seconds for non-debug and 15.796 seconds for debug mode. |
| ||
If you want a challenge make a multithreaded one! |
| ||
The oiginal code ran in 8.71700001 seconds on a Core 2 Duo T6400. Changing the line If (pr/(primelist[n]*1.0)) = Int(pr/primelist[n]) Then to If ((pr Mod primelist[n])=0) Then made it run in 2.86400008 seconds instead :p Last edited 2010 Last edited 2010 |
| ||
It took 2hrs and 23minutes on my Vic 20 :( |
| ||
@zzz thanks! I cant believe I overlooked that! thanks edit: updated! Last edited 2010 |
| ||
:D First 1,000,000 primes in 1.79700005 seconds! |
| ||
hmmm translated it to ti-basic and it takes at least several hours if not days on my calculator. I didnt want to run my batteries down! |
| ||
:D Last edited 2010 |
| ||
translated to C: <5.809s> on my Slow 2GHz Core 2 Duo iMac :D BMax: <5.918s> on the same machine iMac, Intel Core 2 Dou 2.0 GHZ <update> C: <5.726s> with LLVM 1.5 on same Machine :D Last edited 2010 |
| ||
I guess my computer sucks. I've got a (dual) 2.5ghz cpu, and in both win 7 (32bit) and ubuntu (32bit) it runs in 10 seconds, over and over again. hmm, time to upgrade I guess. ;( |
| ||
After the code-change (suggested by zzz), it now runs in 5.97 seconds instead of 15.5 seconds on my laptop. |
| ||
Nate, or someone else, here is the sieve of eratosthene. Super fast. Done in 0.962000012 seconds. Total primes generated: 1000000 This is the time on my computer. Can someone help me clean up the code. It's needs to be better able to find a specific number of primes. But I think you'll agree this is much faster. After the prime arrary is built, it could be turned into a bit table and take up far less ram. The end for loop is to print sections of the prime array to verify the numbers are actually primes. Have fun. P.S. Don't program when your drunk! ha ha SuperStrict Const stopme:Int=100000000 Local flags:Int[stopme+1] Local i:Int Local theprimes:Int[1000000] For i=0 To stopme flags[i]=-1 Next Local z:Int=0 Local endprime:Int=15485863 Local test:Int=2 Local x:Int=0 Local count:Int=0 z=MilliSecs() For Local test:Int=2 To endprime If flags[test]<>-1 Then Continue EndIf x=test If flags[test]=-1 Then theprimes[count]=test count=count+1 flags[test]=test EndIf Repeat flags[test+x]=test x=x+test Until x>=endprime Next z=MilliSecs()-z Print "Done in "+z/1000.0+" seconds." Print "Total primes generated: "+count For i=25 To 50 Print theprimes[i] Next Last edited 2010 Last edited 2010 Last edited 2010 Last edited 2010 Last edited 2010 |
| ||
Hmm. my bit array for the first 1,000,000 primes is still 125,000 bytes long.... :) |
| ||
thats interesting, I dont see how using that method would be twice as fast as mine... but i get 1.17 seconds with yours which is twice as fast as mine and 7 times as fast as the original! :p I wonder what the fastest way ever though of is to find primes... I cant seem to find many methods for finding prime numbers. |
| ||
Isn't there something called the "Sieve of Eratosthenes " or something - very quick to find primes.... http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes |
| ||
Um, LOL, Matty, that's what I programmed. Not very well, but it does work. Nate: The sieve is super fast because it's doing what the computer does best: count... (no division, just simple memory copying and very simple condition loops...) Using elipitical curves is the fastest method for large numbers, but not small numbers. Sieve of Eratosthenes is the fastest for most primes, but of course you have to have the memory to build the array. For my program (badly written) it only needs: Total Loops: 47,086,551 Your program needs: Total Loops: 504,508,950 The only real fast way, is to create a table (bit database) of precalculated primes and just do a lookup. That is really the fastest way if your program is going to need to access primes very frequently. (Factoring for example.) If your interested start here: http://primes.utm.edu/prove/ And if you want to make your head hurt go here: http://polymathprojects.org/category/finding-primes/ And of course you could also start with the obvious: http://en.wikipedia.org/wiki/Integer_factorization :D Last edited 2010 |
| ||
Sorry Shortwind, I didn't read all the posts (my bad...) |
| ||
I doubt this will make much difference to anyone here as everyone seem to have pretty fast computers but by removing the second if statement in the for loop and removing one unnecessary calculation in the last demo, I was able to reduce the time on my wife's ancient imac 1.2ghz from 13 secs to 7.5 secs. I tried it in my netbook and it was prety much unchanged.SuperStrict Const stopme:Int=100000000 Local flags:Int[stopme+1] Local i:Int Local theprimes:Int[1000000] For i=0 To stopme flags[i]=-1 Next Local z:Int=0 Local endprime:Int=15485863 Local test:Int=2 Local x:Int=0 Local count:Int=0 z=MilliSecs() For Local test:Int=2 To endprime If flags[test]<>-1 Then Continue EndIf x=test theprimes[count]=test count=count+1 flags[test]=test Repeat x:+test flags[x]=test Until x>=endprime Next z=MilliSecs()-z Print "Done in "+z/1000.0+" seconds." Print "Total primes generated: "+count For i=25 To 50 Print theprimes[i] Next Last edited 2010 |
| ||
LOL, thanks Jesse. I didn't see that until you pointed it out. :D It's better now, but I still code for crap on the fly. Ha! Ha! :D |
| ||
Thats nice :p Feels a little weird that 10x more iterations only doubles the time taken. Also, theres a sieve of atkin (sp?) around, which, if i dont remember wrong uses some clever way to calculate prime candidates instead of just working through the entire 2..n range like we are doing here. |
| ||
Wrote a simple sieve of atkin based on the wikipedia pseudocode; The original function runs in about 2,9 seconds, the "other sieve" :o takes about 1,4 seconds on average and this one does the work in just under 0,8 for me :D Supposedly this one can be optimized dramatically also. |
| ||
I completely forgot about this one. This sieve only needs 16,911,909 loops to find 1,000,862 primes. Nice! And doesn't require the all the extra memory that sieve of Eratosthenes. Cool. On one final note it should be known that all primes greater than 6 are of the form: 6k + or - 1 :D Last edited 2010 |
| ||
0.4 seconds! I cant believe how much faster its gotten. This is kind of a mix of the 4 ideas presented above by zzz, jessie and shortwind and me. I don't know what you would call it since I just combined ideas but its a sieve I guess. Global num:Int = 15485864 Global sqrnum:Int = Sqr(num)+1 Global PrimeFlags:Byte[num] Global Primes:Int[num] Local tim:Int Local mill:Int = MilliSecs() For Local x:Int = 2 Until sqrnum If primeflags[x] = 0 Then tim = x Repeat tim:+x primeflags[tim] = True Until x+tim => num EndIf Next Local count:Int For Local y:Int = 2 Until num If primeflags[y] = False Then primes[count] = y count:+1 EndIf Next mill = MilliSecs()-mill Print mill/1000.0 +" seconds" Rem For Local i:Int = 0 Until count Print primes[i] Next EndRem Print count + " primes found!" Last edited 2010 Last edited 2010 Last edited 2010 |
| ||
0.52 seconds here, Nate is on top again :p EDIT: just figured out how i can make some massive improvements on the atkin sieve ;D Last edited 2010 |
| ||
8 core 2010 2.4ghz Mac pro (under OS X, under vista in bootcamp first post variant was actually a tiny bit slower, tough oddly the same install through parallels was the same as the mac... very wierd) First post variant (current form with the first optimization update) 2.48600006 solidly nate's release 2 posts from here up 0.186000004 to 0.189999998 running stuff in the background but that shouldn't matter... would REALLY love to see a multi-threaded variant... 8 cores, with hyper-threading wants to know... :0) Last edited 2010 Last edited 2010 |
| ||
nevermind this post, was just OS randomness Last edited 2010 |
| ||
@im747, couldnt see any difference in the generated assembly so i guess its just a 'placebo' effect :) |
| ||
Got the aktin sieve down to 0.58 seconds with some basic optimizations, Nates latest version is still slightly faster :p Last edited 2010 |
| ||
Slight improvement on nates latest code, keep finishing at around 0,49 seconds for me. Will probably run slower on older machines. |
| ||
first 10,000,000 primes in 6.0 secondsSuperStrict Framework brl.standardio Import brl.math Global num:Int = 179424674 Global sqrnum:Int = Sqr(num)+1 Global PrimeFlags:Byte[num] Global Primes:Int[num] Local tim:Int Local mill:Int = MilliSecs() For Local x:Int = 2 Until sqrnum If primeflags[x] = 0 Then tim = x Repeat tim:+x primeflags[tim] = True Until x+tim => num EndIf Next Local count:Int=0 primes[count]=2 count:+1 For Local y:Int = 3 Until num Step 2 primes[count]=y count:+(primeflags[y]=False) Next mill = MilliSecs()-mill Print mill/1000.0 +" seconds" Rem For Local i:Int = 0 Until count Print primes[i] Next EndRem Print count + " primes found!" thanks for the optimization zzz |
| ||
Sweet 2.87800002 seconds 10000000 primes found! |
| ||
Nice :P Finished in 7.13600016 seconds for me |
| ||
Modified Nates a bit, finishes in around 6,85 for me now. |
| ||
A real mess, i tend to not delete any code i write when "scribbling" like this, instead putting comments all over the place :P Anyways, nates latest sieve inspired me to write this one, the speed increase is pretty much only from reduced memory overhead. Churns out a million primes in ~0,145 seconds and 10 million in ~4,9 seconds on my core2duo t6400 (2ghz). EDIT: cleaned up the code EDIT2: small change suggested by Czar EDIT3: changed the prime counting code a little, now doing ~0,11s/1m primes for me. Last edited 2010 Last edited 2010 Last edited 2010 |
| ||
Hey zzz, what are the masks for? I dont see how the masks are used to optimize it? |
| ||
Will making the masks constant improve speed more? |
| ||
I used the div/mod mask to get rid of division. cpus calculate the remainder as a byproduct when doing division so the div/idiv opcodes gets used for either operation, and theyre really slow.. And with a fixed known value its pretty straightforward to use bitwise operators instead. EDIT: that doesnt really have much impact on the code, at least not in a positive way. the main speed increase comes from reducing the memory footprint by using bits for flags instead of bytes. The extra fiddling is outweighted by the reduced memory overhead by a lot :) (and thats also a big reason to why it isnt much faster on the 10m prime test) The third 'mask' is just a quickie way to mark of all even numbers, 16 numbers at a time :o @Czar: that works indeed, got it down to ~0,12seconds/1m primes for me :) sample code: Last edited 2010 Last edited 2010 Last edited 2010 Last edited 2010 Last edited 2010 |
| ||
Its now doing 1m primes in under 0,1 seconds for me :P, getting steady 0,097/0,099 second results. 10m primes in ~4,5 seconds. Last edited 2010 |
| ||
This is very exciting watching this thread over the last few days. zzz, Nate, super fun! Still analyzing your last bit of code zzz. A pretty original line of thought there! :D |
| ||
wow zzz, 3.9 seconds, 10,000,000 primes found. I believe we have reached a sort of road block for efficiency of counting primes but zzz may prove me wrong. (it may be faster in asm hint hint haha) Anyway, I hadn't touched bmax for a while so this helped me really get into it and pick it back up. thanks! |
| ||
@Shortwind, i realized it was the way to go when i messed around with nates latest creation. changing the flags array to ints instead of bytes made it way slower, so figured it would work the other way around too. @Nate, try doing 1m primes instead, it will be way faster since the 10m prime test re-introduces the memory issue i tried to get rid of.. :) I dont think ill mess around much more with this. Assembly wont really help since its mostly a memory issue now, the algo touches memory all over the place over and over again so it is most likely racking up a huge number of chache/page mispredictions. A single L2 misprediction can cost over 200 cycles accordingly to google, and for reference i think an ordinary integer addition is a one cycle ordeal. One could probably rewrite the code to work around this in some way, but i cant really come up with any ideas atm. |
| ||
Expanded a bit on the preliminary culling, the code gives a pretty solid average of 0,075 seconds / 1m primes :D Last edited 2010 |
| ||
With the above code: 0.009200... seconds. i7 extreme OC'd :D |
| ||
0.128000006 seconds 1000000 primes found! i7 950 @ original 3.06GHz (non-debug) |
| ||
This thread is fantastic, well done all! Fascinating. |
| ||
0.0540000014 1000000 primes found i7 920 @ 2.98 GHZ (non-debug) |
| ||
Had a bunch of school-related stuff to finish today, so figured the sensible thing to do was to write yet another sieve to post here. Same as the last one pretty much. Continuing on the memory footprint philosophy i threw out even numbers completely. EDIT: on my computer it does 30m primes in just under 10 seconds, where 10m takes just over 2 seconds, so it scales fairly well this time. Set limit to 573259392 if you want to try :) EDIT2: updated the prime number counting code so now it should run considerably faster on the 1m prime test too. For me its slightly more than twice as fast as the last one i posted. Last edited 2010 Last edited 2010 |
| ||
Am I the only one who still have some interest in this? Nate, Jesse, Shortwind? :) You guys probably have better things to do (im studying ~2days a week atm so i dont), just wondering if I should continue to post updates etc. I could try and clarify whats going on in the code a bit if anyone wants me to. I havent done that much more than obfuscating the code from the original sieve so its not hard to understand the logic in any way. Last edited 2010 |
| ||
Sorry, zzz, haven't lost interest, just been busy. I believe the code we have in this thread is very usefull, and a good learning experience. What I'd suggest now is that you modify your code to quickly, with reasonable memory requirements, find the X'th prime number, the next prime, and the previous prime. (If you feel so inclined.) In other words, if I tell your program that I want the 1,578th prime number, how fast can your program return that number? If I give your program the number 95,456, how fast can it tell me the next or previous prime number? And finally, (you'll have to start a new thread), what is the fastest method in BlitzMax for "factoring" a number into it's prime elements? :D |
| ||
That functionality would come as a bonus if i could get a segmented sieve to work properly (and fast enough). Youd still have a bit of overhead but it would be able to quite quickly calculate enough primes to be able to answer those questions :) Ive been thinking about upping the performance of the sieve by packing values.. The last piece of code excludes even numbers completely, so: a) I get a smaller memory footprint since the algorithm needs one bit per two values in the 1-limit range when culling prime candidates. b) The culling code can be made much more efficient. Initially i hade the problem of prime multiples not existing in the flag array. For example the first multiple of 5 is 10, which is even so it doesnt exist as far as the sieve is concerned. This one was pretty easy to figure out and had the nice side-effect of making the culling loop twice as fast since i could skip every second multiple completely. Ive been tinkering a bit with excluding multiples of 2,3,5 and 7 in the same way, but this makes the 1-limit range non-linear as opposed to just excluding odd numbers.. It would give the flag array a value density of around 440% (ie, 4,4values per bit) tho if i figured it out, which would give a very nice performance boost :D After i nail those two ill get a factorization thread going :P Last edited 2010 |
| ||
Proof of concept for a segmentet sieve. Seems to work properly so time to trim the code a bit i guess. Also opens up for multithreading which is always nice |
| ||
Messed around a bit today with this and managed to get a segmented sieve going. The code needs some major cleaning up which i dont feel like doing now so heres some pretty numbers to look at instead: edit: changed my mind, heres the code instead. Bit of warning that while it churns out primes pretty fast (100000000th prime took 12sec for me) it gobbles up lots of memory.. The TARGET constant is what prime the code will find so mess around with that. edit2: small optimization. its currently doing 10m primes in just over 0,7 seconds for me :P Last edited 2010 Last edited 2010 Last edited 2010 |
| ||
A bit of thread necromancy here. I started refactoring the above code into something useful but lost interest soon afterwards.. Remembered the topic and figured id post it here anyways if someone is interested. Feel free to try it and post some execution times :) |
| ||
The 100000000th prime is 2038074743. ( total running time: 8.94799995 seconds ) from my company notebook (Intel Core 2 Duo, 2GHz, 2 GB RAM) The 100000000th prime is 2038074743. ( total running time: 7.98600006 seconds ) from my Intel Core 2 6400 2,13 GHz, 2 GB RAM (growing old now) -- Peter Last edited 2011 |
| ||
The 100000000th prime is 2038074743. ( total running time: 5.26900005 seconds ) work pc: i3 3.06GHz |
| ||
The 100000000th prime is 2038074743. ( total running time: 6.32000017 seconds ) Mac Pro (mid 2010) 8 core (2x4 2.4ghz Xenon) with a few things in the background (shouldn't make much of a difference, extra cores and such) Wonder if memory/bus speed would start having an impact with that high a load generated in such a short time... |
| ||
Hi Nate the Grate! It works just great on my laptop: 1.48599994 seconds 1000000 primes found! My computer type is: Intel Core 2 CPU T5200 @ 1.6 Ghz speed (Centrino Duo) HP Pavilion DV6245us 4GB memory, 32-bit, Windows 7 |
| ||
The 100000000th prime is 2038074743. ( total running time: 3.68799996 seconds ) 64-bit Quad Core i5-3450 @3.1GHz 12GB RAM 64-bit Windows 7 Home Premium SP 1 AMD Radeon HD7770 1GB Using the edited code in first post: 0.194999993 seconds 1000000 primes found! Last edited 2012 |