Faster sin/cos approximation

SLotman(Posted 2010) [#1]
After reading this page, I tried to port it to BMAX to see the difference from normal sin/cos functions; so here's the code:



And the output I get:
normal sin, 100000 times:2389
fast sin, 100000 times:1937
fast sin was 18% faster
------------------------------------
normal cos, 100000 times:2381
fast cos, 100000 times:1995
fast cos was 16% faster
------------------------------------
normal acos, 100000 times:15147
fast Acos, 100000 times:8299
fast Acos was 45% faster
------------------------------------


I read that this could be optimized even further (removing the branching, or even inlining it directly in the code instead of calling a function)...

So what do you think? What results do you get? Any ideas on how to speed it up even more? :)

Edit: Added a fastACos I found here, which yields an impressive 45% speed improvement!
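
The listing from this post is missing from this copy of the thread, so the following is not SLotman's code. As a rough illustration of the kind of technique being discussed, here is a minimal sketch of one well-known parabolic sine approximation. It is only an assumption that the ported code resembled it, and `ParabolaSin` is a hypothetical name. Apart from the range restriction noted in the comments, it needs no branching.

```blitzmax
SuperStrict

' Hypothetical helper: parabolic sine approximation, input in degrees.
' Only valid for -180 <= deg <= 180; the worst-case absolute error
' is roughly 0.056, so this trades accuracy for speed.
Function ParabolaSin:Float(deg:Float)
	Local t:Float = deg / 180.0          ' map -180..180 to -1..1
	Return 4.0 * t * (1.0 - Abs(t))      ' parabola through (0,0), (0.5,1), (1,0)
End Function

' Compare against the built-in Sin (which also takes degrees).
For Local d:Int = -180 To 180 Step 45
	Print "deg " + d + ": approx " + ParabolaSin(d) + "   exact " + Sin(d)
Next
```

Angles outside -180..180 would have to be wrapped into that range first, which is where a branch or a `Mod` usually comes back in.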


Zeke(Posted 2010) [#2]
normal sin, 100000 times:1772
fast sin, 100000 times:1243
fast sin was 29% faster
------------------------------------
normal cos, 100000 times:1731
fast cos, 100000 times:1234
fast cos was 28% faster
------------------------------------
normal acos, 100000 times:3450
fast Acos, 100000 times:693
fast Acos was 79% faster
------------------------------------

That's quite good.
"Any ideas on how to speed it up even more? :)"
Use assembly :D


Dreamora(Posted 2010) [#3]
If you want fast performance, it would make far more sense to tune the compiler flags, e.g. to use -ffast-math and the like.


EOF(Posted 2010) [#4]
From my EeePC901
normal sin, 100000 times:6489
fast sin, 100000 times:4416
fast sin was 31% faster
------------------------------------
normal cos, 100000 times:6520
fast cos, 100000 times:4603
fast cos was 29% faster
------------------------------------
normal acos, 100000 times:36932
fast Acos, 100000 times:22956
fast Acos was 37% faster



Zeke(Posted 2010) [#5]
But sometimes you need accuracy more than speed, so how does it do there?
I have tested it a few times, but I'm not sure, because I'm not so good with this kind of "math"...


Otus(Posted 2010) [#6]
If you just need the values for the 360 whole degrees, you should use a lookup table. If some loss of precision is fine, you could use a lookup plus linear correction (interpolating between neighbouring entries) with an appropriately sized table. Note that you can use the same table for both sin and cos if you want to.
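
A minimal sketch of the lookup plus linear correction idea, assuming a one-entry-per-degree table. All names here (`SinTable`, `LerpSin`, `LerpCos`, `TABLE_SIZE`) are hypothetical, and the table size is an arbitrary choice. Cosine reuses the sine table through cos(x) = sin(x + 90).

```blitzmax
SuperStrict

Const TABLE_SIZE:Int = 360                 ' one entry per degree (arbitrary choice)
' Two guard entries so that index i + 1 is always valid, even if
' rounding pushes a wrapped angle up to exactly 360.
Global SinTable:Float[TABLE_SIZE + 2]

For Local i:Int = 0 To TABLE_SIZE + 1
	SinTable[i] = Sin(i * 360.0 / TABLE_SIZE)
Next

' Sine of an angle in degrees: table lookup plus linear interpolation.
Function LerpSin:Float(deg:Float)
	deg = deg Mod 360.0                    ' wrap into (-360, 360)
	If deg < 0 Then deg :+ 360.0           ' then into [0, 360]
	Local pos:Float = deg * TABLE_SIZE / 360.0
	Local i:Int = Int(pos)                 ' pos >= 0, so truncation equals floor
	Local frac:Float = pos - i             ' distance past entry i, 0..1
	Return SinTable[i] + (SinTable[i + 1] - SinTable[i]) * frac
End Function

' Cosine from the same table.
Function LerpCos:Float(deg:Float)
	Return LerpSin(deg + 90.0)
End Function

Print "LerpSin(30.5) = " + LerpSin(30.5) + "   Sin(30.5) = " + Sin(30.5)
Print "LerpCos(123.4) = " + LerpCos(123.4) + "   Cos(123.4) = " + Cos(123.4)
```

A larger `TABLE_SIZE` reduces the interpolation error at the cost of memory. Whether this beats the built-in functions would need measuring on the target machine.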

-ffast-math won't help in Blitz code. Any benchmark should use random values instead of sequential ones, since cache behavior and branch prediction are much more favorable with sequential data. (Unless sequential is the expected use case...)
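
One way the random-input benchmark could look, as a sketch only. The sample and pass counts are arbitrary, and the variable names are hypothetical. The angles are generated before the timer starts so that `Rnd` is not part of the measurement.

```blitzmax
SuperStrict

Const SAMPLES:Int = 100000     ' arbitrary number of random angles
Const PASSES:Int = 100         ' repeat so the elapsed time is measurable

' Pre-generate the inputs outside the timed section.
SeedRnd MilliSecs()
Local angles:Float[SAMPLES]
For Local i:Int = 0 Until SAMPLES
	angles[i] = Rnd(0.0, 360.0)
Next

' Accumulate the results so the calls have a visible effect.
Local sum:Float = 0
Local start:Int = MilliSecs()
For Local p:Int = 1 To PASSES
	For Local i:Int = 0 Until SAMPLES
		sum :+ Sin(angles[i])
	Next
Next
Local elapsed:Int = MilliSecs() - start

Print "Sin on random angles: " + elapsed + " ms (checksum " + sum + ")"
```

To compare an approximation or a table lookup, the same loop can be repeated with that function in place of `Sin`, reusing the same `angles` array so both see identical inputs.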


ImaginaryHuman(Posted 2010) [#7]
Calculate a lookup table, i.e. precalculate a whole bunch of results into an array. It's sure to be faster to just read a float from memory than to do all the calculations every time. The one caveat is that the accuracy is limited by how many entries you store in memory.


Midimaster(Posted 2010) [#8]
If you really only need the values for the 360 whole-number degrees, as in your code sample, try a table and you will be 6 to 8 times faster than your code:

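Global FastSin#[361]

' Precalculate Sin for every whole degree, 0 to 360
For i%=0 To 360
   FastSin[i]=Sin(i)
Next

' Time the lookups (0 To 100000 is inclusive, so 100001 passes)
zeit%=MilliSecs()
For f2%=0 To 100000
	For f%=0 To 360
		s#=FastSin[f]   ' s must be a float, otherwise the value is truncated to an integer
	Next
Next
Print MilliSecs()-zeit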


If you need more accuracy you can do it the same way, but with more entries. Here is an example with a resolution of 1/1000 of a degree:

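Global FastSin#[360001]

' Precalculate Sin in steps of 1/1000 degree (Winkel = angle in degrees)
For i%=0 To 360000
   Winkel#=i/1000.0
   FastSin[i]=Sin(Winkel)
Next

' Time the lookups; only whole degrees are sampled here (index f*1000)
zeit%=MilliSecs()
For f2%=0 To 100000
	For f%=0 To 360
		s#=FastSin[f*1000]   ' s must be a float, otherwise the value is truncated to an integer
	Next
Next
Print MilliSecs()-zeit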


But this will only be 3 to 4 times faster than your code!
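
The timing loop above only reads whole-degree entries. As a sketch of how a fractional angle could be mapped onto such a 1/1000-degree table, assuming nearest-entry rounding and wrap-around for angles outside 0..360 (the names `FineSin`, `TableSin`, `STEPS_PER_DEGREE` and `ENTRIES` are hypothetical):

```blitzmax
SuperStrict

Const STEPS_PER_DEGREE:Int = 1000
Const ENTRIES:Int = 360 * STEPS_PER_DEGREE
Global FineSin:Float[ENTRIES + 1]

' Precalculate Sin in steps of 1/1000 degree.
For Local i:Int = 0 To ENTRIES
	FineSin[i] = Sin(i / Float(STEPS_PER_DEGREE))
Next

' Nearest-entry lookup for an angle in degrees (any sign, any magnitude).
Function TableSin:Float(deg:Float)
	' Round to the nearest table step.
	Local idx:Int = Int(Floor(deg * STEPS_PER_DEGREE + 0.5))
	idx = idx Mod ENTRIES              ' wrap; stays negative for negative angles
	If idx < 0 Then idx :+ ENTRIES     ' shift into 0..ENTRIES-1
	Return FineSin[idx]
End Function

Print "TableSin(12.345) = " + TableSin(12.345) + "   Sin(12.345) = " + Sin(12.345)
Print "TableSin(-450) = " + TableSin(-450) + "   Sin(-450) = " + Sin(-450)
```

The rounding and wrapping add work per call, so the speed-up over the built-in `Sin` would likely be smaller than in the whole-degree loop and is worth measuring.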