Faster sin/cos approximation
| ||
After reading this page, I tried to port it to BMAX to see the difference from the normal sin/cos functions, so here's the code:

And the output I get:

normal sin, 100000 times:2389
fast sin, 100000 times:1937
fast sin was 18% faster
------------------------------------
normal cos, 100000 times:2381
fast cos, 100000 times:1995
fast cos was 16% faster
------------------------------------
normal acos, 100000 times:15147
fast Acos, 100000 times:8299
fast Acos was 45% faster
------------------------------------

I read that this could be optimized even further (no branching, and even pasting it directly into the code instead of calling a function)... So what do you think? What results do you get? Any ideas on how to speed it up even more? :)

Edit: Added a fastACos I found here, which yields an impressive 45% speed improvement!
| ||
normal sin, 100000 times:1772
fast sin, 100000 times:1243
fast sin was 29% faster
------------------------------------
normal cos, 100000 times:1731
fast cos, 100000 times:1234
fast cos was 28% faster
------------------------------------
normal acos, 100000 times:3450
fast Acos, 100000 times:693
fast Acos was 79% faster
------------------------------------

That's quite good.

"Any ideas on how to speed it up even more? :)"

Use assembly :D
| ||
If you want fast performance, it would make dozens of times more sense to optimize the compiler flags, e.g. use -ffast-math and the like.
| ||
From my EeePC 901:

normal sin, 100000 times:6489
fast sin, 100000 times:4416
fast sin was 31% faster
------------------------------------
normal cos, 100000 times:6520
fast cos, 100000 times:4603
fast cos was 29% faster
------------------------------------
normal acos, 100000 times:36932
fast Acos, 100000 times:22956
fast Acos was 37% faster
| ||
But sometimes you will need accuracy more than speed, so how does it do there? I have tested it a few times, but I'm not sure, because I'm not so good with this kind of "math"...
| ||
If you only need the values for the 360 whole degrees, you should use a lookup table. If some loss of precision is fine, you could use a lookup plus linear correction with an appropriately sized table. Note that you can use the same table for both sin and cos if you want to.

-ffast-math won't help in Blitz code.

Any benchmark should use random values instead of sequential ones, since caching and branch prediction work much better on sequential data. (Unless sequential access is the expected use case...)
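The "lookup + linear correction" idea, with one table shared by sin and cos, could look roughly like the sketch below. The names `SinTable`, `LerpSin` and `LerpCos` and the choice of one entry per degree are assumptions made for illustration, not code from this thread.

```blitzmax
SuperStrict

' Assumption: one table entry per whole degree.
Const TABLE_SIZE:Int = 360

' One extra entry so that idx + 1 is always a valid index.
Global SinTable:Float[] = New Float[TABLE_SIZE + 1]
For Local i:Int = 0 To TABLE_SIZE
	SinTable[i] = Sin(i)	' BlitzMax Sin takes degrees
Next

' Table lookup with linear correction between the two neighbouring entries.
Function LerpSin:Float(deg:Float)
	' Wrap the angle into [0, 360)
	deg = deg Mod 360.0
	If deg < 0 Then deg :+ 360.0
	If deg >= 360.0 Then deg = 0	' guard against float rounding at the upper edge
	Local idx:Int = Int(deg)	' whole degrees (deg is non-negative here)
	Local frac:Float = deg - idx	' fractional part in [0, 1)
	Return SinTable[idx] + (SinTable[idx + 1] - SinTable[idx]) * frac
End Function

' Cos reuses the same table, since cos(x) = sin(x + 90).
Function LerpCos:Float(deg:Float)
	Return LerpSin(deg + 90.0)
End Function

' Rough accuracy check against the built-in Sin in 0.01 degree steps.
Local maxErr:Double = 0
For Local k:Int = 0 To 36000
	Local a:Float = k / 100.0
	Local e:Double = Abs(LerpSin(a) - Sin(a))
	If e > maxErr Then maxErr = e
Next
Print "max abs error: " + maxErr
```

For a step of h radians, linear interpolation of sin has an error of at most h^2/8, which for one-degree steps is about 0.00004. A finer table shrinks that quadratically, at the cost of memory.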
| ||
Calculate a lookup table, i.e. precalculate a whole bunch of results into an array. It's sure to be faster to just read a float from memory than to do all the calculations every time. The one caveat is that the accuracy is limited by how much memory you spend on the table.
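Whether reading from memory really wins depends on the access pattern, as noted earlier in the thread. Here is a minimal sketch of a timing test that indexes the table with random angles instead of sequential ones. `COUNT`, `SinTable` and `angles` are made-up names, and the loop count is arbitrary.

```blitzmax
SuperStrict

Const COUNT:Int = 1000000	' number of lookups, an arbitrary choice

' Precalculated table, one entry per whole degree.
Global SinTable:Float[] = New Float[361]
For Local i:Int = 0 To 360
	SinTable[i] = Sin(i)
Next

' Prepare the random angles up front so Rand is not part of the timed loops.
Local angles:Int[] = New Int[COUNT]
SeedRnd MilliSecs()
For Local i:Int = 0 Until COUNT
	angles[i] = Rand(0, 360)
Next

Local sum:Double = 0	' accumulate the results so the loops do real work

Local t:Int = MilliSecs()
For Local i:Int = 0 Until COUNT
	sum :+ Sin(angles[i])
Next
Print "built-in Sin: " + (MilliSecs() - t) + " ms"

t = MilliSecs()
For Local i:Int = 0 Until COUNT
	sum :+ SinTable[angles[i]]
Next
Print "table lookup: " + (MilliSecs() - t) + " ms"

Print "checksum: " + sum
```

The `angles` array itself is still read sequentially; only the table index is random. With a table this small everything fits in cache, so the difference from a sequential test should show up mainly with much larger tables.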
| ||
If you really only need the values for whole integer degrees, as in your code sample, try a table and you will be 6 to 8 times faster than your code:

    Global FastSin#[361]
    ' Fill the table once, one entry per whole degree
    For i%=0 To 360
        FastSin[i]=Sin(i)
    Next
    zeit%=MilliSecs()
    For f2%=0 To 100000
        For f%=0 To 360
            s#=FastSin[f]    ' s# keeps the result a float instead of truncating to int
        Next
    Next
    Print MilliSecs()-zeit

If you need more accuracy you can do it the same way, but with more entries. Here is an example based on 1/1000 degrees:

    Global FastSin#[360001]
    ' One entry per 1/1000 degree
    For i%=0 To 360000
        Winkel#=i/1000.0    ' Winkel = angle in degrees
        FastSin[i]=Sin(Winkel)
    Next
    zeit%=MilliSecs()
    For f2%=0 To 100000
        For f%=0 To 360
            s#=FastSin[f*1000]
        Next
    Next
    Print MilliSecs()-zeit

But this will only be 3 to 4 times faster than your code!
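To see what the extra entries buy in accuracy, a sketch like the following could compare both table sizes for arbitrary (non-integer) angles using a nearest-entry lookup. `BuildTable` and `MaxError` are hypothetical helpers written for this illustration, and the sample count is arbitrary.

```blitzmax
SuperStrict

' Builds a sine table with stepsPerDegree entries per degree (hypothetical helper).
Function BuildTable:Float[](stepsPerDegree:Int)
	Local n:Int = 360 * stepsPerDegree
	Local table:Float[] = New Float[n + 1]
	For Local i:Int = 0 To n
		table[i] = Sin(Double(i) / stepsPerDegree)
	Next
	Return table
End Function

' Largest absolute error of a nearest-entry lookup, sampled at random angles.
Function MaxError:Double(table:Float[], stepsPerDegree:Int, samples:Int)
	Local worst:Double = 0
	For Local k:Int = 1 To samples
		Local a:Double = Rnd(0, 360)	' random angle in [0, 360)
		Local idx:Int = Int(a * stepsPerDegree + 0.5)	' round to the nearest entry
		Local e:Double = Abs(table[idx] - Sin(a))
		If e > worst Then worst = e
	Next
	Return worst
End Function

Print "1 entry per degree:      " + MaxError(BuildTable(1), 1, 100000)
Print "1000 entries per degree: " + MaxError(BuildTable(1000), 1000, 100000)
```

With nearest-entry lookup the worst case is about half a table step expressed in radians, so roughly 0.009 for the one-degree table and about a thousandth of that for the 1/1000-degree table.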