Heavy use of Cos, Sin
| ||
Hi, I was converting some code from B3D and found it to be slower in BMax. I think it is because Cos and Sin return Doubles. I've replaced them with Float lookup arrays for Cos and Sin and get a 2x speed increase. I should say I'm making very heavy use of them, though. Jim |
| ||
What are your computer specs? I have noticed that it makes some difference on low-end computers but is hardly worth the time on fast, newer computers. |
| ||
Using i7 920 Jim |
| ||
I did the same in my game when trying to optimize my particle engine and, believe it or not, the Cos/Sin arrays were slower than calling the functions directly. I do remember getting a huge performance gain from this back in BVM Script or Blitz3D, though. I chose to drop that solution, by the way. |
| ||
    Local i,time:Int
    Local value:Float
    Global A_Cos#[ 1024+1 ]

    For i=0 To 1024
        A_Cos#[i] = Cos( (i/1024.0)*360.0 )
    Next

    time = MilliSecs()
    For i=1 To 1000000
        value# = Cos( Rnd(0,360) )
    Next
    Print "Calculated :" + (MilliSecs()-time)

    time = MilliSecs()
    For i=1 To 1000000
        value# = A_Cos#[ (Rnd(0,360)/360.0) * 1024 ]
    Next
    Print "PreCalclated :" + (MilliSecs()-time)

    WaitKey()
    End

I get 185 vs 145 = ~80%. I think it depends on the program you are writing, because I'm getting nearer 60% in my app. Jim |
| ||
Depending on the required precision, you can do this:

BMax code:

    Import "myfuncs.c"
    Extern "c"
        Function fsin:Float(value:Float)
    End Extern

C code (myfuncs.c):

    #include <math.h>

    /* Sine of an angle in degrees; the constant is pi/180 (degrees to radians). */
    float fsin(float value) {
        return sin(value * 0.0174532925199432957692369076848861);
    }

This gives a small speed improvement. Not much, but a bit. |
| ||
Tried that, ziggy, but it was the same as normally calculated. I got my times wrong in an earlier post; they should be 89 vs 52, which is the 60% I got earlier. Jim |
| ||
@JBR: Are you aware that 90% of the cost of your calculation is because of the use of Rnd? At least on my computer it's like this. I see a small difference using this code:

    Import "myfuncs.c"
    Extern "c"
        Function fsin:Float(value:Float)
    End Extern

    Const Iterations:Int = 20000000
    Local i, time:Int, value:Float
    Global A_Cos:Float[1024 + 1]

    For i=0 To 1024
        A_Cos#[i] = Cos( (i/1024.0)*360.0 )
    Next

    Print "starting test..."

    ' External C function
    time = MilliSecs()
    For i = 1 To Iterations
        fsin(i)
    Next
    Print "F-calculated:" + (MilliSecs() - time)

    ' Built-in Sin
    time = MilliSecs()
    For i = 1 To Iterations
        Sin(i)
    Next
    Print "Calculated :" + (MilliSecs()-time)

    ' Lookup table
    time = MilliSecs()
    For i = 1 To Iterations
        value = A_Cos[((i Mod 360) / 360.0) * 1024]
    Next
    Print "PreCalclated :" + (MilliSecs()-time)

    Input(">")
    End

Precalculated is the fastest but the least accurate, while the standard call is the slowest but the most exact. As expected! |
| ||
I admit it is noticeably faster. Although, if the Rnd function and the division are taken out of the calculations, it makes a much bigger difference.

    Local i,time:Int
    Local value:Float
    Global A_Cos#[ 1024 ]
    Global ang#[1024]

    For i=0 Until 1024
        A_Cos#[i] = Cos( (i/1024.0)*360.0 )
        ang[i] = Rnd(0,360)/360.0
    Next

    time = MilliSecs()
    For i=0 Until 1000000
        value# = Cos( ang[i & 1023] ) '1023 in binary is = 01111111111
    Next
    Print "Calculated :" + (MilliSecs()-time)

    time = MilliSecs()
    For i=0 Until 1000000
        value# = A_Cos#[ ang[i & 1023] * 1024 ] '1023 in binary is = 01111111111
    Next
    Print "PreCalclated :" + (MilliSecs()-time)

Calculated :50
PreCalclated :24

*************EDITED*********** |
| ||
Arrays should be faster if they fit your needs with no interpolation between stored values or clamping angles to 0..360. The real mystery is the speed difference between BlitzMax and Blitz3D. They should do Sin and Cos at just about the same speed because these are calculated to the full 80-bit internal precision in any case. |
| ||
"Arrays should be faster if they fit your needs with no interpolation between stored values or clamping angles to 0..360." Agreed. It becomes clear when dealing with negative angles, since the values have to be adjusted to fit the array. Those extra calculations might eat up the difference. |
| ||
When I was experimenting with improving trigonometry performance, I found that avoiding unnecessary casting (Double -> Float etc.) had a larger impact on performance than using pre-populated arrays. I believe modern 64-bit processors, even when they're running a 32-bit OS, are equally fast dealing with doubles or floats, as the CPU registers are 64-bit. Ever since this "enlightenment" I've mostly been using doubles/longs for performance when the added memory overhead is not an issue. So, using doubles with functions that return doubles is faster than using floats, and vice versa. |