Heavy use of Cos, Sin


JBR(Posted 2009) [#1]
Hi,

I was converting some code from B3D and found it to be slower in BMax.

I think it is because Cos & Sin return Doubles.

I've replaced them with arrays of Floats for Cos & Sin and get a 2x speed increase.

I should say I'm making very heavy use of them, though.

Jim


Jesse(Posted 2009) [#2]
What are your computer specs? I have noticed that it makes some difference on low-end computers but is hardly worth the time on fast, newer computers.


JBR(Posted 2009) [#3]
Using i7 920

Jim


Armitage 1982(Posted 2009) [#4]
I did the same in my game when trying to optimize my particle engine and, believe it or not, the Cos / Sin arrays were slower than actually calling the functions...
But I remember getting a huge performance gain back in BVM Script or Blitz3D.
I chose to ignore that solution, by the way.


JBR(Posted 2009) [#5]
Local i,time:Int
Local value:Float

' Lookup table: 1025 entries covering 0..360 degrees
Global A_Cos#[ 1024+1 ]
For i=0 To 1024
	A_Cos#[i] = Cos( (i/1024.0)*360.0 )
Next

' Time 1,000,000 direct calls to Cos
time = MilliSecs()
For i=1 To 1000000
	value# = Cos( Rnd(0,360) )
Next
Print "Calculated :" + (MilliSecs()-time)

' Time 1,000,000 table lookups (the index is truncated to an integer)
time = MilliSecs()
For i=1 To 1000000
	value# = A_Cos#[ (Rnd(0,360)/360.0) * 1024 ]
Next
Print "Precalculated :" + (MilliSecs()-time)

WaitKey()
End



I get 185 ms vs 145 ms, so the table version takes ~80% of the calculated time.
I think it depends on the program you are writing, because I'm getting nearer 60% in my app.

Jim


ziggy(Posted 2009) [#6]
Depending on the required precision, you can do this:
BMax code:
Import "myfuncs.c"
Extern "c"
	Function fsin:Float(value:Float)
End Extern


C code (myfuncs.c):
#include <math.h>

/* Sine of an angle given in degrees; the constant is pi/180. */
float fsin(float value) {
	return sin(value * 0.0174532925199432957692369076848861);
}


This will give a small speed improvement. Not much, but a bit.


JBR(Posted 2009) [#7]
Tried that, ziggy, but it was the same as the normal calculation.

I got my times wrong in an earlier post: they should be 89 vs 52, which is the 60% I mentioned earlier.

Jim


ziggy(Posted 2009) [#8]
@JBR: Are you aware that 90% of the cost of your calculation comes from the use of Rnd? At least on my computer it's like this. I see a small difference using this code:
Import "myfuncs.c"
Extern "c"
	Function fsin:Float(value:Float)
End Extern
Const Iterations:Int = 20000000
Local i, time:Int, value:Float
	
Global A_Cos:Float[1024 + 1]
For i=0 To 1024
	A_Cos#[i] = Cos( (i/1024.0)*360.0 )
Next
Print "starting test..."

time = MilliSecs()
For i = 1 To Iterations
	fsin(i)
Next
Print "F-calculated:" + (MilliSecs() - time)

time = MilliSecs()
For i = 1 To Iterations
	Sin(i)
Next
Print "Calculated :" + (MilliSecs()-time)

time = MilliSecs()
For i = 1 To Iterations
	value = A_Cos[((i Mod 360) / 360.0) * 1024]
Next
Print "PreCalclated :" + (MilliSecs()-time)

Input(">")
End

Precalculated is the fastest but not very accurate, while the standard call is the slowest but the most exact. As expected!


Jesse(Posted 2009) [#9]
I admit it is noticeably faster. However, if the Rnd call and the division are taken out of the timed loops, it makes a much bigger difference.
Local i,time:Int
Local value:Float
	
Global A_Cos#[ 1024 ]
Global ang#[1024]
For i=0 Until 1024
	A_Cos#[i] = Cos( (i/1024.0)*360.0 )
	ang[i] = Rnd(0,360)/360.0
Next

time = MilliSecs()
For i=0 Until 1000000
	value# = Cos( ang[i & 1023] ) '1023 in binary is = 01111111111
Next
Print "Calculated :" + (MilliSecs()-time)

time = MilliSecs()
For i=0 Until 1000000
	value# = A_Cos#[ ang[i & 1023] * 1024 ] '1023 in binary is = 01111111111 
Next
Print "PreCalclated :" + (MilliSecs()-time)





Calculated :50
PreCalclated :24

*************EDITED***********


Floyd(Posted 2009) [#10]
Arrays should be faster if they fit your needs with no interpolation between stored values or clamping angles to 0..360.

The real mystery is the speed difference between BlitzMax and Blitz3D. They should do Sin and Cos at just about the same speed because these are calculated to the full 80-bit internal precision in any case.


Jesse(Posted 2009) [#11]
"Arrays should be faster if they fit your needs with no interpolation between stored values or clamping angles to 0..360."

Agreed. It becomes clear when dealing with negative angles, since values have to be adjusted to fit the array. Those extra calculations might eat up the difference.


Vilu(Posted 2009) [#12]
When I was experimenting with improving trigonometry performance, I found that avoiding unnecessary casting (Double -> Float etc) had a larger impact on performance than using pre-populated arrays.

I believe modern 64-bit processors, even when they're running a 32-bit OS, are equally fast when dealing with doubles or floats, as the CPU registers are 64-bit. Ever since this "enlightenment" I've mostly been using doubles/longs for performance when the added memory overhead is not an issue.

So, using doubles for functions that return doubles is faster compared to using floats, and vice versa.