crucial math code

Najdorf(Posted 2005) [#1]
OK, I have a function that is called many, many times every frame and is the bottleneck of my whole program. In particular there is this

atan2(.,.)

call that is the real bottleneck.

Is there a way I can make it faster (I dunno, make a C call or somethin'?)

Thx,

Matteo


TartanTangerine (was Indiepath)(Posted 2005) [#2]
Pre-calculate the results and store them in an array. You can then look them up like newAtan2[dy,dx]. But remember that array indices have to be integers, so you cannot do something like newAtan2[0.5,0.4]; you will have to scale the inputs by some multiplication factor first.


TartanTangerine (was Indiepath)(Posted 2005) [#3]
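Here is some code:
'Put this in the global section of your code; the table only needs to be filled once.
Global nATan2:Double[200,200]
Local a:Double
Local b:Double
For a = 0 To 200
  For b = 0 To 200
    'Table index 0..200 corresponds to an input value of -1..1
    nATan2[Int(a),Int(b)] = ATan2((a/100)-1,(b/100)-1)
  Next
Next

'Return type is Double; without it the function would return an Int and truncate the angle.
Function newATan2:Double(dy:Double,dx:Double)
  'dx and dy are expected in the range -1..1 and are scaled to table indices
  dx = Abs((dx + 1) * 100)
  If dx > 200 Then dx = 200
  dy = Abs((dy + 1) * 100)
  If dy > 200 Then dy = 200
  Return nATan2[Int(dy),Int(dx)]
End Function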



Najdorf(Posted 2005) [#4]
whoa, crazy trick... might actually work...


TartanTangerine (was Indiepath)(Posted 2005) [#5]
Sure does work for me. Not that accurate though since there are limited sample points.


levent(Posted 2005) [#6]
Hi,

Make sure that the loops terminate at 199: the global array is declared with 200 elements per dimension, so the valid indices are 0 to 199. The clamp in newATan2 needs to be 199 as well.

Levent


TartanTangerine (was Indiepath)(Posted 2005) [#7]
Oh yeah, I pulled this from Blitz3d, forgot that.


Najdorf(Posted 2005) [#8]
Unfortunately I can't use this after all. It's too imprecise.

Any other ideas?


Andy(Posted 2005) [#9]
Sorry!


Koriolis(Posted 2005) [#10]
Ahem, we are in a *BlitzMax* forum ... ;)


TartanTangerine (was Indiepath)(Posted 2005) [#11]
Make it more precise: instead of 200 samples, why not 2000 samples?


skidracer(Posted 2005) [#12]
There is an ATan2 optimization featured in the very first Blitz User Magazine (for the Amiga computer); there is a 2D rotating shooter at the end of issue 1. I doubt an integer 68000 version of atan2 is going to be relevant, but the approximation may be of interest:

http://www.nitrologic.net/archives/amiblitz

You are using floats in your program? And how are you benchmarking these bottlenecks you talk of?


TartanTangerine (was Indiepath)(Posted 2005) [#13]
Here is the ATan2 function in ASM (in PureBasic); I *think* you can import this. Since ATan2 is built into the FPU this should be real quick.

This returns radians, not degrees.
Procedure.f atan2f(y.f,x.f)
  !fld dword[esp]
  !fld dword[esp+4]
  !fpatan
EndProcedure



ImaginaryHuman(Posted 2005) [#14]
What if you keep your lookup table for the Atan2, but when you need a value that falls somewhere between two table entries, you use weighted interpolation between them to get the approximate result?

ie if your table says:

The Atan2 of 10 is 100
The Atan2 of 11 is 103

(just for the sake of this example)

and you wanted to know the Atan2 of 10.43, you would do something like:

Variable=(Atan2Lookup(10,10)*(1-0.43))+(Atan2Lookup(11,11)*0.43)

????? Or is that still slow?

(I know it would not be accurate since Atan2 doesn't increase linearly)


Najdorf(Posted 2005) [#15]
Thx, that's neat. I'll give it a try.
Also, in truth atan2 only needs the ratio dy/dx as argument (plus the signs of dy and dx to pick the quadrant).


Robert(Posted 2005) [#16]
The ATan2 function is a simple wrapper around the GNU standard C library's atan2 function, which I should imagine is pretty heavily optimised.

As skid asked, how are you benchmarking this?


Najdorf(Posted 2005) [#17]
I know it's the bottleneck because I have to call it many, many times each frame, and when I don't call it the maximum framerate increases a lot.


Dreamora(Posted 2005) [#18]
Do you have to call it that often? Would there be a visible difference if you only called it every second frame, or on a timer at 30 or 60 times per second?


Najdorf(Posted 2005) [#19]
I have to call it every frame about 12000 times.


Robert(Posted 2005) [#20]
I did a quick test here and 12000 calls to ATan2 only take 3ms or so (i.e. you could do about 330x12000 calls per second).

When you remove the calls to ATan2, you must be substituting some default values in their place. Perhaps this is what is causing the performance improvements?