Int vs Float

Monkey Targets Forums/Android/Int vs Float

Grover(Posted 2012) [#1]
This is probably not Android specific, but I was wondering if anyone had a good understanding of the underlying benefits of the two on an Android platform. Many hardware architectures do perform many times better using Int. However, does the VM take care of this for us? Is there a cost/benefit?

I'm going to run some trials myself over the next few days just to determine whether the impacts are large or small, but I'd be keen to hear from people's own experience on this issue.

I think (based on previous gamedev GBA/DS/Nokia/SE experience) that there is probably good value in a simple Int math lib or similar to help increase general Android performance (I'd expect you would get an iOS boost too). FPUs are generally power-consuming, and slow on most embedded systems (the GBA and DS didn't even have one!).
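A minimal sketch of the kind of Int math lib being described, assuming Q16.16 fixed point (16 integer bits, 16 fractional bits, so 1.0 is stored as 65536); the names and layout are illustrative, not from any actual GBA/DS library:

```c
#include <stdint.h>

/* Hypothetical Q16.16 fixed-point helpers: 1.0 == 1 << 16. */
typedef int32_t fixed16;

#define FIX_ONE (1 << 16)

static inline fixed16 fix_from_int(int32_t i) { return i << 16; }
static inline fixed16 fix_from_float(float f) { return (fixed16)(f * FIX_ONE); }
static inline float   fix_to_float(fixed16 x) { return (float)x / FIX_ONE; }

/* Add and subtract are plain integer ops; multiply and divide need a
   widening intermediate and a 16-bit shift to keep the binary point
   in place. */
static inline fixed16 fix_mul(fixed16 a, fixed16 b)
{
    return (fixed16)(((int64_t)a * b) >> 16);
}

static inline fixed16 fix_div(fixed16 a, fixed16 b)
{
    return (fixed16)(((int64_t)a << 16) / b);
}
```

Everything inside the physics or math hot loop then stays in integer registers; floats only appear at the edges (loading assets, talking to the renderer).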

Thanks,
Dave


Xaron(Posted 2012) [#2]
There are no such differences any more on modern hardware like smartphones.


Grover(Posted 2012) [#3]
Really? Even PCs are still around 40x faster at Int math than float - unless you're doing vectorized math (i.e. four lots of 32-bit/64-bit floats, where SIMD is going to improve things), and even then Int is only around 10x faster.

I'd be very surprised if the ARM chips had large, capable FPUs on board at the high clocks they are running. As I said, even recent NDS game handhelds don't have _any_ FPU at all.

Hence I suspected there would be benefit in this. Will do some digging.
<edit>I found this: http://pandorawiki.org/Floating_Point_Optimization
It seems to confirm my suspicions (many costs involved). I'll run some tests - I have an old GBA ARM Int math lib here I'll try out, to see if there is much benefit.

Cheers,
Dave


skid(Posted 2012) [#4]
From my experience there have been three basic generations of ARM chips: no FPU, FPU as a coprocessor, and FPU integrated.

Older Apple hardware prior to the iPhone 3GS falls into the middle category and will die badly if you compile float code in Thumb mode, but performs excellently in ARM mode, where coprocessor calls don't get hit with a huge cycle penalty.
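As an illustrative example (not from the thread), a GCC cross-compile line that forces ARM rather than Thumb code generation for exactly this reason; the toolchain name and `-mfpu` value are assumptions and vary per device:

```shell
# Force ARM (not Thumb) code so VFP coprocessor calls avoid the
# Thumb-mode cycle penalty described above. Hypothetical toolchain name;
# pick -mfpu to match the actual chip.
arm-linux-gnueabi-gcc -O2 -marm -mfpu=vfp -mfloat-abi=softfp -c physics.c
```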

Cheaper Android kit with specs below 600 MHz would, I guess, fall into the first category: emulated floating point and no integer divide instruction.

Float usage on the GPU is even muddier: on some hardware geometry buffers work faster using fixed-point formats while other hardware prefers float, and Android itself has introduced known issues with floating-point buffers, so there are no clear rules here either.
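As a sketch of the fixed-point geometry-buffer idea: OpenGL ES 1.x accepts GL_FIXED (16.16) vertex data, so a float mesh can be converted once at load time and submitted as integers. The `GLfixed_t` typedef below is a stand-in for the real `GLfixed` from `<GLES/gl.h>`, which isn't assumed available here:

```c
#include <stdint.h>
#include <stddef.h>

typedef int32_t GLfixed_t;  /* stand-in for GLfixed (16.16) */

/* 1.0f becomes 65536 in 16.16 fixed point. */
static GLfixed_t float_to_glfixed(float f)
{
    return (GLfixed_t)(f * 65536.0f);
}

/* Convert a float vertex array to GL_FIXED once at load time. */
static void convert_vertices(const float *src, GLfixed_t *dst, size_t count)
{
    for (size_t i = 0; i < count; ++i)
        dst[i] = float_to_glfixed(src[i]);
    /* then, on a real GLES 1.x build:
       glVertexPointer(3, GL_FIXED, 0, dst); */
}
```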

For this reason I always assumed it would be best if Monkey supported the Canvas interface for graphics, where its implementation could be expected to be tuned to perform on whatever the native implementation of the GPU is. Fat chance. Sadly, Canvas was only GPU-accelerated in recent Android releases, and only in certain situations.

Even PC's are still around 40x faster in Int math than float


That sounds like an extreme corner case to me.


Grover(Posted 2012) [#5]
Having done games and sims for the last twenty years, from fixed platforms to PC, there are _a lot_ of things people assume without actually testing. Another classic is the use of the STL libs - these are _very_ detrimental to performance, but they are convenient. And PCs are still slower on the float side vs Int for general-purpose use (the GPU is quite a different story, but hard to use). It's much like the ARM case above, where 20 cycles of setup for a float operation can be a pretty drastic cost when switching between Int and float math. The PS3, Xbox and others all suffer similar conditions. Most development works in "streaming data blocks" to reduce context switching and so on, which helps the FPUs as a whole.

From previous games and simulators written on the PC, it has always been the case that we had our own "Float <-> Int" conversion at the very least (the naive cast can cost pipeline flushes, and potentially hundreds of cycles). Also, changing the math-critical portions of an app to fixed point _did_ previously result in a 40x increase in performance (this was profiled with VTune). That was about two years ago, and I doubt compilers and CPUs have dramatically improved their internals since.
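One classic example of such a custom Float <-> Int conversion (an illustration, not necessarily the one used here) is the "magic number" trick, which sidesteps the x87 rounding-mode switch and pipeline flush that a plain `(int)` cast used to trigger. It assumes IEEE 754 doubles, little-endian layout, and the default round-to-nearest mode:

```c
#include <stdint.h>
#include <string.h>

/* Adding 1.5 * 2^52 forces the FPU to round x and park the resulting
   integer in the low mantissa bits of the double; the 1.5 factor makes
   negative values come out correctly in two's complement. */
static int32_t fast_ftoi(double x)
{
    double magic = x + 6755399441055744.0;   /* 1.5 * 2^52 */
    int32_t result;
    memcpy(&result, &magic, sizeof result);  /* low 32 bits hold the int */
    return result;
}
```

Note this rounds to nearest rather than truncating toward zero like a C cast, so it is not a drop-in replacement everywhere.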

What is the biggest change, and most likely to push away all this sort of discussion, is the implementation of JIT hardware modules in CPUs. The above-mentioned ARM has one for Java built in, so that chip is far more favourable to just running pure bytecode than even assembled code. This is where MS is going with its .NET series - Intel hardware running JIT .NET bytecode. This will push everyone up a level of API :) about time...

I currently have some test apps almost complete, so I will know the answers to these questions soon anyway. I'll do a secondary C++ test as well to see where that currently stands on modern hardware. My suspicion is that for the baseline target (low-performing phones) it will improve performance substantially; on the more modern ARM chips the benefit will probably be limited.

The test app is going to be float/math intensive - Box2DLite - so there should be good data from it.

Cheers,
Dave


skid(Posted 2012) [#6]
I'll take an int version of Box2D, yes please.