v41 OpenGL Performance

DGuy(Posted 2011) [#1]
(As this is not a bug, I felt posting it here also, and not just a thread in the bug forum, would be proper ...)

With v41, I'm seeing:

-> *LOSS* of 3-9 FPS using a speedtest from here: http://www.monkeycoder.co.nz/Community/post.php?topic=869&post=7366

-> *LOSS* of 2-3 FPS in my current in-development app

As Monkey switched internally from FloatBuffers to IntBuffers for storage of OGL vertex-related information, the performance difference is probably because some OGL hardware handles float operations faster than integer operations, and vice versa.

Some have seen *GAINS* in fps with the new "int" version ...

I'll be sticking with the pre-v41 "float" version of 'mojo.android.java' for now as it's faster on the primary android device I'm targeting (i.e. NookColor).

Hmmm, I wonder if a compile time flag is in order ...


AdamRedwoods(Posted 2011) [#2]
From the Android dev source, section "Use Floating-Point Judiciously":
http://developer.android.com/guide/practices/design/performance.html


therevills(Posted 2011) [#3]
As Mark has stated in that thread, the changes to the buffers are the correct way to do it...

Hopefully Mark finds a way around this - JNI?


DGuy(Posted 2011) [#4]
Floats versus Ints, when it comes to buffers that get accessed by OGL, seems to not be so cut-and-dried …

Referring to the link Mark provides in the v41 release notes, it seems that while there were performance issues with FloatBuffer, they have been remedied in newer versions of Android, such that performance is "significantly faster across the board", even faster than the integer-based optimizations presented. Since the link talks very specifically about using FloatBuffers with OGL, it seems that, when it comes to Android/OGL at least, floats may be more efficient.

Who knows, maybe the NookColor engineers already implemented some FloatBuffer-related enhancements in their code base, which may be why I'm seeing better performance with the pre-v41 usage of FloatBuffers compared to the current usage of IntBuffers.
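
For anyone following along, here is a minimal sketch of what an integer-based workaround can look like. It assumes the trick is to write each float's raw bit pattern through an IntBuffer view of the same direct ByteBuffer; that is an assumption about the technique, not a copy of the actual mojo code, and the VertexStore class and its method names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import java.nio.IntBuffer;

// Hypothetical helper (not part of mojo): one block of native memory,
// seen through both a FloatBuffer view and an IntBuffer view.
final class VertexStore {
    private final IntBuffer ints;
    private final FloatBuffer floats;

    VertexStore(int maxFloats) {
        // 4 bytes per float; native byte order is what OpenGL expects.
        ByteBuffer bytes = ByteBuffer.allocateDirect(maxFloats * 4)
                                     .order(ByteOrder.nativeOrder());
        ints = bytes.asIntBuffer();
        floats = bytes.asFloatBuffer();
    }

    // The pre-v41 style: store the float directly.
    void putViaFloatBuffer(int index, float value) {
        floats.put(index, value);
    }

    // The integer style: store the float's raw bits as an int.
    // The bytes in memory end up identical to the float version.
    void putViaIntBuffer(int index, float value) {
        ints.put(index, Float.floatToRawIntBits(value));
    }

    float read(int index) {
        return floats.get(index);
    }
}
```

Under that assumption the memory handed to OGL is the same either way, so any speed difference would come from the Java-side put calls rather than from how the GPU treats the data.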

Just to confuse things more, when I was testing my own OGL code used in my first iOS app, I found using GLshort for all vertex/UV buffers was faster than using floats … <shrug>


DGuy(Posted 2011) [#5]
Anyone interested in a slightly faster* (+0.5-1 fps) implementation of Android Mojo, look here:

<snip>

[Edit]
Moot as of Monkey v42 ...
[/Edit]


dom(Posted 2011) [#6]
If I'm allowed to join in =)

I am drawing 50 rects with texture 1 and 50 rects with texture 2.
(EDIT: all the rects get batched per texture!)
On my HTC Desire, I get 25 FPS.
Since I am using scale with draw rect, I quickly hacked in a custom DrawSurface3:

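	// Draws the source rect (srcx,srcy,srcw,srch) of 'surface' into an
	// explicit destination rect (x,y,w,h), so no scale call is needed.
	int DrawSurface3( gxtkSurface surface,float x,float y,float w,float h,int srcx,int srcy,int srcw,int srch ){

		//float w=srcw;
		//float h=srch;
		float u0=srcx*surface.uscale, u1=(srcx+srcw)*surface.uscale;
		float v0=srcy*surface.vscale, v1=(srcy+srch)*surface.vscale;

		Begin( GL10.GL_TRIANGLES,6,surface );

		// First triangle: top-left, top-right, bottom-right
		AddVertex( x,y,u0,v0 );
		AddVertex( x+w,y,u1,v0 );
		AddVertex( x+w,y+h,u1,v1 );

		// Second triangle: top-left, bottom-right, bottom-left
		AddVertex( x,y,u0,v0 );
		AddVertex( x+w,y+h,u1,v1 );
		AddVertex( x,y+h,u0,v1 );

		return 0;
	}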


I did this because I would like to draw my rects in absolute dimensions anyway.
Now I get 27 FPS. Profiling the app with DDMS (Eclipse) shows me that AddVertex consumes 98%.
Of that, java/nio/FloatToByteBufferAdapter.put (F)Ljava/nio/FloatBuffer takes 77% and
java/nio/IntToByteBufferAdapter.put (F)Ljava/nio/IntBuffer takes about 20%.


Googling a bit led me to http://www.badlogicgames.com/wiki/index.php/Direct_Bulk_FloatBuffer.put_is_slow ... Do you people think a JNI implementation could help us?
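
Short of JNI, one pure-Java option is to stop calling put once per value. Below is a minimal sketch, assuming vertices are staged in a plain int[] and copied into the direct buffer with a single bulk put per batch. VertexBatcher and its methods are hypothetical names, not mojo's AddVertex, and whether this is actually faster on a given device would need to be measured.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;

// Hypothetical batcher (not mojo code): collects vertex data in a Java array
// and copies it to the direct buffer in one call instead of one put per float.
final class VertexBatcher {
    private final int[] staging;    // cheap to write from Java
    private final IntBuffer target; // direct buffer that would be handed to OGL
    private int count;

    VertexBatcher(int maxFloats) {
        staging = new int[maxFloats];
        target = ByteBuffer.allocateDirect(maxFloats * 4)
                           .order(ByteOrder.nativeOrder())
                           .asIntBuffer();
    }

    // Stores position (x,y) and texture coords (u,v) as raw float bits.
    // The caller must flush before the staging array is full.
    void addVertex(float x, float y, float u, float v) {
        staging[count++] = Float.floatToRawIntBits(x);
        staging[count++] = Float.floatToRawIntBits(y);
        staging[count++] = Float.floatToRawIntBits(u);
        staging[count++] = Float.floatToRawIntBits(v);
    }

    // One bulk copy per batch; returns the buffer ready for reading.
    IntBuffer flush() {
        target.clear();
        target.put(staging, 0, count);
        target.flip();
        count = 0;
        return target;
    }
}
```

With the six-vertex rect above, that would mean 24 array writes followed by one put at flush time, rather than 24 separate buffer puts.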

Maybe a question @Mark: do you have any plans considering this?
I really love the idea of Monkey, but currently the performance drives me nuts :(


DGuy(Posted 2011) [#7]
One tweak I made to mojo is to make better internal use of the depth buffer to prevent over-draw: I added a "SetDepth()" function, modified the internal draw arrays to store a z-value, and toggle the "glDepthMask()" setting based upon whether or not mojo has blending on or off.

With that setup, draw as much non-alpha stuff as possible first (front-to-back), then the alpha stuff second (back-to-front), all the while specifying depth values.
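
A minimal sketch of that draw ordering, for illustration only: DrawItem and DrawOrder are made-up names rather than anything in mojo, and it assumes a smaller depth value means closer to the viewer.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical draw command: a name, a depth, and whether it needs blending.
final class DrawItem {
    final String name;
    final float depth;      // assumption: smaller = closer to the viewer
    final boolean hasAlpha;

    DrawItem(String name, float depth, boolean hasAlpha) {
        this.name = name;
        this.depth = depth;
        this.hasAlpha = hasAlpha;
    }
}

final class DrawOrder {
    // Returns opaque items front-to-back, followed by alpha items back-to-front.
    static List<DrawItem> sort(List<DrawItem> items) {
        List<DrawItem> opaque = new ArrayList<>();
        List<DrawItem> blended = new ArrayList<>();
        for (DrawItem item : items) {
            if (item.hasAlpha) {
                blended.add(item);
            } else {
                opaque.add(item);
            }
        }
        // Nearest opaque first, so the depth test can reject hidden pixels early.
        opaque.sort(Comparator.comparingDouble((DrawItem i) -> i.depth));
        // Farthest blended first, so nearer alpha items blend over farther ones.
        blended.sort(Comparator.comparingDouble((DrawItem i) -> i.depth).reversed());

        List<DrawItem> ordered = new ArrayList<>(opaque);
        ordered.addAll(blended);
        return ordered;
    }
}
```

In a real renderer the opaque pass would run with depth writes on and the blended pass with them off, which is what the "glDepthMask()" toggling is for.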

The result is about an 8+ fps increase in my current app (think a solitaire layout where all the cards have alpha and there is a lot of overlap).

It's not ideally implemented (could be faster with better sorting/batching) and only for Android ATM, but the extra FPS have put my mind at ease about performance for the moment ...


therevills(Posted 2011) [#8]
Just wait for the next version of Monkey to be released.

Mum's the word ;)


DGuy(Posted 2011) [#9]
@Therevills

Just wait for the next version of Monkey to be released.



You beta-version-testing tease: I HATE your kind! ... ;)


Grover(Posted 2011) [#10]
Hi, first-time user of Monkey (looks great!). Just wondering if the above hint from Therevills is some indication that there may be a 3D module coming? I'm very interested in what Monkey may be able to do in our simulation development world, and if 3D becomes possible across even a few of these platforms, I would be extremely excited about using it for full production.
.. I hope 3D is coming? :)


therevills(Posted 2011) [#11]
Therevills is some indication there may be a 3D module coming?


Nope, it was the new stuff in v43, which works great in some situations and worse in others...