Competition Time! (Cash Prize!) :-D

Brucey

(Posted 2014) [#1]

Hello Blitzers!

I would really (really!) like a fully-working GLMax2D/GLGraphics module which supports OpenGL 2+ and OpenGLES 2 - i.e. one which does away with glBegin() and glEnd(), instead using arrays and shaders.

OpenGLES 2 compatibility gives us direct support for rendering on Raspberry Pi, iOS, Android and Angle (the GL->D3D wrapper that is used by Chrome and Firefox on Windows to render WebGL).

Since I am rubbish at coding up such things myself, I have decided to open up a competition, in the hope that someone will help by writing one for the community :-)

In order to add a little incentive, I am offering 100 GBP (that's pounds Sterling. Actual Dollar values may vary at the time of conversion) to the successful winner, to be paid via PayPal on completion and validation of the project.

The resulting work will be open sourced under the zlib/libpng license for everyone to use as they wish.

I know that some of you in the community think writing such a thing is very, very easy, but you know, it's all very well saying "yeah, I could knock one together over the weekend", without actually knocking one together over the weekend.

Words are just vapour-ware after all.

You can collaborate if you like and I can divide the prize amount between you fairly if you wish.

So, in summary, the community would like :

* A new set of compatible modules (i.e. with the same exposed APIs) as BRL.GLMax2D and BRL.GLGraphics, but using much more modern OpenGL APIs
* It should be functionally compatible with OpenGL 2+ and OpenGLES 2.
* zlib licensed
* Running on the 3 core platforms (Windows, OS X, Linux) - If you only have one to build on, I can test on all three for you.
* Something as efficient as the current code would be nice. Something faster would of course be better ;-)

The competition will run to the end of December 2014, or before if someone provides a completed project in that time. After the expiry date, I will sadly have to revoke the prize and feel lost and empty.

Feel free to post questions, comments, ideas, etc. here (or via email if you wish; the address is in my profile).

Thanks for everyone's support over the years.

Go Community!!

LT

(Posted 2014) [#2]

Very nice of you to offer a cash reward, but I thought you already had something very close to this..?

Brucey

(Posted 2014) [#3]

I thought you already had something very close to this..?

No, it's not quite there, and the stuff I was hacking was somewhat messy.
Feel free to take what's already been done and get it fully working.

I'm just after a final, working product.

LT

(Posted 2014) [#4]

Okay, but it would help to know what specifically is missing. The only part I'm aware of is that textures do not work properly with multi-frame images.

JoshK

(Posted 2014) [#5]

So you need this, working with OpenGLES 2?:
http://www.blitzmax.com/bmdocs/command_list_2d_cat.php?show=Graphics%20-%20Max2D

The hard part is actually initializing an OpenGL context on each OS. The existing GLMax2D will work with OpenGL ES with just a few basic shaders and some small changes.

On Android, for example, I had a huge Java code file with all these C callbacks. I have no idea how you would get something like that running with your setup.

LT

(Posted 2014) [#6]

What's wrong with making the SDL context the default?

Brucey

(Posted 2014) [#7]

The hard part is actually initializing an OpenGL context on each OS.

SDL can handle the context management on the other platforms.

LT

(Posted 2014) [#8]

Does anything need to change in GLGraphics? I can go through the list of commands in GL2Max2D and make sure they all work.

JoshK

(Posted 2014) [#9]

OpenGL ES 2.0 always uses shaders, and I don't think it has fixed-function stuff at all. It's easiest to implement an OpenGL 2.1 renderer using the same approach, and then just make a few small changes for ES. This can be done by using some preprocessor macros in the shaders so the same shader can be used for either.

Debugging OpenGLES is painful, so it's best to test with GL on a PC and then add some if statements for the bits that act differently on ES.
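The shared-shader idea above could be sketched roughly as follows. This is only an illustration, not code from any posted module: `SharedFShaderSource` and its `es` parameter are hypothetical names. It relies on two facts: GLSL ES compilers predefine the `GL_ES` macro, and the `#version` line has to precede every other directive, so the version line is picked by the host code while the rest of the source is shared.

```blitzmax
SuperStrict

Framework brl.StandardIO

' Hypothetical helper (not part of GL2Max2D): builds one fragment shader
' source that both desktop GL 2.1 and OpenGL ES 2 should accept.
Function SharedFShaderSource:String( es:Int )

	Local str:String = ""

	' #version must come before any other directive, so it cannot be
	' wrapped in #ifdef; the host code picks it instead.
	If es Then
		str :+ "#version 100~n"
	Else
		str :+ "#version 120~n"
	EndIf

	' GL_ES is predefined by GLSL ES compilers only, so desktop GL skips
	' the precision statement.
	str :+ "#ifdef GL_ES~n"
	str :+ "precision mediump float;~n"
	str :+ "#endif~n"
	str :+ "varying vec4 v4_col;~n"
	str :+ "void main(void) {~n"
	str :+ "	gl_FragColor=v4_col;~n"
	str :+ "}~n"

	Return str

End Function

' Print the desktop variant for inspection
Print SharedFShaderSource( False )
```

With this layout only the first line differs between the two targets, so most shader debugging can happen on the desktop.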

LT

(Posted 2014) [#10]

This is Brucey's GL2Max2D thread. The new module is already using shaders, but it's incomplete.

http://www.blitzmax.com/Community/posts.php?topic=103193

Besides using shaders instead of fixed function and arrays instead of glBegin and glEnd, I'm not sure of all the requirements.

JoshK

(Posted 2014) [#11]

I will do it because GTKMaxGUI saved my ass, and I use a lot of your modules, and MaxIDE 64 for Linux is really nice:
-Timeframe is probably ~3 months.
-I won't deal with SDL, context initialization, or buffer swapping, as these vary from platform to platform.
-I will only implement OpenGL 2.1, the right way, and then make changes for OpenGL ES based on what I think needs to be done. No testing on OpenGLES will be performed.
-The stuff I do will be fairly forward-compatible, and it will all use the fast-track path, but maintain maximum flexibility. No texture atlases, but it will use VBOs.

I already have this stuff written in Leadwerks 3.0 so it's not hard. Does your compiler support IncBin? I'd prefer to pack those shaders into the build, since they are pretty simple.

Wiebo

(Posted 2014) [#12]

If I had any clue about OpenGL I would help, but alas!!

Derron

(Posted 2014) [#13]

@ Others:

Do not stop tinkering around with OGL/EGL even if JoshK wrote he would "do it". It is a competition, and it would not be that nice for BMX NG if people stopped coding just because one person was the first to write "I'll do it".

@JoshK
If you are thinking of doing it regardless of the money, I would enjoy seeing that code in a GitHub project, just to watch it evolve (and maybe people would be able to assist with things, writing tests or so).
I know you might be afraid that "people could steal my work", but if you are doing it for the "thanks", that should not be a problem.

Of course the above only holds on the assumption that you are doing it for the thanks, not for the 100 GBP (I doubt that the money is the reason to do it).

bye
Ron

LT

(Posted 2014) [#14]

I got started on it yesterday, but not for the money. I'm not interested in duplicating effort, but three months seems like a long time for this. The existing module is pretty close (already 2.1, afaik) and just needs some clean up work.

zoqfotpik

(Posted 2014) [#15]

As is mentioned elsewhere I would be very interested in a hook for a postprocessing pixel shader. I think this is one of those things that newer games are doing (like those blurry watercolor looking backgrounds, though that isn't really postprocessing.)

I don't just want to throw out a feature wish that isn't really directly related to the core issue but it is at least tangentially related.

Pingus

(Posted 2014) [#16]

I hardly understand what you guys are doing here, but if the aim is to port BlitzMax code to run on the Android platform, I think it is worth quite a few bucks. Rather than a 'contest', why not set up a Kickstarter (or similar) whose funds would be redistributed among the people who worked on it?

LT

(Posted 2014) [#17]

Pingus, it's just one part of that, not worthy of a Kickstarter by itself.

Pingus

(Posted 2014) [#18]

OK, but wouldn't the whole thing be worth a Kickstarter? The sub-parts would then be financially covered as well.

Kryzon

(Posted 2014) [#19]

Remember the PayPal taxes on those 100 GBP.

LT

(Posted 2014) [#20]

Should this be a drop in replacement for GLMax2D or should there be a flag to use the old GLMax2D functionality? I have something that is mostly working - Digesteroids and Breakout are 100% - just have to finish the DrawOval and DrawPoly functions and clean up a little more.

Derron

(Posted 2014) [#21]

If it is a "drop in replacement" there is no need for a "flag" - just use either the new or the old module to have a specific behaviour.

Can't wait to test it.

bye
Ron

Brucey

(Posted 2014) [#22]

It needs to be a drop-in replacement. So when I DrawText or whatever in my app, it works as before, except of course using the more modern stuff.

Obviously one could write something much more efficient if they were to drop direct compatibility with Max2D - forcing the user to do things in certain ways - and in the future I think we should aim for such a module. But for now, I think it's important that this can be used as a direct replacement.

LT

(Posted 2014) [#23]

Technically, it's a drop-in replacement either way. What I meant was that I could have a Global called GL_USE_FIXED or something that would cause it to use the old functionality. That way, there would be no need for two separate modules.

However, it's easier if I don't bother, so I won't. ;)

if they were to drop direct compatibility with Max2D

Yeah, I have a separate module for my engine that does things differently. I won't be using Max2D at all, but anything I can do to help this process along... :)

LT

(Posted 2014) [#24]

Speaking of efficiency...

I'm not actually sure if it's better to batch the primitives, which requires that the vertex colors be sent to the shader, or to simply pass the color and alpha and draw each primitive separately. The former is the current setup, but the latter was done in the original GLMax2D and it allows for the use of things like GL_TRIANGLE_STRIP and GL_TRIANGLE_FAN.

Derron

(Posted 2014) [#25]

Best is to aim for 100% compatibility with BlitzMax Max2D.

Efficiency improvements could be done later (or using an extending module - MaxExt2D).

bye
Ron

LT

(Posted 2014) [#26]

Compatibility is not the problem. Batching requires sending everything as plain triangles (GL_TRIANGLES) and passing the vertex colors. That's a lot of extra vertices being sent to the graphics card in exchange for fewer draw calls. It is admittedly simpler, though.
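To put rough numbers on that trade-off, here is a minimal sketch that stores quads as discrete triangles with per-vertex colors. `TQuadBatch` and `AddQuad` are hypothetical names for illustration, not part of GL2Max2D, and the sketch only fills arrays; it makes no GL calls.

```blitzmax
SuperStrict

Framework brl.StandardIO

' Hypothetical batch type for illustration only.
Type TQuadBatch

	Field pos:Float[]	' 2 floats per vertex
	Field col:Float[]	' 4 floats per vertex
	Field count:Int		' vertices stored so far

	' Appends one axis-aligned quad as two discrete triangles (6 vertices).
	Method AddQuad( x0:Float, y0:Float, x1:Float, y1:Float, r:Float, g:Float, b:Float, a:Float )

		' corners: (x0,y0) (x1,y0) (x1,y1), then (x0,y0) (x1,y1) (x0,y1)
		Local xs:Float[] = [x0, x1, x1, x0, x1, x0]
		Local ys:Float[] = [y0, y0, y1, y0, y1, y1]

		' grow the arrays, keeping what is already stored
		pos = pos[..( count + 6 ) * 2]
		col = col[..( count + 6 ) * 4]

		For Local i:Int = 0 Until 6
			Local v:Int = count + i
			pos[v * 2 + 0] = xs[i]
			pos[v * 2 + 1] = ys[i]
			' the color is repeated for every vertex of the quad
			col[v * 4 + 0] = r
			col[v * 4 + 1] = g
			col[v * 4 + 2] = b
			col[v * 4 + 3] = a
		Next
		count :+ 6

	End Method

End Type

Local batch:TQuadBatch = New TQuadBatch
batch.AddQuad( 0, 0, 10, 10, 1, 1, 1, 1 )
batch.AddQuad( 20, 0, 30, 10, 1, 0, 0, 1 )

Print "vertices: " + batch.count
Print "floats: " + ( batch.pos.length + batch.col.length )
```

Two quads come to 12 vertices and 72 floats in one batch. Drawn separately as fans or strips with the color passed once per draw, the same quads need 8 vertices (16 position floats) plus one color per draw, but two draw calls.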

Brucey

(Posted 2014) [#27]

I'm not actually sure if it's better to batch the primitives, which requires that the vertex colors be sent to the shader, or to simply pass the color and alpha and draw each primitive separately.

I've no idea. Which is why we're here in this thread ;-)

Ideally, if the user were to make several draws of the same kind in succession, the "engine" would be able to batch those together, thereby in theory, making things work faster.
Obviously, if he/she were to draw different, incompatible things in succession, then one wouldn't gain the "batching advantage".

Whether or not it is easy to code such a thing into the "engine", I've no idea.
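One way to picture the "batch successive compatible draws" idea is a layer that only flushes when the state changes. The sketch below is hypothetical (`TBatcher` is not part of any posted module) and merely counts draw calls; the texture id stands in for any state that makes two draws incompatible.

```blitzmax
SuperStrict

Framework brl.StandardIO

' Hypothetical sketch: counts the draw calls a batching layer would issue.
Type TBatcher

	Field cur_tex:Int = -1	' texture of the pending batch, -1 = nothing pending
	Field pending:Int		' quads waiting in the current batch
	Field draw_calls:Int	' flushes issued so far

	Method Draw( tex:Int )
		' a different texture cannot share a draw call, so flush first
		If tex <> cur_tex Then Flush()
		cur_tex = tex
		pending :+ 1
	End Method

	Method Flush()
		If pending = 0 Then Return
		' a real driver would submit its vertex arrays here
		draw_calls :+ 1
		pending = 0
	End Method

End Type

Local b:TBatcher = New TBatcher

' three draws with texture 1, then alternating textures
Local order:Int[] = [1, 1, 1, 2, 1, 2]
For Local tex:Int = EachIn order
	b.Draw( tex )
Next
b.Flush()	' submit whatever is still pending, e.g. before Flip

Print "draw calls: " + b.draw_calls
```

Here six draws collapse into four draw calls: the first three share a batch, while the alternating ones each force a flush, which is the case where the "batching advantage" disappears.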

LT

(Posted 2014) [#28]

Well, the simplest is to send plain GL_TRIANGLES so every primitive is kept discrete (can't use TRIANGLE_STRIP or TRIANGLE_FAN as part of a batch). And also to send vertex colors, even for DrawImage, so that's what I'll do for the time being.

However, I've just found that this module doesn't play nicely with my engine, even though the old GLMax2D does. So I'll have to look into that before I can call it a true replacement. :(

EDIT: It's very likely the shader initialization is conflicting with my engine's. I'm not sure if this can work as a drop-in replacement without separating that part.

Derron

(Posted 2014) [#29]

@LT

Any progress? Just find it nice to read about progress, problems, success stories (you know...keeping up the excitement).

bye
Ron

LT

(Posted 2014) [#30]

Any progress?

I haven't worked on it since Saturday morning...got sidetracked with getting my engine updated to GL 2.0 (to make sure they would play nicely together). The issues turned out to be minor ones in the module - had to do a side-by-side comparison with the old GLMax2D to figure that out.

Anyway, what I have is just an update of Brucey's GL2Max2D module, but it should work as advertised and it will be cool to see Digesteroids working on mobile devices. Still have a few things to do, but I'll try to get it posted in a day or two. Incidentally, my version extends the old GLMax2D and can be switched to use the old functionality very easily. I needed that for testing and it seems to me it's not doing any harm, but it's fine with me if you want to just drop it altogether.

Derron

(Posted 2014) [#31]

Glad to hear that there is no serious showstopper lying in the middle of the todo-road.

I also cannot wait to see BlitzMax(NG) extending to other platforms (graphically).

bye
Ron

LT

(Posted 2014) [#32]

Here's the new brl.GL2Max2D. I've tested it with the Breakout and Digesteroids samples.

I'm not sure what to do with the DrawPixmap() functionality, so I've left it, for now. DrawOval() and DrawPoly() use TRIANGLE_FAN instead of batching.

Set GLMAX2D_USE_LEGACY = True if you want to use the old functionality.
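For readers wondering what a TRIANGLE_FAN version of DrawOval might look like without glBegin/glEnd, here is a rough sketch that only builds the vertex array. `OvalFan` is a hypothetical helper, not the function in the module. It reuses the segment heuristic of the legacy DrawOval and assumes a center vertex, leaving out the transform and origin for brevity.

```blitzmax
SuperStrict

Framework brl.StandardIO
Import brl.Math

' Hypothetical helper: returns x,y pairs for a GL_TRIANGLE_FAN covering the
' oval inscribed in the rectangle (x0,y0)-(x1,y1).
Function OvalFan:Float[]( x0:Float, y0:Float, x1:Float, y1:Float )

	Local xr:Float = ( x1 - x0 ) * 0.5
	Local yr:Float = ( y1 - y0 ) * 0.5

	' same heuristic as the legacy code: at least 12 segments, multiple of 4
	Local segs:Int = Int( Abs( xr ) + Abs( yr ) )
	segs = Max( segs, 12 ) &~ 3

	Local cx:Float = x0 + xr
	Local cy:Float = y0 + yr

	' center + segs rim points + the first rim point again to close the fan
	Local verts:Float[] = New Float[( segs + 2 ) * 2]
	verts[0] = cx
	verts[1] = cy
	For Local i:Int = 0 To segs
		Local th:Float = i * 360.0 / segs	' BlitzMax trig functions take degrees
		verts[( i + 1 ) * 2 + 0] = cx + Cos( th ) * xr
		verts[( i + 1 ) * 2 + 1] = cy - Sin( th ) * yr
	Next

	Return verts

End Function

Local fan:Float[] = OvalFan( 0, 0, 20, 10 )
Print "fan vertices: " + ( fan.length / 2 )
```

For the 20x10 oval above this gives 12 segments and 14 fan vertices. Such an array could be handed to glVertexAttribPointer and drawn with a single glDrawArrays( GL_TRIANGLE_FAN, ... ) call.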

**** LAST EDIT 11/10/14 ****

Strict

Rem
bbdoc: Graphics/OpenGL 2+ Max2D
about:
The OpenGL Max2D module provides an OpenGL 2+ driver for #Max2D with shader support.
Legacy fixed functionality is included for testing purposes if GLMAX2D_USE_LEGACY = True.
End Rem
Module brl.GL2Max2D

ModuleInfo "Version: 1.00"
ModuleInfo "Author: Mark Sibly, Bruce Henderson, Emil Andersson"
ModuleInfo "License: zlib/libpng"
ModuleInfo "Copyright: Blitz Research Ltd"

ModuleInfo "History: 1.00"
ModuleInfo "History: Initial version."

Import brl.Max2D
Import brl.StandardIO

?Not linuxarm
Import brl.GLGraphics
Import pub.Glew
?linuxarm
Import brl.EGLGraphics
?

Private

Const GLMAX2D_USE_LEGACY = False
Global _driver:TGLMax2DDriver

'Naughty!
Const GL_BGR = $80E0
Const GL_BGRA = $80E1
Const GL_CLAMP_TO_EDGE = $812F
Const GL_CLAMP_TO_BORDER = $812D

Global ix#, iy#, jx#, jy#
Global color4ub:Byte[4]
Global color4f#[4]

Global state_blend
Global state_boundtex
Global state_texenabled

Function BindTex( name )
	If name = state_boundtex Return
	glBindTexture( GL_TEXTURE_2D, name )
	state_boundtex = name
End Function

Function EnableTex( name )
	BindTex( name )
	If state_texenabled Return
	glEnable( GL_TEXTURE_2D )
	state_texenabled = True
End Function

Function DisableTex()
	BindTex( 0 )
	If Not state_texenabled Return
	glDisable( GL_TEXTURE_2D )
	state_texenabled = False
End Function

Function Pow2Size( n )
	Local t = 1
	While t < n
		t :* 2
	Wend
	Return t
End Function

Global dead_texs[], n_dead_texs, dead_tex_seq

'Enqueues a texture for deletion, to prevent releasing textures on the wrong thread.
'
'Not thread safe, but that's OK because all threads are stopped when TGLImageFrame.Delete()
'is called, which is what calls us.

Function DeleteTex( name, seq )

	If seq <> dead_tex_seq Return

	'add tex to queue
	If dead_texs.length = n_dead_texs
		dead_texs = dead_texs[..n_dead_texs + 10]
	EndIf
	dead_texs[n_dead_texs] = name
	n_dead_texs :+ 1

End Function

Function CreateTex( width, height, flags )

	'alloc new tex
	Local name
	glGenTextures( 1, Varptr name )

	'flush dead texs
	If dead_tex_seq = GraphicsSeq
		For Local i = 0 Until n_dead_texs
			glDeleteTextures( 1, Varptr dead_texs[i] )
		Next
	EndIf
	n_dead_texs = 0
	dead_tex_seq = GraphicsSeq

	'bind new tex
	BindTex( name )

	'set texture parameters
	glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE )
	glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE )

	If flags & FILTEREDIMAGE
		glTexParameteri GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR
		If flags & MIPMAPPEDIMAGE
			glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR )
		Else
			glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR )
		EndIf
	Else
		glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST )
		If flags & MIPMAPPEDIMAGE
			glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST_MIPMAP_NEAREST )
		Else
			glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST )
		EndIf
	EndIf

	Local mip_level
	Repeat
		glTexImage2D( GL_TEXTURE_2D, mip_level, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, Null )
		If Not ( flags & MIPMAPPEDIMAGE ) Exit
		If width = 1 And height = 1 Exit
		If width > 1 width :/ 2
		If height > 1 height :/ 2
		mip_level :+ 1
	Forever

	Return name

End Function

'NOTE: Assumes a bound texture.
Function UploadTex( pixmap:TPixmap, flags )

	Local mip_level
	Repeat
		'?linuxarm
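		'upload into the current mip level (level 0 here would overwrite the base image on every pass)
		glTexImage2D( GL_TEXTURE_2D, mip_level, GL_RGBA, pixmap.width, pixmap.height, 0, GL_RGBA, GL_UNSIGNED_BYTE, Null )
		For Local y = 0 Until pixmap.height
			'step by pitch rather than width * 4, so padded or windowed pixmaps upload correctly
			Local row:Byte Ptr = pixmap.pixels + y * pixmap.pitch
			glTexSubImage2D( GL_TEXTURE_2D, mip_level, 0, y, pixmap.width, 1, GL_RGBA, GL_UNSIGNED_BYTE, row )
		Next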
		'?Not linuxarm
		'glPixelStorei( GL_UNPACK_ROW_LENGTH, pixmap.pitch / BytesPerPixel[pixmap.format] )
		'glTexSubImage2D( GL_TEXTURE_2D, mip_level, 0, 0, pixmap.width, pixmap.height, GL_RGBA, GL_UNSIGNED_BYTE, pixmap.pixels )
		'?

		If Not ( flags & MIPMAPPEDIMAGE ) Exit
		If pixmap.width > 1 And pixmap.height > 1
			pixmap = ResizePixmap( pixmap, pixmap.width / 2, pixmap.height / 2 )
		Else If pixmap.width > 1
			pixmap = ResizePixmap( pixmap, pixmap.width / 2, pixmap.height )
		Else If pixmap.height > 1
			pixmap = ResizePixmap( pixmap, pixmap.width, pixmap.height / 2 )
		Else
			Exit
		EndIf
		mip_level :+ 1
	Forever

	'
	'?Not linuxarm
	'glPixelStorei( GL_UNPACK_ROW_LENGTH, 0 )
	'?

End Function

Function AdjustTexSize( width Var, height Var )

	'calc texture size
	width = Pow2Size( width )
	height = Pow2Size( height )
	Repeat
		Local t
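		'GL_RGBA rather than the legacy "4": OpenGL ES 2 only accepts symbolic internal formats
		glTexImage2D( GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, Null )
		?Not linuxarm
		glGetTexLevelParameteriv( GL_TEXTURE_2D, 0, GL_TEXTURE_WIDTH, Varptr t )
		'glGetTexLevelParameteriv( GL_PROXY_TEXTURE_2D, 0, GL_TEXTURE_WIDTH, Varptr t )
		?linuxarm
		'OpenGL ES 2 has no glGetTexLevelParameteriv, so t would stay 0 and every size would be rejected.
		'Compare against GL_MAX_TEXTURE_SIZE instead (assumes the ES binding exposes glGetIntegerv).
		Local max_size
		glGetIntegerv( GL_MAX_TEXTURE_SIZE, Varptr max_size )
		If width <= max_size And height <= max_size Then t = width
		?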
		If t Return
		If width = 1 And height = 1 Then RuntimeError "Unable to calculate tex size"
		If width > 1 width :/ 2
		If height > 1 height :/ 2
	Forever

End Function

Function DefaultVShaderSource:String()

	Local str:String = ""

	?linuxarm	
	str :+ "#version 100~n"
	?Not linuxarm
	str :+ "#version 120~n"
	?
	str :+ "attribute vec2 vertex_pos;~n"
	str :+ "attribute vec4 vertex_col;~n"
	str :+ "varying vec4 v4_col;~n"
	str :+ "uniform mat4 u_pmatrix;~n"
	str :+ "void main(void) {~n"
	str :+ "	gl_Position=u_pmatrix*vec4(vertex_pos, -1.0, 1.0);~n"
	str :+ "	v4_col=vertex_col;~n"
	str :+ "}"
	
	Return str

End Function

Function DefaultFShaderSource:String()

	Local str:String = ""
	
	?linuxarm	
	str :+ "#version 100~n"
	str :+ "precision mediump float;~n"
	str :+ "varying vec4 v4_col;~n"
	str :+ "void main(void) {~n"
	str :+ "	gl_FragColor=vec4(v4_col);~n"
	str :+ "}~n"
	?Not linuxarm
	str :+ "#version 120~n"
	str :+ "varying vec4 v4_col;~n"
	str :+ "void main(void) {~n"
	str :+ "	gl_FragColor=v4_col;~n"
	str :+ "}~n"
	?
	
	Return str

End Function

Function DefaultTextureVShaderSource:String()

	Local str:String = ""

	?linuxarm
	str :+ "#version 100~n"
	?Not linuxarm
	str :+ "#version 120~n"
	?
	str :+ "attribute vec2 vertex_pos;~n"
	str :+ "attribute vec4 vertex_col;~n"
	str :+ "attribute vec2 vertex_uv;~n"
	str :+ "varying vec4 v4_col;~n"
	str :+ "varying vec2 v2_tex;~n"
	str :+ "uniform mat4 u_pmatrix;~n"
	str :+ "void main(void) {~n"
	str :+ "	gl_Position=u_pmatrix*vec4(vertex_pos, -1.0, 1.0);~n"
	str :+ "	v4_col=vertex_col;~n"
	str :+ "	v2_tex=vertex_uv;~n"
	str :+ "}"

	Return str

End Function

Function DefaultTextureFShaderSource:String()

	Local str:String = ""

	?linuxarm	
	str :+ "#version 100~n"
	str :+ "precision mediump float;~n"
	str :+ "uniform sampler2D u_texture0;~n"
	str :+ "varying vec4 v4_col;~n"
	str :+ "varying vec2 v2_tex;~n"
	str :+ "void main(void) {~n"
	str :+ "  vec4 tex=texture2D(u_texture0, v2_tex);~n"
	str :+ "	gl_FragColor.rgb=tex.rgb*v4_col.rgb;~n"
	str :+ "    gl_FragColor.a=tex.a*v4_col.a;~n"
	str :+ "}~n"
	?Not linuxarm
	str :+ "#version 120~n"
	str :+ "uniform sampler2D u_texture0;~n"
	str :+ "varying vec4 v4_col;~n"
	str :+ "varying vec2 v2_tex;~n"
	str :+ "void main(void) {~n"
	str :+ "    vec4 tex=texture2D(u_texture0, v2_tex);~n"
	str :+ "	gl_FragColor.rgb=tex.rgb*v4_col.rgb;~n"
	str :+ "    gl_FragColor.a=tex.a*v4_col.a;~n"
	str :+ "}~n"
	?

	Return str

End Function

Public

'============================================================================================'
'============================================================================================'

Type TGLImageFrame Extends TImageFrame

	Field u0#, v0#, u1#, v1#, uscale#, vscale#
	Field name, seq

	Method New()

		seq = GraphicsSeq

	End Method

	Method Delete()

		If Not seq Then Return
		DeleteTex( name, seq )
		seq = 0

	End Method

	Method Draw( x0#, y0#, x1#, y1#, tx#, ty#, sx#, sy#, sw#, sh# )

		Assert seq = GraphicsSeq Else "Image does not exist"

		Local u0# = sx * uscale
		Local v0# = sy * vscale
		Local u1# = ( sx + sw ) * uscale
		Local v1# = ( sy + sh ) * vscale

		_driver.DrawTexture( name, u0, v0, u1, v1, x0, y0, x1, y1, tx, ty )

	End Method
	
	Function CreateFromPixmap:TGLImageFrame( src:TPixmap, flags )

		'determine tex size
		Local tex_w = src.width
		Local tex_h = src.height
		AdjustTexSize( tex_w, tex_h )
		
		'make sure pixmap fits texture
		Local width = Min( src.width, tex_w )
		Local height = Min( src.height, tex_h )
		If src.width <> width Or src.height <> height Then src = ResizePixmap( src, width, height )

		'create texture pixmap
		Local tex:TPixmap = src
		
		'"smear" right/bottom edges if necessary
		If width < tex_w Or height < tex_h
			tex = TPixmap.Create( tex_w, tex_h, PF_RGBA8888 )
			tex.Paste( src, 0, 0 )
			If width < tex_w
				tex.Paste( src.Window( width - 1, 0, 1, height ), width, 0 )
			EndIf
			If height < tex_h
				tex.Paste( src.Window( 0, height - 1, width, 1 ), 0, height )
				If width < tex_w 
					tex.Paste( src.Window( width - 1, height - 1, 1, 1 ), width, height )
				EndIf
			EndIf
		Else
			If tex.format <> PF_RGBA8888 tex = tex.Convert( PF_RGBA8888 )
		EndIf
		
		'create tex
		Local name = CreateTex( tex_w, tex_h, flags )
		
		'upload it
		UploadTex( tex, flags )

		'clean up
		DisableTex()

		'done!
		Local frame:TGLImageFrame = New TGLImageFrame
		frame.name = name
		frame.uscale = 1.0 / tex_w
		frame.vscale = 1.0 / tex_h
		frame.u1 = width * frame.uscale
		frame.v1 = height * frame.vscale
		Return frame

	End Function

End Type

'============================================================================================'
'============================================================================================'

Type TMatrix

	Field grid:Float Ptr = Float Ptr( MemAlloc( 4 * 16 ) )
	
	Method SetOrthographic( pl:Float, pr:Float, pt:Float, pb:Float, pn:Float, pf:Float )

		LoadIdentity()
		grid[00] =  2.0 / ( pr - pl )
		grid[05] =  2.0 / ( pt - pb )
		grid[10] = -2.0 / ( pf - pn )
		grid[15] =  1.0
		grid[12] = -( ( pr + pl ) / ( pr - pl ) )
		grid[13] = -( ( pt + pb ) / ( pt - pb ) )
		grid[14] = -( ( pf + pn ) / ( pf - pn ) )

	End Method
	
	Method Clear()

		For Local i:Int = 0 To 15
			grid[i] = 0.0
		Next

	End Method
	
	Method LoadIdentity()

		Clear()
		grid[00] = 1.0
		grid[05] = 1.0
		grid[10] = 1.0
		grid[15] = 1.0

	End Method

End Type

Type TGLSLShader

	Field source:String
	Field kind:Int
	
	Field id:Int
	
	Method Create:TGLSLShader( source:Object, kind:Int )

		Self.kind = kind
		If Not Load( source ) Then Return Null
		Compile()
		If Not id Then Return Null

		Return Self

	End Method
	
	Method Load:Int( source:Object )

		If String( source ) Then
			Self.source = String( source )
			Return True
		EndIf

		Return False

	End Method

	Method Compile()
		
		If source = "" Then
			'Print "ERROR (CompileShader) No shader source!"
			Return 0
		EndIf
		
		Select kind
		Case GL_VERTEX_SHADER
			'Print "(CompileShader) Compiling vertex shader"
		Case GL_FRAGMENT_SHADER
			'Print "(CompileShader) Compiling fragment shader"
		Default 
			'Print "(CompileShader) Invalid shader type!"
			Return 0
		End Select
		
		id = glCreateShader( kind )
		Local str:Byte Ptr = source.ToCString()
		
		glShaderSource( id, 1, Varptr str, Null )
		glCompileShader( id )
		
		MemFree str
		
		Local success:Int = 0
		glGetShaderiv( id, GL_COMPILE_STATUS, Varptr success )
		
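		If Not success Then
			'Print GetErrorLog( id )
			'release the failed shader and clear id, so that Create() sees the failure and returns Null
			glDeleteShader( id )
			id = 0
			Return 0
		EndIf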
		
		'Print "(CompileShader) Successfully compiled shader!"
		'Return id
		
	End Method
	
	Method GetErrorLog:String( pid:Int )

		Local logsize:Int = 0
		glGetShaderiv( pid, GL_INFO_LOG_LENGTH, Varptr logsize )
		If logsize <= 0 Then Return ""

		Local msg:Byte[logsize]
		Local size:Int = 0

		glGetShaderInfoLog( pid, logsize, Varptr size, Varptr msg[0] )

		'size excludes the null terminator, so only copy the characters actually written
		Local str:String = ""
		For Local i:Int = 0 Until size
			str :+ Chr( msg[i] )
		Next

		Return str

	End Method
	
End Type

Type TGLSLProgram

	Field id:Int

	Field attrib_pos:Int
	Field attrib_uv:Int
	Field attrib_col:Int

	Field uniform_ProjMatrix:Int	'NOTE: Acts as glModelViewProjectionMatrix.
	Field uniform_Texture0:Int
	'Field uniform_Color:Int

	Method Create:TGLSLProgram( vs:TGLSLShader, fs:TGLSLShader )

		If glIsShader( vs.id ) = GL_FALSE Then 
			'Print "ERROR (CreateShaderProgram) pvshader is not a valid shader!"
			Return Null
		EndIf

		If glIsShader( fs.id ) = GL_FALSE Then
			'Print "ERROR (CreateShaderProgram) pfshader is not a valid shader!"
			Return Null
		EndIf

		id = glCreateProgram()
		glAttachShader( id, vs.id )
		glAttachShader( id, fs.id )
		glLinkProgram( id )
		UpdateLayout()

		Return Self
		
	End Method

	Method Validate()

		If glIsProgram( id ) = GL_FALSE Then
			'Print "ERROR (ValidateShaderProgram) Supplied id is not a shader program!"
			Return
		EndIf
		
		Local status:Int
		
		glValidateProgram( id )
		glGetProgramiv( id, GL_VALIDATE_STATUS, Varptr status )
		
		If status = GL_FALSE Then
			'Print "ERROR (ValidateShaderprogram) Supplied program is not valid! (in context)"
			Return
		EndIf
		
		Return
	
	End Method

	Method Use()

		glUseProgram( id )
		If uniform_Texture0 > -1 Then glActiveTexture( GL_TEXTURE0 )

	End Method

	Method UpdateLayout()

		If Not glIsProgram( id ) Then
			'Print "(UpdateShaderLayout) Active is not a valid shader program!"
			Return
		EndIf

		attrib_pos = glGetAttribLocation( id, "vertex_pos" )
		attrib_uv = glGetAttribLocation( id, "vertex_uv" )
		attrib_col = glGetAttribLocation( id, "vertex_col" )

		uniform_ProjMatrix = glGetUniformLocation( id, "u_pmatrix" )
		uniform_Texture0 = glGetUniformLocation( id, "u_texture0" )
		'uniform_Color = glGetUniformLocation( id, "u_color" )

	End Method

	'Method EnableData( vert_buffer:Int, uv_buffer:Int, col_buffer:Int, matrix:Float Ptr )
	Method EnableData( vert_array:Float Ptr, uv_array:Float Ptr, col_array:Float Ptr, matrix:Float Ptr )

		If attrib_pos >= 0 Then
			glEnableVertexAttribArray( attrib_pos )
			glVertexAttribPointer( attrib_pos, 2, GL_FLOAT, GL_FALSE, 0, vert_array )
		EndIf

		If attrib_uv >= 0 Then
			glEnableVertexAttribArray( attrib_uv )
			glVertexAttribPointer( attrib_uv, 2, GL_FLOAT, GL_FALSE, 0, uv_array )
		EndIf

		If attrib_col >= 0 Then
			glEnableVertexAttribArray( attrib_col )
			glVertexAttribPointer( attrib_col, 4, GL_FLOAT, GL_FALSE, 0, col_array )
		EndIf

		If uniform_ProjMatrix >= 0 Then
			glUniformMatrix4fv( uniform_ProjMatrix, 1, False, matrix )
		EndIf

		If uniform_Texture0 >= 0 Then
			glUniform1i( uniform_Texture0, 0 )
		EndIf

		'If uniform_Color >= 0 Then
		'	glUniform4f( uniform_Color, color4f[0], color4f[1], color4f[2], color4f[3] )
		'EndIf

	End Method
	
	Method DisableData()

		If attrib_pos >= 0 Then
			glDisableVertexAttribArray( attrib_pos )
		EndIf

		If attrib_uv >= 0 Then
			glDisableVertexAttribArray( attrib_uv )
		EndIf

		If attrib_col >= 0 Then
			glDisableVertexAttribArray( attrib_col )
		EndIf

	End Method

End Type

'============================================================================================'
'============================================================================================'

Type TGLMax2DDriver Extends TMax2DDriver

	Method Create:TGLMax2DDriver()

		If Not GLGraphicsDriver() Then Return Null
		Return Self

	End Method

	'graphics driver overrides
	Method GraphicsModes:TGraphicsMode[]()

		Return GLGraphicsDriver().GraphicsModes()

	End Method

	Method AttachGraphics:TMax2DGraphics( widget, flags )

		Local g:TGLGraphics=GLGraphicsDriver().AttachGraphics( widget, flags )
		If g Then Return TMax2DGraphics.Create( g, Self )

	End Method

	Method CreateGraphics:TMax2DGraphics( width, height, depth, hertz, flags )

		Local g:TGLGraphics=GLGraphicsDriver().CreateGraphics( width, height, depth, hertz, flags )
		If g Then Return TMax2DGraphics.Create( g, Self )

	End Method

	Method SetGraphics( g:TGraphics )

		If Not g
			TMax2DGraphics.ClearCurrent()
			GLGraphicsDriver().SetGraphics( Null )
			Return
		EndIf

		Local t:TMax2DGraphics = TMax2DGraphics( g )
		Assert t And TGLGraphics( t._graphics )

		GLGraphicsDriver().SetGraphics( t._graphics )
		ResetGLContext( t )
		t.MakeCurrent

	End Method

	Method ResetGLContext( g:TGraphics )

		Local gw, gh, gd, gr, gf
		g.GetSettings( gw, gh, gd, gr, gf )

		state_blend = 0
		state_boundtex = 0
		state_texenabled = 0
		glDisable( GL_TEXTURE_2D )
		glMatrixMode( GL_PROJECTION )
		glLoadIdentity()
		glOrtho( 0, gw, gh, 0, -1, 1 )
		glMatrixMode( GL_MODELVIEW )
		glLoadIdentity()
		glViewport( 0, 0, gw, gh )

	End Method

	Method Flip( sync )

		GLGraphicsDriver().Flip sync

	End Method

	Method ToString$()

		Return "OpenGL"

	End Method

	Method CreateFrameFromPixmap:TGLImageFrame( pixmap:TPixmap, flags )

		Local frame:TGLImageFrame
		frame = TGLImageFrame.CreateFromPixmap( pixmap, flags )

		Return frame

	End Method

	Method SetBlend( blend )

		If blend = state_blend Return
		state_blend = blend
		Select blend
		Case MASKBLEND
			glDisable( GL_BLEND )
			glEnable( GL_ALPHA_TEST )
			glAlphaFunc( GL_GEQUAL, 0.5 )
		Case SOLIDBLEND
			glDisable( GL_BLEND )
			glDisable( GL_ALPHA_TEST )
		Case ALPHABLEND
			glEnable( GL_BLEND )
			glBlendFunc( GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA )
			glDisable( GL_ALPHA_TEST )
		Case LIGHTBLEND
			glEnable( GL_BLEND )
			glBlendFunc( GL_SRC_ALPHA, GL_ONE )
			glDisable( GL_ALPHA_TEST )
		Case SHADEBLEND
			glEnable( GL_BLEND )
			glBlendFunc( GL_DST_COLOR, GL_ZERO )
			glDisable( GL_ALPHA_TEST )
		Default
			glDisable( GL_BLEND )
			glDisable( GL_ALPHA_TEST )
		End Select

	End Method

	Method SetAlpha( alpha# )

		If alpha > 1.0 Then alpha = 1.0
		If alpha < 0.0 Then alpha = 0.0
		color4ub[3] = alpha * 255
		glColor4ubv( color4ub )

	End Method

	Method SetLineWidth( width# )

		glLineWidth( width )

	End Method

	Method SetColor( red, green, blue )

		color4ub[0] = Min( Max( red, 0), 255 )
		color4ub[1] = Min( Max( green, 0), 255 )
		color4ub[2] = Min( Max( blue, 0), 255 )
		glColor4ubv( color4ub )

	End Method

	Method SetClsColor( red,green,blue )

		red = Min( Max( red, 0 ), 255 )
		green = Min( Max( green, 0 ), 255 )
		blue = Min( Max( blue, 0 ), 255 )
		glClearColor( red / 255.0, green / 255.0, blue / 255.0, 1.0 )

	End Method
	
	Method SetViewport( x, y, w, h )

		If x = 0 And y = 0 And w = GraphicsWidth() And h = GraphicsHeight()
			glDisable( GL_SCISSOR_TEST )
		Else
			glEnable( GL_SCISSOR_TEST )
			glScissor( x, GraphicsHeight() - y - h, w, h )
		EndIf

	End Method

	Method SetTransform( xx#, xy#, yx#, yy# )

		ix = xx
		iy = xy
		jx = yx
		jy = yy

	End Method

	Method Cls()

		glClear( GL_COLOR_BUFFER_BIT )

	End Method

	Method Plot( x#, y# )

		DisableTex()
		glBegin( GL_POINTS )
		glVertex2f( x + 0.5, y + 0.5 )
		glEnd

	End Method

	Method DrawLine( x0#, y0#, x1#, y1#, tx#, ty# )

		DisableTex()
		glBegin( GL_LINES )
		glVertex2f( x0 * ix + y0 * iy + tx + 0.5, x0 * jx + y0 * jy + ty + 0.5 )
		glVertex2f( x1 * ix + y1 * iy + tx + 0.5, x1 * jx + y1 * jy + ty + 0.5 )
		glEnd

	End Method

	Method DrawRect( x0#, y0#, x1#, y1#, tx#, ty# )

		DisableTex()
		glBegin( GL_QUADS )
		glVertex2f( x0 * ix + y0 * iy + tx, x0 * jx + y0 * jy + ty )
		glVertex2f( x1 * ix + y0 * iy + tx, x1 * jx + y0 * jy + ty )
		glVertex2f( x1 * ix + y1 * iy + tx, x1 * jx + y1 * jy + ty )
		glVertex2f( x0 * ix + y1 * iy + tx, x0 * jx + y1 * jy + ty )
		glEnd

	End Method

	Method DrawOval( x0#, y0#, x1#, y1#, tx#, ty# )

		Local xr# = ( x1 - x0 ) * 0.5
		Local yr# = ( y1 - y0 ) * 0.5
		Local segs = Abs( xr ) + Abs( yr )

		segs = Max( segs, 12 ) &~ 3

		x0 :+ xr
		y0 :+ yr

		DisableTex()
		glBegin( GL_POLYGON )
		For Local i = 0 Until segs
			Local th# = i * 360.0 / segs
			Local x# = x0 + Cos( th ) * xr
			Local y# = y0 - Sin( th ) * yr
			glVertex2f( x * ix + y * iy + tx, x * jx + y * jy + ty )
		Next
		glEnd

	End Method

	Method DrawPoly( xy#[], handle_x#, handle_y#, origin_x#, origin_y# )

		If xy.length < 6 Or ( xy.length & 1 ) Then Return

		DisableTex()
		glBegin( GL_POLYGON )
		For Local i = 0 Until Len xy Step 2
			Local x# = xy[i + 0] + handle_x
			Local y# = xy[i + 1] + handle_y
			glVertex2f( x * ix + y * iy + origin_x, x * jx + y * jy + origin_y )
		Next
		glEnd

	End Method

	Method DrawPixmap( p:TPixmap, x, y )

		Local blend = state_blend
		DisableTex()
		SetBlend( SOLIDBLEND )

		Local t:TPixmap = p
		If t.format <> PF_RGBA8888 t = ConvertPixmap( t, PF_RGBA8888 )

		glPixelZoom( 1, -1 )
		glRasterPos2i( 0, 0 )
		glBitmap( 0, 0, 0, 0, x, -y, Null )
		glPixelStorei( GL_UNPACK_ROW_LENGTH, t.pitch Shr 2 )
		glDrawPixels( t.WIDTH, t.HEIGHT, GL_RGBA, GL_UNSIGNED_BYTE, t.pixels )
		glPixelStorei( GL_UNPACK_ROW_LENGTH, 0 )
		glPixelZoom( 1, 1 )

		SetBlend( blend )

	End Method

	Method DrawTexture( name, u0#, v0#, u1#, v1#, x0#, y0#, x1#, y1#, tx#, ty# )

		EnableTex( name )

		glBegin( GL_QUADS )
		glTexCoord2f( u0, v0 )
		glVertex2f( x0 * ix + y0 * iy + tx, x0 * jx + y0 * jy + ty )
		glTexCoord2f( u1, v0 )
		glVertex2f( x1 * ix + y0 * iy + tx, x1 * jx + y0 * jy + ty )
		glTexCoord2f( u1, v1 )
		glVertex2f( x1 * ix + y1 * iy + tx, x1 * jx + y1 * jy + ty )
		glTexCoord2f( u0, v1 )
		glVertex2f( x0 * ix + y1 * iy + tx, x0 * jx + y1 * jy + ty )
		glEnd

		DisableTex()

	End Method

	Method GrabPixmap:TPixmap( x, y, w, h )

		Local blend = state_blend
		SetBlend( SOLIDBLEND )
		Local p:TPixmap = CreatePixmap( w, h, PF_RGBA8888 )
		glReadPixels( x, GraphicsHeight() - h - y, w, h, GL_RGBA, GL_UNSIGNED_BYTE, p.pixels )
		p = YFlipPixmap( p )
		SetBlend( blend )
		Return p

	End Method

	Method SetResolution( width#, height# )

		glMatrixMode( GL_PROJECTION )
		glLoadIdentity()
		glOrtho( 0, width, height, 0, -1, 1 )
		glMatrixMode( GL_MODELVIEW )

	End Method

End Type

Type TGL2Max2DDriver Extends TGLMax2DDriver 'TMax2DDriver

	Const BATCHSIZE:Int = 65536 ' how many entries that can be stored in batch before a draw call is required

	' has driver been initialized?

	Field inited:Int

	' pre-built element arrays

	Global TRI_INDS[BATCHSIZE * 3]
	Global QUAD_INDS[BATCHSIZE * 6]

	' vertex attribute arrays

	Field vert_array:Float Ptr = Float Ptr( MemAlloc( 4 * BATCHSIZE * 3 ) )
	Field uv_array:Float Ptr = Float Ptr( MemAlloc( 4 * BATCHSIZE * 2 ) )
	Field col_array:Float Ptr = Float Ptr( MemAlloc( 4 * BATCHSIZE * 4 ) )

	' constants for primitive_id rendering

	Const PRIMITIVE_PLAIN_TRIANGLE:Int = 1
	Const PRIMITIVE_DOT:Int = 2
	Const PRIMITIVE_LINE:Int = 3
	Const PRIMITIVE_IMAGE:Int = 4
	Const PRIMITIVE_TRIANGLE_FAN:Int = 5
	Const PRIMITIVE_TRIANGLE_STRIP:Int = 6
	Const PRIMITIVE_TEXTURED_TRIANGLE:Int = 7

	' variables for tracking

	Field vert_index:Int
	Field quad_index:Int
	Field primitive_id:Int
	Field texture_id:Int
	Field blend_id:Int
'	Field element_array:Int[BATCHSIZE * 2]
'	Field element_index:Int
'	Field vert_buffer:Int
'	Field uv_buffer:Int
'	Field col_buffer:Int
'	Field element_buffer:Int

	' projection matrix

	Field u_pmatrix:TMatrix

	' current shader program and defaults

	Field activeProgram:TGLSLProgram
	Field defaultVShader:TGLSLShader
	Field defaultFShader:TGLSLShader
	Field defaultProgram:TGLSLProgram
	Field defaultTextureVShader:TGLSLShader
	Field defaultTextureFShader:TGLSLShader
	Field defaultTextureProgram:TGLSLProgram

	' current z layer for drawing (NOT USED)

	Field layer:Float

	Method Create:TGL2Max2DDriver()

		?Not linuxarm
		If Not GLGraphicsDriver() Then Return Null
		?linuxarm
		If Not EGLGraphicsDriver() Then Return Null
		?
		Return Self

	End Method

	'graphics driver overrides
	Method GraphicsModes:TGraphicsMode[]()

		?Not linuxarm
		Return GLGraphicsDriver().GraphicsModes()
		?linuxarm
		Return EGLGraphicsDriver().GraphicsModes()
		?

	End Method

	Method AttachGraphics:TMax2DGraphics( widget, flags )

		Local g:TGLGraphics = Null
		?Not linuxarm
		g = GLGraphicsDriver().AttachGraphics( widget, flags )
		?linuxarm
		g = EGLGraphicsDriver().AttachGraphics( widget, flags )
		?
		If g Then Return TMax2DGraphics.Create( g, Self )

	End Method
	
	Method CreateGraphics:TMax2DGraphics( width, height, depth, hertz, flags )

		Local g:TGLGraphics = Null

		?Not linuxarm
		g = GLGraphicsDriver().CreateGraphics( width, height, depth, hertz, flags )
		?linuxarm
		g = EGLGraphicsDriver().CreateGraphics( width, height, depth, hertz, flags )
		?
		If g Then Return TMax2DGraphics.Create( g, Self )

	End Method

	Method SetGraphics( g:TGraphics )

		If Not g
			TMax2DGraphics.ClearCurrent
			?Not linuxarm
			GLGraphicsDriver().SetGraphics Null
			?linuxarm
			EGLGraphicsDriver().SetGraphics Null
			?
			Return
		EndIf

		Local t:TMax2DGraphics = TMax2DGraphics( g )
		?Not linuxarm
		Assert t And TGLGraphics( t._graphics )
		?

		?Not linuxarm
		GLGraphicsDriver().SetGraphics t._graphics
		?linuxarm
		EGLGraphicsDriver().SetGraphics t._graphics
		?
		ResetGLContext t
		t.MakeCurrent

	End Method
	
	Method ResetGLContext( g:TGraphics )

		Local gw, gh, gd, gr, gf
		g.GetSettings( gw, gh, gd, gr, gf )

		If Not inited Then
			Init()
			inited = True
		End If

		state_blend = 0
		state_boundtex = 0
		state_texenabled = 0
		glDisable( GL_TEXTURE_2D )

		'glMatrixMode( GL_PROJECTION )
		'glLoadIdentity()
		'glOrtho( 0, gw, gh, 0, -1, 1 )
		'glMatrixMode( GL_MODELVIEW )
		'glLoadIdentity()
		'glViewport( 0, 0, gw, gh )

		u_pmatrix = New TMatrix
		u_pmatrix.SetOrthographic( 0, gw, 0, gh, -1, 1 )

	End Method
	
	Method Flip( sync )

		Flush()
		?Not linuxarm
		GLGraphicsDriver().Flip sync
		?linuxarm
		EGLGraphicsDriver().Flip sync
		?

	End Method

	Method ToString$()

		Return "OpenGL"

	End Method

	Method CreateFrameFromPixmap:TGLImageFrame( pixmap:TPixmap, flags )

		Local frame:TGLImageFrame
		frame = TGLImageFrame.CreateFromPixmap( pixmap, flags )
		Return frame

	End Method

	Method SetBlend( blend )

		state_blend = blend

	End Method

	Method SetAlpha( alpha# )

		If alpha > 1.0 Then alpha = 1.0
		If alpha < 0.0 Then alpha = 0.0
		color4f[3] = alpha

	End Method

	Method SetLineWidth( width# )

		glLineWidth( width )

	End Method

	Method SetColor( red, green, blue )

		color4f[0] = Min( Max( red, 0 ), 255 ) / 255.0
		color4f[1] = Min( Max( green, 0 ), 255 ) / 255.0
		color4f[2] = Min( Max( blue, 0 ), 255 ) / 255.0

	End Method

	Method SetClsColor( red, green, blue )

		red = Min( Max( red, 0 ), 255 )
		green = Min( Max( green, 0 ), 255 )
		blue = Min( Max( blue, 0 ), 255 )
		glClearColor( red / 255.0, green / 255.0, blue / 255.0, 1.0 )

	End Method
	
	Method SetViewport( x, y, w, h )

		If x = 0 And y = 0 And w = GraphicsWidth() And h = GraphicsHeight()
			glDisable( GL_SCISSOR_TEST )
		Else
			glEnable( GL_SCISSOR_TEST )
			glScissor( x, GraphicsHeight() - y - h, w, h )
		EndIf

	End Method

	Method SetTransform( xx#, xy#, yx#, yy# )

		ix = xx
		iy = xy
		jx = yx
		jy = yy

	End Method

	Method Cls()

		glClear( GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT )

	End Method

	Method Plot( px#, py# )

		FlushTest( PRIMITIVE_DOT )

		Local in:Int = vert_index * 2

		vert_array[in + 0] = px + 0.5	' pixel centre, as in the fixed-function Plot()
		vert_array[in + 1] = py + 0.5

		in = vert_index * 4

		col_array[in + 0] = color4f[0] 'red
		col_array[in + 1] = color4f[1] 'green
		col_array[in + 2] = color4f[2] 'blue
		col_array[in + 3] = color4f[3] 'alpha

		vert_index :+ 1

	End Method

	Method DrawLine( x0#, y0#, x1#, y1#, tx#, ty# )

		FlushTest( PRIMITIVE_LINE )

		Local in:Int = vert_index * 2

		vert_array[in + 0] = x0 * ix + y0 * iy + tx + 0.5
		vert_array[in + 1] = x0 * jx + y0 * jy + ty + 0.5

		vert_array[in + 2] = x1 * ix + y1 * iy + tx + 0.5
		vert_array[in + 3] = x1 * jx + y1 * jy + ty + 0.5

		in = vert_index * 4

		col_array[in + 0] = color4f[0] 'red
		col_array[in + 1] = color4f[1] 'green
		col_array[in + 2] = color4f[2] 'blue
		col_array[in + 3] = color4f[3] 'alpha

		col_array[in + 4] = color4f[0] 'red
		col_array[in + 5] = color4f[1] 'green
		col_array[in + 6] = color4f[2] 'blue
		col_array[in + 7] = color4f[3] 'alpha

		vert_index :+ 2

	End Method

	Method DrawRect( x0#, y0#, x1#, y1#, tx#, ty# )

		FlushTest( PRIMITIVE_PLAIN_TRIANGLE )

		Local in:Int = vert_index * 2

		vert_array[in    ] = x0 * ix + y0 * iy + tx		'topleft x
		vert_array[in + 1] = x0 * jx + y0 * jy + ty		'topleft y
		vert_array[in + 2] = x1 * ix + y0 * iy + tx		'topright x
		vert_array[in + 3] = x1 * jx + y0 * jy + ty		'topright y
		vert_array[in + 4] = x1 * ix + y1 * iy + tx		'bottomright x
		vert_array[in + 5] = x1 * jx + y1 * jy + ty		'bottomright y
		vert_array[in + 6] = x0 * ix + y1 * iy + tx		'bottomleft x
		vert_array[in + 7] = x0 * jx + y1 * jy + ty		'bottomleft y

		in = vert_index * 4

		col_array[in + 00] = color4f[0] 'red
		col_array[in + 01] = color4f[1] 'green
		col_array[in + 02] = color4f[2] 'blue
		col_array[in + 03] = color4f[3] 'alpha

		col_array[in + 04] = color4f[0] 'red
		col_array[in + 05] = color4f[1] 'green
		col_array[in + 06] = color4f[2] 'blue
		col_array[in + 07] = color4f[3] 'alpha

		col_array[in + 08] = color4f[0] 'red
		col_array[in + 09] = color4f[1] 'green
		col_array[in + 10] = color4f[2] 'blue
		col_array[in + 11] = color4f[3] 'alpha

		col_array[in + 12] = color4f[0] 'red
		col_array[in + 13] = color4f[1] 'green
		col_array[in + 14] = color4f[2] 'blue
		col_array[in + 15] = color4f[3] 'alpha

		vert_index :+ 4
		quad_index :+ 1

	End Method

	Method DrawOval( x0#, y0#, x1#, y1#, tx#, ty# )

		' TRIANGLE_FAN (no batching)
		FlushTest( PRIMITIVE_TRIANGLE_FAN )

		Local xr# = ( x1 - x0 ) * 0.5
		Local yr# = ( y1 - y0 ) * 0.5
		Local segs = Abs( xr ) + Abs( yr )

		segs = Max( segs, 12 ) &~ 3

		x0 :+ xr
		y0 :+ yr

		Local in:Int = vert_index * 2

		vert_array[in    ] = x0 * ix + y0 * iy + tx
		vert_array[in + 1] = x0 * jx + y0 * jy + ty

		Local off:Int = 2

		For Local i = 0 To segs
			Local th# = i * 360# / segs
			Local x# = x0 + Cos( th ) * xr
			Local y# = y0 - Sin( th ) * yr
			vert_array[in + off    ] = x * ix + y * iy + tx
			vert_array[in + off + 1] = x * jx + y * jy + ty
			off :+ 2
		Next

		in = vert_index * 4

		col_array[in + 0] = color4f[0] 'red
		col_array[in + 1] = color4f[1] 'green
		col_array[in + 2] = color4f[2] 'blue
		col_array[in + 3] = color4f[3] 'alpha

		off = 4

		For Local i = 0 To segs
			col_array[in + off + 0] = color4f[0] 'red
			col_array[in + off + 1] = color4f[1] 'green
			col_array[in + off + 2] = color4f[2] 'blue
			col_array[in + off + 3] = color4f[3] 'alpha
			off :+ 4
		Next

		vert_index :+ segs + 2

	End Method

	Method DrawPoly( xy#[], handle_x#, handle_y#, origin_x#, origin_y# )

		If xy.length < 6 Or ( xy.length & 1 ) Then Return

		' TRIANGLE_FAN (no batching)
		FlushTest( PRIMITIVE_TRIANGLE_FAN )

		Local in:Int = vert_index * 2

		For Local i = 0 Until xy.length Step 2
			Local x# = handle_x + xy[i]
			Local y# = handle_y + xy[i + 1]
			vert_array[in + i    ] = x * ix + y * iy + origin_x
			vert_array[in + i + 1] = x * jx + y * jy + origin_y
		Next

		in = vert_index * 4

		For Local i = 0 Until xy.length / 2
			col_array[in + i * 4    ] = color4f[0] 'red
			col_array[in + i * 4 + 1] = color4f[1] 'green
			col_array[in + i * 4 + 2] = color4f[2] 'blue
			col_array[in + i * 4 + 3] = color4f[3] 'alpha
		Next

		vert_index :+ xy.length / 2

	End Method

	Method DrawPixmap( p:TPixmap, x, y )

		Local blend = state_blend
		SetBlend( SOLIDBLEND )

		Local t:TPixmap = p
		If t.format <> PF_RGBA8888 Then t = ConvertPixmap( t, PF_RGBA8888 )

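		' NOTE: drawing is disabled here for now, so this method is currently a no-op.
		' glPixelZoom, glRasterPos, glBitmap and glDrawPixels do not exist in OpenGL ES 2.
		'glPixelZoom( 1, -1 )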
		'glRasterPos2i( 0, 0 )
		'glBitmap( 0, 0, 0, 0, x, -y, Null )
		'glPixelStorei( GL_UNPACK_ROW_LENGTH, t.pitch Shr 2 )
		'glDrawPixels( t.WIDTH, t.HEIGHT, GL_RGBA, GL_UNSIGNED_BYTE, t.pixels )
		'glPixelStorei( GL_UNPACK_ROW_LENGTH, 0 )
		'glPixelZoom( 1, 1 )

		SetBlend( blend )

	End Method

	Method DrawTexture( name, u0#, v0#, u1#, v1#, x0#, y0#, x1#, y1#, tx#, ty# )

		FlushTest( PRIMITIVE_TEXTURED_TRIANGLE, name )

		Local in:Int = vert_index * 2

		uv_array[in    ] = u0		'topleft x
		uv_array[in + 1] = v0		'topleft y
		uv_array[in + 2] = u1		'topright x
		uv_array[in + 3] = v0		'topright y
		uv_array[in + 4] = u1		'bottomright x
		uv_array[in + 5] = v1		'bottomright y
		uv_array[in + 6] = u0		'bottomleft x
		uv_array[in + 7] = v1		'bottomleft y

		vert_array[in    ] = x0 * ix + y0 * iy + tx		'topleft x
		vert_array[in + 1] = x0 * jx + y0 * jy + ty		'topleft y
		vert_array[in + 2] = x1 * ix + y0 * iy + tx		'topright x
		vert_array[in + 3] = x1 * jx + y0 * jy + ty		'topright y
		vert_array[in + 4] = x1 * ix + y1 * iy + tx		'bottomright x
		vert_array[in + 5] = x1 * jx + y1 * jy + ty		'bottomright y
		vert_array[in + 6] = x0 * ix + y1 * iy + tx		'bottomleft x
		vert_array[in + 7] = x0 * jx + y1 * jy + ty		'bottomleft y

		in = vert_index * 4

		col_array[in + 00] = color4f[0] 'red
		col_array[in + 01] = color4f[1] 'green
		col_array[in + 02] = color4f[2] 'blue
		col_array[in + 03] = color4f[3] 'alpha

		col_array[in + 04] = color4f[0] 'red
		col_array[in + 05] = color4f[1] 'green
		col_array[in + 06] = color4f[2] 'blue
		col_array[in + 07] = color4f[3] 'alpha

		col_array[in + 08] = color4f[0] 'red
		col_array[in + 09] = color4f[1] 'green
		col_array[in + 10] = color4f[2] 'blue
		col_array[in + 11] = color4f[3] 'alpha

		col_array[in + 12] = color4f[0] 'red
		col_array[in + 13] = color4f[1] 'green
		col_array[in + 14] = color4f[2] 'blue
		col_array[in + 15] = color4f[3] 'alpha

		vert_index :+ 4
		quad_index :+ 1

	End Method

	Method GrabPixmap:TPixmap( x, y, w, h )

		Local blend = state_blend
		SetBlend( SOLIDBLEND )
		Local p:TPixmap = CreatePixmap( w, h, PF_RGBA8888 )
		glReadPixels( x, GraphicsHeight() - h - y, w, h, GL_RGBA, GL_UNSIGNED_BYTE, p.pixels )
		p = YFlipPixmap( p )
		SetBlend( blend )
		Return p

	End Method

	Method SetResolution( width#, height# )

		'u_pmatrix.SetOrthographic( 0, width, 0, height, -1, 1 )
		glMatrixMode( GL_PROJECTION )
		glLoadIdentity()
		glOrtho( 0, width, height, 0, -1, 1 )
		glMatrixMode( GL_MODELVIEW )

	End Method

	Method Init()

		?Not linuxarm
		glewinit()
		?

		color4f[0] = 1.0
		color4f[1] = 1.0
		color4f[2] = 1.0
		color4f[3] = 1.0

		For Local i = 0 Until BATCHSIZE
			Local in = i * 3
			TRI_INDS[in    ] = in
			TRI_INDS[in + 1] = in + 1
			TRI_INDS[in + 2] = in + 2
		Next
		For Local i:Int = 0 Until BATCHSIZE
			Local i4 = i * 4
			Local i6 = i * 6
			QUAD_INDS[i6    ] = i4
			QUAD_INDS[i6 + 1] = i4 + 1
			QUAD_INDS[i6 + 2] = i4 + 2
			QUAD_INDS[i6 + 3] = i4 + 2
			QUAD_INDS[i6 + 4] = i4 + 3
			QUAD_INDS[i6 + 5] = i4
		Next

		' set up shaders
		defaultVShader = New TGLSLShader.Create( DefaultVShaderSource(), GL_VERTEX_SHADER )
		defaultFShader = New TGLSLShader.Create( DefaultFShaderSource(), GL_FRAGMENT_SHADER )
		defaultProgram = New TGLSLProgram.Create( defaultVShader, defaultFShader )

		defaultTextureVShader = New TGLSLShader.Create( DefaultTextureVShaderSource(), GL_VERTEX_SHADER )
		defaultTextureFShader = New TGLSLShader.Create( DefaultTextureFShaderSource(), GL_FRAGMENT_SHADER )
		defaultTextureProgram = New TGLSLProgram.Create( defaultTextureVShader, defaultTextureFShader )

		vert_index = 0
		quad_index = 0
		primitive_id = 0
		texture_id = -1
		blend_id = SOLIDBLEND

	End Method

	Method FlushTest( prim_id:Int, tex_id:Int = -1 )

		Select primitive_id
		Case PRIMITIVE_TRIANGLE_FAN, PRIMITIVE_TRIANGLE_STRIP	'Always flush...
			Flush()

		Default
			If prim_id <> primitive_id Or ..
			vert_index > BATCHSIZE - 256 Or ..
			state_blend <> blend_id Or ..
			tex_id <> texture_id Then
				Flush()
			EndIf

		End Select
		primitive_id = prim_id
		texture_id = tex_id
		blend_id = state_blend

	End Method
	
	Method Flush()

		Select primitive_id
		Case PRIMITIVE_PLAIN_TRIANGLE
			If quad_index = 0 Then Return
			If activeProgram <> defaultProgram Then
				activeProgram = defaultProgram
				activeProgram.Use()
			EndIf
		Case PRIMITIVE_TEXTURED_TRIANGLE
			If quad_index = 0 Then Return
			If activeProgram <> defaultTextureProgram
				activeProgram = defaultTextureProgram
				activeProgram.Use()
			EndIf
		Case PRIMITIVE_DOT, PRIMITIVE_LINE, PRIMITIVE_TRIANGLE_FAN, PRIMITIVE_TRIANGLE_STRIP
			If vert_index = 0 Then Return
			If activeProgram <> defaultProgram Then
				activeProgram = defaultProgram
				activeProgram.Use()
			EndIf
		Default
			Return
		End Select

		If activeProgram Then
			
			' additional tests. validate shaderprogram and buffer. shader program validation takes
			' context into consideration, so do it right before drawing
			
			' NOTE: This should probably happen, but not on every Flush().
			'activeProgram.Validate()
			
			' somewhat interesting? default framebuffer should not return any errors
			' NOTE: 36062 seems to be an erroneous error code (ie opengl returns something it shouldnt)
			'Local status:Int = glCheckFramebufferStatus( GL_FRAMEBUFFER )
			'Select status
			'Case GL_FRAMEBUFFER_COMPLETE
				'Print "valid framebuffer"
			'Default
				'Print "status: " + status
			'End Select

			activeProgram.EnableData( vert_array, uv_array, col_array, u_pmatrix.grid )

			Select blend_id
			?Not linuxarm
			Case MASKBLEND
				glDisable( GL_BLEND )
				glEnable( GL_ALPHA_TEST )
				glAlphaFunc( GL_GEQUAL, 0.5 )
			?
			Case SOLIDBLEND
				glDisable( GL_BLEND )
				?Not linuxarm
				glDisable( GL_ALPHA_TEST )
				?
			Case ALPHABLEND
				glEnable( GL_BLEND )
				glBlendFunc( GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA )
				?Not linuxarm
				glDisable( GL_ALPHA_TEST )
				?
			Case LIGHTBLEND
				glEnable( GL_BLEND )
				glBlendFunc( GL_SRC_ALPHA, GL_ONE )
				?Not linuxarm
				glDisable( GL_ALPHA_TEST )
				?
			Case SHADEBLEND
				glEnable( GL_BLEND )
				glBlendFunc( GL_DST_COLOR, GL_ZERO )
				?Not linuxarm
				glDisable( GL_ALPHA_TEST )
				?
			Default
				glDisable( GL_BLEND )
				?Not linuxarm
				glDisable( GL_ALPHA_TEST )
				?
			End Select

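			' NOTE: core OpenGL ES 2 only accepts GL_UNSIGNED_BYTE and GL_UNSIGNED_SHORT
			' element indices; GL_UNSIGNED_INT needs the OES_element_index_uint extension.
			Select primitive_id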
			Case PRIMITIVE_PLAIN_TRIANGLE
				glDrawElements( GL_TRIANGLES, quad_index * 6, GL_UNSIGNED_INT, QUAD_INDS )
			Case PRIMITIVE_TEXTURED_TRIANGLE
				EnableTex( texture_id )
				glDrawElements( GL_TRIANGLES, quad_index * 6, GL_UNSIGNED_INT, QUAD_INDS )
				DisableTex()
			Case PRIMITIVE_DOT
				glDrawArrays( GL_POINTS, 0, vert_index )
			Case PRIMITIVE_LINE
				glDrawArrays( GL_LINES, 0, vert_index )
			Case PRIMITIVE_TRIANGLE_FAN
				glDrawArrays( GL_TRIANGLE_FAN, 0, vert_index )
			Case PRIMITIVE_TRIANGLE_STRIP
				glDrawArrays( GL_TRIANGLE_STRIP, 0, vert_index )
			End Select
			
			activeProgram.DisableData()
			glUseProgram( 0 )
			activeProgram = Null
		End If

		vert_index = 0
		quad_index = 0

	End Method

	'NOTE: Unnecessary, for the time being.
'	Method UpdateBuffers()
'
'		If vert_buffer = 0 Then glGenBuffers( 1, Varptr vert_buffer )
'		If uv_buffer = 0 Then glGenBuffers( 1, Varptr uv_buffer )
'		If col_buffer = 0 Then glGenBuffers( 1, Varptr col_buffer )
'		If element_buffer = 0 Then glGenBuffers( 1, Varptr element_buffer )
'
'		glBindBuffer( GL_ARRAY_BUFFER, vert_buffer )
'		glBufferData( GL_ARRAY_BUFFER, vert_index * 12, vert_array, GL_DYNAMIC_DRAW )
'
'		glBindBuffer( GL_ARRAY_BUFFER, uv_buffer)
'		glBufferData( GL_ARRAY_BUFFER, vert_index * 8, uv_array, GL_DYNAMIC_DRAW )
'
'		glBindBuffer( GL_ARRAY_BUFFER, col_buffer )
'		glBufferData( GL_ARRAY_BUFFER, vert_index * 16, col_array, GL_DYNAMIC_DRAW )
'
'		glBindBuffer( GL_ELEMENT_ARRAY_BUFFER, element_buffer)
'		glBufferData( GL_ELEMENT_ARRAY_BUFFER, element_index * 12, element_array, GL_DYNAMIC_DRAW )
'
'	End Method

End Type

Rem
bbdoc: Get OpenGL Max2D Driver
about:
The returned driver can be used with #SetGraphicsDriver to enable OpenGL Max2D rendering.
End Rem
Function GL2Max2DDriver:TGLMax2DDriver()
	Print "GL2 (with shaders) Active"
	Global _done
	If Not _done
		_driver = New TGL2Max2DDriver.Create()
		_done = True
	EndIf
	Return _driver
End Function

Function GLMax2DDriver:TGLMax2DDriver()
	Print "GL (fixed function) Active"
	Global _done
	If Not _done
		_driver = New TGLMax2DDriver.Create()
		_done = True
	EndIf
	Return _driver
End Function

Local driver:TGLMax2DDriver = Null
If GLMAX2D_USE_LEGACY Then driver = GLMax2DDriver() Else driver = GL2Max2DDriver()
If driver SetGraphicsDriver driver

A simple test program.

Strict

Framework brl.GL2Max2D
Import brl.PNGLoader

Graphics( 1024, 768, 32 )

While Not KeyDown( KEY_ESCAPE )

	Cls()

	SetColor( 255, 255, 0 )
	DrawLine( 130, 180, 850, 570 )
	DrawCross( 100, 100, 5 )

	SetColor( 255, 0, 0 )
	DrawText( "Hello", 500, 300 )
	SetColor( 255, 255, 255 )
	DrawText( "Hello", 500, 315 )
	SetColor( 0, 0, 255 )
	DrawText( "Hello", 500, 330 )

	SetColor( 255, 0, 255 )
	DrawRect( 490, 350, 100, 80 )
	SetColor( 0, 255, 0 )
	DrawRect( 540, 390, 100, 80 )

	DrawOval( 200, 200, 100, 150 )
	DrawPoly( [750#, 400#, 850#, 400#, 860#, 500#, 760#, 500#] )

	Flip

Wend

Function DrawCross( x#, y#, size = 2 )
	Plot( x, y )
	For Local i = 1 To size
		Plot( x + i, y )
		Plot( x - i, y )
		Plot( x, y + i )
		Plot( x, y - i )
	Next
End Function

juankprada

(Posted 2014) [#33]

Just tested it in my own game. It works, but right now it is very, very slow. I'll look at the source and see where it can be optimized. Thanks, LT.

LT

(Posted 2014) [#34]

Hmm, DrawImage() is not using batches right now, so that could be changed easily enough by switching to PLAIN_TRIANGLE. However, it would have to keep track of the texture id and use that as a Flush() criterion. Currently, extra vertices are being sent to the card to provide color information. Another option is to remove that and send a Uniform color, but that also would have to be used as a Flush() criterion.
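
A rough sketch of the Flush() criterion described above, written in Python rather than BlitzMax so it can be run on its own. The function name `count_draw_calls` and the `(texture_id, color)` tuples are made up for illustration; the real driver would compare against its own state fields.

```python
def count_draw_calls(sprites, color_is_uniform=False):
    """Count flushes for a list of (texture_id, color) sprites drawn in order.

    A flush (one draw call) is needed whenever the batch key changes.
    With per-vertex colors only the texture is part of the key; with a
    uniform color the color has to be part of the key as well.
    """
    calls = 0
    current_key = None
    for texture_id, color in sprites:
        key = (texture_id, color) if color_is_uniform else (texture_id,)
        if key != current_key:
            calls += 1
            current_key = key
    return calls


sprites = [
    (1, "white"), (1, "red"), (1, "white"),  # same texture, color changes
    (2, "white"), (2, "white"),              # second texture
]
print(count_draw_calls(sprites))                         # per-vertex color
print(count_draw_calls(sprites, color_is_uniform=True))  # uniform color
```

The trade-off is visible in the two counts: a uniform color sends less data per vertex but can split a batch every time the color changes.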

Digesteroids runs quite fast on my fairly old computer (albeit with a pretty fast graphics card). How does it run for you?

Brucey

(Posted 2014) [#35]

Would it be better if there were different sets of arrays/shaders/etc. for the different kinds of things to draw, which could just be batched together as required and then, on the Flip, all pushed through together by running the different shader programs?

Perhaps some different Objects that collate everything during all the draws?

Although I suppose draw order would be a problem then... hmm.
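
To make the draw-order worry concrete, here is a small Python sketch (not BlitzMax, and not part of the module). It paints overlapping spans onto a made-up one-dimensional "framebuffer", once in submission order and once grouped by texture; `paint` and the command tuples are hypothetical.

```python
def paint(commands, width=8):
    """Paint (texture_id, start, end) spans onto a 1D 'framebuffer'.

    Later commands overwrite earlier ones, like opaque sprites.
    """
    framebuffer = [0] * width
    for texture_id, start, end in commands:
        for x in range(start, end):
            framebuffer[x] = texture_id
    return framebuffer


commands = [(1, 0, 4), (2, 2, 6), (1, 4, 8)]  # submission order

in_order = paint(commands)
# Grouping by texture (stable sort) draws both texture-1 spans first.
grouped = paint(sorted(commands, key=lambda c: c[0]))

print(in_order)
print(grouped)
```

The two results differ wherever sprites with different textures overlap, so collecting everything per shader or texture until Flip only gives the same picture when the draws do not overlap.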

Derron

(Posted 2014) [#36]

Maybe have a look at how cocos2d etc. do it with their new render pipelines... and how they order draw calls.

Tracking the id will be a first step in optimization... at least the code will be useable to test bmx ng on other platforms.

Bye
Ron

Brucey

(Posted 2014) [#37]

at least the code will be useable to test bmx ng on other platforms

That would depend on what platforms you are talking about.
For example, GL_UNPACK_ROW_LENGTH is not supported on OpenGLES 2, so one would need a different UploadTex() implementation to start with.
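
One possible workaround, sketched in Python rather than BlitzMax: copy the pixmap into a buffer with no padding between rows before uploading, so that no row-length setting is needed. `pack_rows_tightly` is a hypothetical helper, and `pitch` is assumed to be the number of bytes per row in memory, as with TPixmap.

```python
def pack_rows_tightly(pixels, width, height, pitch, bytes_per_pixel=4):
    """Copy a pitched pixel buffer into one with no padding between rows.

    pixels: bytes-like object holding `height` rows of `pitch` bytes each.
    Returns bytes of length width * height * bytes_per_pixel.
    """
    row_bytes = width * bytes_per_pixel
    if pitch < row_bytes:
        raise ValueError("pitch is smaller than one row of pixels")
    if pitch == row_bytes:
        return bytes(pixels[:row_bytes * height])  # already tight
    out = bytearray()
    for y in range(height):
        start = y * pitch
        out += pixels[start:start + row_bytes]
    return bytes(out)
```

For RGBA8888 data a tightly packed row is always a multiple of 4 bytes, so the default GL_UNPACK_ALIGNMENT of 4 should not get in the way. The cost is one extra copy per upload.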

LT

(Posted 2014) [#38]

I suppose draw order would be a problem then

Priority numbers (or layer) could be another criterion. The Flush needs to be separate from Flip so that it will play nicely with other renderers.
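
A minimal Python sketch of layer as a batching criterion, under the assumption that draws within one layer may be reordered freely. `build_batches` and the `(layer, texture_id, name)` tuples are hypothetical, not anything from the module.

```python
from itertools import groupby


def build_batches(commands):
    """Group (layer, texture_id, name) commands into draw batches.

    Commands are stably sorted by layer, then by texture within a layer,
    so every layer is drawn completely before the next one starts.
    """
    ordered = sorted(commands, key=lambda c: (c[0], c[1]))
    return [
        (key, [name for _, _, name in group])
        for key, group in groupby(ordered, key=lambda c: (c[0], c[1]))
    ]


commands = [
    (1, 7, "ship"), (0, 3, "sky"), (1, 8, "shot"),
    (1, 7, "rock"), (0, 3, "hill"),
]
for key, names in build_batches(commands):
    print(key, names)
```

Five draws collapse into three batches here, and the background layer still ends up underneath the sprites.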

Making sure DrawImage uses batches will probably help. How often does anyone use plain colored rectangles and polygons in conjunction with textured sprites? Not often, I'll bet. My main concern is how this will run on mobile devices, for which I've heard that draw calls are rather costly.

It would be nice to know how Digesteroids performs on various devices.

Derron

(Posted 2014) [#39]

See above about unsupported commands in EGL. You might need to replace GL_UNPACK_ROW_LENGTH with another approach (there are many answers for this on Stack Overflow).

Bye
Ron

LT

(Posted 2014) [#40]

so one would need a different UploadTex() implementation

I've yet to find a comprehensive resource that tells me exactly what is and what is not available for specific GL versions. I left that function as is, since I have no way to test alternatives.

Brucey

(Posted 2014) [#41]

Fairly comprehensive :

https://github.com/bmx-ng/pub.mod/blob/master/opengles.mod/extern.bmx

:-)

LT

(Posted 2014) [#42]

Well, that reference is handy, thanks!

Even after reading about GL_UNPACK_ROW_LENGTH (unavailable) and GL_UNPACK_ALIGNMENT (available), their purpose is still sort of unclear to me. I found that using one or the other, or even removing them altogether, made no difference. :/

Derron

(Posted 2014) [#43]

Think as soon as you have to use "glPixelStoreX" you will need that function.

Eg. for getting portions of a texture ("subtexturing")
http://stackoverflow.com/questions/205522/opengl-subtexturing

Another one:
http://stackoverflow.com/questions/9483945/looking-for-alternative-to-gltexsubimage2d-with-data-offset-support

So this might be used for sprite atlases or so?

bye
Ron

LT

(Posted 2014) [#44]

No, it's not for sprite atlases - it's more low-level than that. It affects functions like glReadPixels, but how is not entirely clear to me. Using one, the other, or neither makes no difference on my machine. My guess is that it has an effect on performance, though.

It occurs to me that the texture load functionality on mobile platforms should probably just use SDL's texture functions.

EDIT: The textures are already powers of two, which might have something to do with why it SEEMS to work regardless.

zzz

(Posted 2014) [#45]

Looks like it's for defining row sizes that differ from the image size, i.e. the same as the pixmap pitch. So pow2 textures probably happened to match whatever increments the pixmap uses for row length.
EDIT: Row length would be the offset (in pixels) from the first pixel in one row to the first pixel in the next row, and alignment would be the byte boundary (1, 2, 4 or 8) that the first pixel of the next row is aligned to.
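
The two settings can be put into numbers. This is a Python sketch of how I understand the pixel-store rules for 8-bit components, not code from the module; `unpack_row_stride` is a hypothetical name.

```python
def unpack_row_stride(width, bytes_per_pixel, row_length=0, alignment=4):
    """Bytes from the start of one source row to the start of the next.

    row_length (in pixels) replaces width when it is non-zero, and the
    row size is rounded up to a multiple of alignment (1, 2, 4 or 8).
    Only valid for formats made of byte-sized components.
    """
    if alignment not in (1, 2, 4, 8):
        raise ValueError("alignment must be 1, 2, 4 or 8")
    pixels_per_row = row_length if row_length > 0 else width
    row_bytes = pixels_per_row * bytes_per_pixel
    return (row_bytes + alignment - 1) // alignment * alignment


print(unpack_row_stride(64, 4))                 # RGBA, power-of-two width
print(unpack_row_stride(3, 3))                  # RGB, 3 pixels wide, padded
print(unpack_row_stride(64, 4, row_length=80))  # sub-rectangle of a wider image
```

For RGBA data the row size is always a multiple of 4, and if the pixmap pitch equals width * 4 the row length setting changes nothing, which would explain why every combination appeared to work.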

LT

(Posted 2014) [#46]

This might work as a replacement for UploadTex and GL_UNPACK_ROW_LENGTH.

Function UploadTex( pixmap:TPixmap, flags )

	Local mip_level
	Repeat
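		?linuxarm
		' GLES 2 has no GL_UNPACK_ROW_LENGTH, so allocate the level and upload it row by row
		glTexImage2D( GL_TEXTURE_2D, mip_level, GL_RGBA, pixmap.width, pixmap.height, 0, GL_RGBA, GL_UNSIGNED_BYTE, Null )
		For Local y = 0 Until pixmap.height
			' step by pitch (bytes per row in memory), which may be larger than width * 4
			Local row:Byte Ptr = pixmap.pixels + y * pixmap.pitch
			glTexSubImage2D( GL_TEXTURE_2D, mip_level, 0, y, pixmap.width, 1, GL_RGBA, GL_UNSIGNED_BYTE, row )
		Next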
		?Not linuxarm
		glPixelStorei( GL_UNPACK_ROW_LENGTH, pixmap.pitch / BytesPerPixel[pixmap.format] )
		glTexSubImage2D( GL_TEXTURE_2D, mip_level, 0, 0, pixmap.width, pixmap.height, GL_RGBA, GL_UNSIGNED_BYTE, pixmap.pixels )
		?

		If Not ( flags & MIPMAPPEDIMAGE ) Exit
		If pixmap.width > 1 And pixmap.height > 1
			pixmap = ResizePixmap( pixmap, pixmap.width / 2, pixmap.height / 2 )
		Else If pixmap.width > 1
			pixmap = ResizePixmap( pixmap, pixmap.width / 2, pixmap.height )
		Else If pixmap.height > 1
			pixmap = ResizePixmap( pixmap, pixmap.width, pixmap.height / 2 )
		Else
			Exit
		EndIf
		mip_level :+ 1
	Forever
	?Not linuxarm
	glPixelStorei( GL_UNPACK_ROW_LENGTH, 0 )
	?

End Function

EDIT: The version with UNPACK_ROW_LENGTH should only be used with higher GL versions. Probably should remove it from this module.

zzz

(Posted 2014) [#47]

There are a few ways to (theoretically) improve rendering speed for quads. Either using index resets to batch draw quads using triangle strips, which should result in faster drawing on the GPU, or using a premade elements array to reduce data transfer. I haven't tried either of them, so I have no idea if it's actually worth implementing or not.

EDIT: Or just utilize both... Interesting enough to test it out I guess :) (Apparently ES won't do restart indices, but using glDrawElements gave about 15% increased performance for quad rendering on my system)
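
Some back-of-the-envelope arithmetic on the data-transfer side, as a Python sketch. It assumes the attribute layout used by the driver earlier in the thread (2 position, 2 uv and 4 color floats per vertex); `vertex_bytes` and `quad_bytes` are hypothetical names.

```python
FLOAT_BYTES = 4


def vertex_bytes(pos_floats=2, uv_floats=2, color_floats=4):
    """Size of one vertex across all attribute arrays."""
    return (pos_floats + uv_floats + color_floats) * FLOAT_BYTES


def quad_bytes(indexed, index_bytes=4, indices_resent=True):
    """Bytes of client data describing one quad drawn as two triangles."""
    if not indexed:
        return 6 * vertex_bytes()      # two triangles, corners duplicated
    total = 4 * vertex_bytes()         # four unique corners
    if indices_resent:
        total += 6 * index_bytes       # six indices into those corners
    return total


print(quad_bytes(indexed=False))
print(quad_bytes(indexed=True))
print(quad_bytes(indexed=True, index_bytes=2))
```

Whether the smaller payload shows up as a speed-up will depend on the driver and on where the bottleneck actually is, so this only shows that the element array cannot make the payload larger.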

LT

(Posted 2014) [#48]

using index resets to batch draw quads using triangle strips

That's what it is doing now; results in a draw call per image.

GaryV

(Posted 2014) [#49]

I look forward to seeing what you turn out. I am sure it will be a great addition.

zzz

(Posted 2014) [#50]

What I meant was using glPrimitiveRestartIndex along with whatever else it requires to set it up. It would basically allow you to draw a lot of triangle strip quads with a single glDrawElements call. The ES version Brucey wants to support won't do it though.

I put the test code for the second suggestion in the wrong mod, but it's straightforward enough. I made a completely separate set of arrays, but besides the elements array it's probably better to just reuse the ones the batcher in gl2max2d already uses.


	Global quad_vert:Float[ BUFFER_SIZE * 4 * 2 ]
	Global quad_uv:Float[ BUFFER_SIZE * 4 * 2 ]
	Global quad_col:Float[ BUFFER_SIZE * 4 * 4 ]
	Global quad_element:Int[ BUFFER_SIZE * 6 ]
	
	Global quad_index:Int = 0
	Global elem_index:Int = 0
	
	Function InitQuadArrays()
	
		' link up vertices forming separate quads. 
		
		For Local i:Int = 0 Until ( quad_element.length / 6 )
		
			quad_element[ i * 6 + 0 ] = i * 4 + 0
			quad_element[ i * 6 + 1 ] = i * 4 + 1
			quad_element[ i * 6 + 2 ] = i * 4 + 2
			
			quad_element[ i * 6 + 3 ] = i * 4 + 2
			quad_element[ i * 6 + 4 ] = i * 4 + 3
			quad_element[ i * 6 + 5 ] = i * 4 + 0
		
		Next
		
		Return
	
	End Function
	
	Function AddQuad( x1:Float, y1:Float, x2:Float, y2:Float, x3:Float, y3:Float, x4:Float, y4:Float )
	
		quad_vert[ quad_index * 2 + 0 ] = x1
		quad_vert[ quad_index * 2 + 1 ] = y1
		
		quad_col[ quad_index * 4 + 0 ] = rgba[ 0 ]
		quad_col[ quad_index * 4 + 1 ] = rgba[ 1 ]
		quad_col[ quad_index * 4 + 2 ] = rgba[ 2 ]
		quad_col[ quad_index * 4 + 3 ] = rgba[ 3 ]
		
		quad_index :+ 1
		
		quad_vert[ quad_index * 2 + 0 ] = x2
		quad_vert[ quad_index * 2 + 1 ] = y2
		
		quad_col[ quad_index * 4 + 0 ] = rgba[ 0 ]
		quad_col[ quad_index * 4 + 1 ] = rgba[ 1 ]
		quad_col[ quad_index * 4 + 2 ] = rgba[ 2 ]
		quad_col[ quad_index * 4 + 3 ] = rgba[ 3 ]
		
		quad_index :+ 1
		
		quad_vert[ quad_index * 2 + 0 ] = x3
		quad_vert[ quad_index * 2 + 1 ] = y3
		
		quad_col[ quad_index * 4 + 0 ] = rgba[ 0 ]
		quad_col[ quad_index * 4 + 1 ] = rgba[ 1 ]
		quad_col[ quad_index * 4 + 2 ] = rgba[ 2 ]
		quad_col[ quad_index * 4 + 3 ] = rgba[ 3 ]
		
		quad_index :+ 1
		
		quad_vert[ quad_index * 2 + 0 ] = x4
		quad_vert[ quad_index * 2 + 1 ] = y4
		
		quad_col[ quad_index * 4 + 0 ] = rgba[ 0 ]
		quad_col[ quad_index * 4 + 1 ] = rgba[ 1 ]
		quad_col[ quad_index * 4 + 2 ] = rgba[ 2 ]
		quad_col[ quad_index * 4 + 3 ] = rgba[ 3 ]
		
		quad_index :+ 1
		elem_index :+ 6
		
		Return
	
	End Function


			Case PRIMITIVE_PACKED_QUAD
				glDrawElements( GL_TRIANGLES, elem_index, GL_UNSIGNED_INT, quad_element )

Derron

(Posted 2014) [#51]

It should result in one draw call per texture used.

Couldn't "triangle strips" get used to send multiple "rectangular" shapes in one call? Two triangles form a rectangle; to move to the next one, you use a "zero area" triangle (some kind of "line") to get to the next two-triangle rectangle, and so on.

As long as they share the same texture, this should avoid calling the whole thing multiple times.

Hmm, as I have no clue about OGL, I assume that something like "DrawArray(triangles)" exists and is already somehow optimized.

Also, it might not be as fast as possible because you set the used shader to "0", which according to
https://github.com/mattdesl/lwjgl-basics/wiki/ShaderProgram-Utility

should not be needed in current implementations... but I do not worry as long as it just "works". Improvements can be made later on.

bye
ron

zzz

(Posted 2014) [#52]

Well yes, but I think we were both on the track of reducing data transfer to the GPU. Since the quads will be disjoint whether we use strips or not, I don't think there will be any difference in rendering speed compared to the example I posted.

If I understand strips correctly, it would take an additional two vertices to move to the next quad, which isn't really desirable, even if the GPU probably wouldn't bother rendering that part at all.

EDIT: Well, one vertex, but that's still 5 instead of 4 vertices per quad, and using triangle strips won't be faster than plain triangles. (Would it be possible to have 3-vertex quads, and have some clever shader or vertex code figure out where the fourth one should be?)

LT	(Posted 2014) [#53]

Couldn't "triangle strips" be used to send multiple "rectangular" shapes in one call?

Without a glPrimitiveRestartIndex call, no.

I don't see how extra vertices would help. The point of a strip is a continuous set of triangles. You can't have two separate quads using strips (without an index reset).

In any case, sending them as PLAIN_TRIANGLE will work, but it will require sending more vertices. Using glDrawElements can reduce that a bit; it's what I use in my own engine. Also, tracking changes like color and alpha and simply passing those values instead of passing vertex colors should make it faster.
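
To make the vertex saving concrete, here is a minimal sketch (in Python, purely for illustration; the module itself is BlitzMax) of how an index list lets each quad reuse its four corners when drawn as two triangles. The function name and the 0-1-2 / 2-3-0 winding are assumptions, not taken from the posted module.

```python
def quad_triangle_indices(quad_count):
    """Return an index list drawing each 4-vertex quad as two triangles."""
    indices = []
    for q in range(quad_count):
        base = q * 4  # first vertex of this quad in the vertex array
        # Two triangles sharing the 0-2 diagonal (assumed winding).
        indices += [base, base + 1, base + 2, base + 2, base + 3, base]
    return indices


quads = 3
indices = quad_triangle_indices(quads)
unique_vertices = len(set(indices))  # vertices actually stored
plain_vertices = quads * 6           # vertices needed without an index list
print(unique_vertices, plain_vertices, len(indices))
```

For three quads this stores 12 vertices instead of 18, at the cost of 18 small indices.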

Derron

(Posted 2014) [#54]

The point of a strip is a continuous set of triangles.

Might be the case... but it can and is used to have multiple "quads" (two triangles of course) in one call -- they then are connected via "zero area" triangles. They call it "degenerate triangle".
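
A rough sketch of the degenerate-triangle idea, in Python for illustration only. It assumes each quad's four vertices are stored in strip (zigzag) order; the helper names are made up for this example.

```python
def strip_indices(quad_count):
    """Join 4-vertex quads into one triangle strip using degenerate triangles."""
    indices = []
    for q in range(quad_count):
        base = q * 4
        if q > 0:
            # Repeat the previous quad's last index and this quad's first index.
            # The triangles formed across the gap have zero area.
            indices += [indices[-1], base]
        indices += [base, base + 1, base + 2, base + 3]
    return indices


def real_triangles(indices):
    """Triangles in the strip that have three distinct corners."""
    tris = [indices[i:i + 3] for i in range(len(indices) - 2)]
    return [t for t in tris if len(set(t)) == 3]


print(strip_indices(2))
print(len(real_triangles(strip_indices(2))))
```

Two quads need 10 indices as a strip against 12 as plain indexed triangles, so the saving is small, and the repeated indices keep the winding of every quad the same.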

Nonetheless I do not want to disturb your tinker time .. sorry for the noob posting some rubbish here.

bye
Ron

Brucey

(Posted 2014) [#55]

Perhaps we can introduce something like LoadAtlasImage(), where we already have a similar LoadAnimImage().
It could take an array of coords for each sub image, and you draw with it in the usual way, passing in 'frame' for the particular sub image you want to draw?
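
As a sketch of what such a coordinate array could turn into, the Python below (illustration only) converts per-frame pixel rectangles into texture coordinates that a draw call could look up by 'frame'. LoadAtlasImage() does not exist yet, and the rectangle layout (x, y, w, h) and function name here are assumptions.

```python
def frame_uvs(rect, atlas_w, atlas_h):
    """Convert a pixel rectangle (x, y, w, h) into (u0, v0, u1, v1)."""
    x, y, w, h = rect
    return (x / atlas_w, y / atlas_h, (x + w) / atlas_w, (y + h) / atlas_h)


# Hypothetical atlas: 256x128 pixels holding three sub-images.
frames = [(0, 0, 64, 64), (64, 0, 64, 64), (128, 0, 128, 128)]
uv_table = [frame_uvs(r, 256, 128) for r in frames]

# Drawing "frame 1" would use these texture coordinates for its quad.
print(uv_table[1])
```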

Freetype-gl has a nice shader-based atlas implementation that we could maybe borrow from ?

Although I'm not sure how you apply origin, translation, scale and rotation via the modelview matrix... (which I assume is the place you are meant to apply such things?)

Derron

(Posted 2014) [#56]

Keeping things working as they do in vanilla BlitzMax also means not introducing new commands.

The engine "itself" should recognize if you kindly ask to draw from the same texture multiple times in a row.

Others plan to do batching this way:
http://www.cocos2d-x.org/wiki/Cocos2d_v30_renderer_pipeline_roadmap
(pay attention to the "reference"-links)

Seems they also just use "IDs" to decide whether this starts something new or could be done with the "previous" command. Of course "id" is a mixture of blendmodes, textureIDs etc. so they call it "key".
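
A minimal sketch of the "key" idea, in Python for illustration. The key here is just (texture id, blend mode); a real key would include every state that forces a flush. Names are hypothetical and not taken from cocos2d-x or the module.

```python
def count_draw_calls(commands):
    """Count batches when consecutive commands with equal keys are merged."""
    calls = 0
    current_key = None
    for texture_id, blend_mode in commands:
        key = (texture_id, blend_mode)
        if key != current_key:
            calls += 1  # state changed: flush and start a new batch
            current_key = key
    return calls


sorted_by_texture = [(1, "alpha")] * 3 + [(2, "alpha")] * 3
interleaved = [(1, "alpha"), (2, "alpha")] * 3
print(count_draw_calls(sorted_by_texture), count_draw_calls(interleaved))
```

The same six sprites cost two draw calls when grouped by texture and six when the textures alternate, which is why the drawing order matters so much.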

bye
Ron

LT	(Posted 2014) [#57]

Well, yes, using some kind of key will make sense in future versions. Also, it is possible to use a single array to store all of the vertex data for all of the primitives and pass offsets into the draw function. I'm not sure that would be faster, though.

Brucey

(Posted 2014) [#58]

There are no GL_QUADS in ES, and I think one probably shouldn't be using glBegin and glEnd either - they definitely don't exist in ES.

Brucey

(Posted 2014) [#59]

I made a completely separate set of arrays

Arrays are very inefficient - in comparison to accessing, for example, a Float Ptr by index.

Brucey

(Posted 2014) [#60]

@ render pipeline.

Essentially you need to throw away the old GL 1.2 code (all that stuff that uses Begin and End), and replace it with something else entirely (Shaders? I assume that's the best way to do everything?)

LT	(Posted 2014) [#61]

Arrays are very inefficient - in comparison to accessing, for example, a Float Ptr by index.

You pass a pointer into DrawElements, also. Something related to UpdateBuffers was causing my (Windows) engine to grind to a halt. I replaced the DrawArrays functions with DrawElements and now it plays much nicer. I've been using a pre-built index array for doing geometry picks for some time.

NOTE: I don't know what performance will be like on mobile devices, but once the SDL context and compile for Android options are available, I'll be happy to test on my Nexus 7.

LT	(Posted 2014) [#62]

New version is now available - UPDATED SOURCE ABOVE. Changed DrawArrays to DrawElements and implemented batching for textured tris. Blend and texture states are now saved and used as batching criteria. Also changed the texture upload function so that it doesn't use GL_UNPACK_ROW_LENGTH.

Derron

(Posted 2014) [#63]

@batching criteria

In a later stage I think the "target" is another criterion (render to texture).

@updated source
maybe add a "edit DATE"-line in that post.

bye
Ron

Brucey

(Posted 2014) [#64]

New version is now available

Cool. We're making some progress :-)

Now that I've got input (keyboard and mouse) working on the Pi (seems to be useful if you want to interact with it!), I can quit the app without having to do a reboot...

So, what we have rendering so far is the line, the oval and the polygon. No cross, text or rectangles.
(I'd do a screenshot but we are rendering from the console, rather than X11, so no access to the usual grabbers).

Any ideas re missing things? :o)

LT	(Posted 2014) [#65]

No idea. The missing rectangle is especially puzzling considering that it uses the exact same rendering method as the oval. I wonder if the rectangle would draw instead if you reversed the order...

EDIT: The only difference I can see with the rectangle is that the floats are defined inside of an array using # instead of .0 ...

Derron

(Posted 2014) [#66]

Will check tomorrow how RasPI-emulation works atm (virtualbox without arm, and QEMU with arm emulation) ... maybe it works even with another GPU getting emulated (works if same errors as brucey happen).

bye
Ron

Brucey

(Posted 2014) [#67]

The missing rectangle is especially puzzling considering that it uses the exact same rendering method as the oval

No, DrawPoly and DrawOval are working.
Plot, DrawRect and DrawImage are not rendering anything. Those DrawXXX functions appear to use glDrawElements(). Dunno if that has anything to do with it. I've tried re-ordering the indices, to no avail - although it still draws everything in OS X.

Still, it's nice to see *something*.

I also have been testing zzz's version of the module, which he sent me, and it renders everything as expected on the desktop, but I couldn't get anything to work at all on the Pi.

The hardest part is just getting everything set up *right*. Once we've done that, it's pretty much plain sailing ;-)

Brucey

(Posted 2014) [#68]

The other (minor) issue is that of the screen resolution. The Pi will *only* open a single, boot specified screen size for rendering. In my case, it's a standard 1920x1080 screen.
Whatever you ask via "Graphics x, y" it will only open that sized screen.
I suppose what you want is some kind of automatic scaling so that it *looks* like you are getting what you are asking for?

btw, I had to change SetResolution() to the following, because GL_PROJECTION is not available :

	Method SetResolution( width#, height# )

		u_pmatrix.SetOrthographic( 0, width, 0, height, -1, 1 )

	End Method

whereas yours was :

	Method SetResolution( width#, height# )

		glMatrixMode( GL_PROJECTION )
		glLoadIdentity()
		glOrtho( 0, width, height, 0, -1, 1 )
		glMatrixMode( GL_MODELVIEW )

	End Method
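
For reference, a small Python sketch (illustration only) of the matrix that glOrtho builds, which is what a shader-based replacement has to compute itself. Whether u_pmatrix.SetOrthographic takes its arguments in the same order as glOrtho is not shown in the thread, so the comparison of the two argument orders below is an assumption worth checking against the module.

```python
def ortho(left, right, bottom, top, near, far):
    """Row-major 4x4 orthographic projection, same maths as glOrtho."""
    return [
        [2.0 / (right - left), 0.0, 0.0, -(right + left) / (right - left)],
        [0.0, 2.0 / (top - bottom), 0.0, -(top + bottom) / (top - bottom)],
        [0.0, 0.0, -2.0 / (far - near), -(far + near) / (far - near)],
        [0.0, 0.0, 0.0, 1.0],
    ]


def project(m, x, y):
    """Apply the matrix to the point (x, y, 0, 1); return clip-space x, y."""
    return (m[0][0] * x + m[0][3], m[1][1] * y + m[1][3])


top_left_origin = ortho(0, 640, 480, 0, -1, 1)     # glOrtho( 0, width, height, 0, -1, 1 )
bottom_left_origin = ortho(0, 640, 0, 480, -1, 1)  # bottom and top swapped

print(project(top_left_origin, 0, 0))
print(project(bottom_left_origin, 0, 0))
```

With bottom = height and top = 0, pixel (0, 0) lands in the top-left corner of clip space, which is what Max2D expects. Swapping the two puts it in the bottom-left corner and flips the picture vertically.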

LT	(Posted 2014) [#69]

Hmm, maybe the Pi doesn't like glDrawElements, but the setup is pretty standard and I'd expect it to work. :/

LT	(Posted 2014) [#70]

I had to change SetResolution() to the following

I think I just commented it out temporarily when I was testing. 'Meant to put it back.

NOTE: The old Type is still there with its own SetResolution method.

Brucey

(Posted 2014) [#71]

Fixed DrawRect and DrawImage :o)

Came across this post.
So I changed the QUAD_INDS and friends to use Short instead of Int, and text and rects are rendering now. Yay!

I dropped BATCHSIZE to 32767 - that's the max size for a Short isn't it?
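
Some background on the limit: core OpenGL ES 2.0 only accepts GL_UNSIGNED_BYTE and GL_UNSIGNED_SHORT indices in glDrawElements (GL_UNSIGNED_INT needs the OES_element_index_uint extension), and a BlitzMax Short is an unsigned 16-bit value, so indices can go up to 65535 rather than 32767. The Python sketch below (illustration only) works out how many quads fit under that limit. Whether BATCHSIZE in the module counts quads or vertices is not shown here, so treat the numbers as a guide.

```python
MAX_USHORT_INDEX = 2 ** 16 - 1  # largest value a 16-bit unsigned index can hold


def max_quads(verts_per_quad=4, max_index=MAX_USHORT_INDEX):
    """How many whole quads fit before a vertex index would overflow."""
    return (max_index + 1) // verts_per_quad


def fits_in_ushort(quad_count, verts_per_quad=4):
    """True if the highest vertex index used is still representable."""
    return quad_count * verts_per_quad - 1 <= MAX_USHORT_INDEX


print(max_quads(), fits_in_ushort(16384), fits_in_ushort(16385))
```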

Now we just need to get Plot to... plot. Still no little cross.

LT	(Posted 2014) [#72]

I've tried re-ordering the indices

A quick way to be sure is to use glDisable( GL_CULL_FACE )...in case you haven't done that. I believe it is off, by default, but who knows on the Pi.

LT	(Posted 2014) [#73]

Fixed DrawRect and DrawImage

Sweeeet. :) I'm very curious to see if Digesteroids will be fast enough to be playable.

Derron

(Posted 2014) [#74]

@ missing plot

Maybe "gl_PointSize" needs to get defined in the shader?

bye
Ron

Brucey

(Posted 2014) [#75]

I'm very curious to see if Digesteroids will be fast enough to be playable.

Heh... more or less!
It's all working, anyway. Even the sound - albeit a bit crackly :-)

Framerate... not sure, a wee bit slower than full speed in-game.
The Instructions screen is very laggy - on account of DrawText being *very* (very!) inefficient. This is where my new FreeType-GL module will come in useful, as text is rendered through its own custom shader.

So, some work to make things more efficient, and we'll be there I think.

All-in-all, awesome!

LT	(Posted 2014) [#76]

on account of DrawText being *very* (very!) inefficient

Oh yeah, I noticed that each character uses its own texture id, which makes the batching useless. :/

Good news, in any case!

Brucey

(Posted 2014) [#77]

Texture Atlas is the way to go. Should make everything very zippy.

LT	(Posted 2014) [#78]

It can help DrawText, but anything else will need new commands.

Brucey

(Posted 2014) [#79]

anything else will need new commands.

Since DrawImage is the one that's going to do 99% of the work, that's where I thought we may be able to do something without changing too much?

As I mentioned before, perhaps we could simply add a new Load command, leaving the rest as is, and then you use DrawImage with the frame parameter - as you do with an anim image. Except in our case it would be rendering from a proper atlas.
Perhaps it's more complicated than I think it sounds like it should be? *shrug*

But if we need to add something to make things work *better*, then we need to add something. No big deal.
If everything else works the way it did before, then no-one loses out?

LT	(Posted 2014) [#80]

It's possible, but no frame number should be required in DrawImage (for compatibility). The image should know which atlas it belongs to and render accordingly.

In any case, I'm not sure you're going to see much improvement in "Rasperoids" even with the atlas. It's already batching textured quads - I'd be surprised if the frame rate improves by more than 10% or so.

Derron

(Posted 2014) [#81]

@ Drawing Points

Did you try to add the "gl_PointSize"-definition in the shader? Seems to be needed somehow. According to some stackoverflow postings

http://stackoverflow.com/questions/24055683/drawing-gl-points

http://stackoverflow.com/questions/24715097/gles-2-0-draws-gl-points-as-squares-on-android

This seems to be needed by "default" for EGL. And then you need GL_POINT_SMOOTH on OGL to get it "rounded" again.

Sorry if you already tried that and it was useless.

@Raspberry
I tried "qemu" to emulate the raspi - compilation is working but the emulation misses the GPU of the raspberry and therefore I cannot execute things using OpenGL (/dev/vchiq missing).
A pity.

bye
Ron

GaryV

(Posted 2014) [#82]

Josh: Any progress to report?

Brucey

(Posted 2014) [#83]

Did you try to add the "gl_PointSize"-definition in the shader?

That's sorted it thanks :-)
Adding "gl_PointSize = 1.0;" renders the cross now.

Derron

(Posted 2014) [#84]

Cool thing ... did you try "atlasing" things?

(I mean drawing portions of a texture ?)

Another thing to test (albeit I think they will work) are the blend modes (lightblend, shadowblend ...).

When drawing from atlas, check if rotation/scaling does odd things.

bye
Ron

zzz	(Posted 2014) [#85]

It should be perfectly possible to have an automated atlas system that just puts new images into sheets (or whatever you'd call them) as they are loaded. The problem with that, though, is that performance might be unreliable for the developer.

Consider doing whatever project, where your two most used images sit nicely on the same sheet. If you then add something else in between, this might push one of these images onto another sheet, which could (depending on how much you actually draw) be quite noticeable, since you'd be back to constantly flushing because of changing texture ids.

It could probably be solved by tracking usage of each texture, but it would either have to cause random hiccups as atlases are rearranged on the fly, or require additional commands that give the user more control.

Derron

(Posted 2014) [#86]

I wouldn't manipulate textures at all.

IF someone wants an engine to do the batching, they should use a custom "TBatchImage" which auto-reorganizes.

Another option is grouping ... grouped together items could get automatized by the engine (eg TBatchGroup.Add(image1) and so on).

The author/coder knows best in which order he draws his sprites, and how these sprites are arranged on textures (I for example have all my figure sprites on one texture, all GUI elements on another etc. - sometimes not perfect, but this reduced texture switches somewhat).

Bringing in "automatism" or "intelligence" needs one of two things:
- simplifying things (only batch if things happen to reuse the same resources)
- needs help of the coder (BatchGroup-Manager etc.)

The first thing is done using the "key"-approach (all things being disjoint to each other create something unique which can get used as reference whether a new batch-group will start).
After this step you will have multiple objects ordered by this key - within the elements of the same key you will have to group by "manual assigned batch groups".
Of course you could do it vice versa: first sort by "groups" and within that groups sort by key, but I think this adds more overhead - while it allows more influence of the coder.

bye
Ron

zzz	(Posted 2014) [#87]

Yeah, but all that requires additional commands. I was pondering if it would be worthwhile to put in some texture batching behind the scenes, but it could result in some unpredictable performance gains. (Which would feel more like random performance loss really)

@Brucey
Regarding the resolution issue. Do you have fbset or something similar available on your pi? (have no idea about what you are running on it, or much else regarding those things really :) )

Doing the scaling "manually" in the driver is probably a bad idea, since the Pi seems to have a pretty poor fillrate. Especially if it will require filling a full-HD-sized buffer.

Derron

(Posted 2014) [#88]

@additional commands
as I said, first do the simple things (auto texture batching)

once everything has settled down, one could extend the render pipeline with new commands (this would then be needed for all renderers - dunno what happens to DX, or whether it gets replaced with "angle" then).

fbset should be available on the raspi.

bye
Ron

Brucey

(Posted 2014) [#89]

Regarding the resolution issue. Do you have fbset or something similar available on your pi?

There's a binary called "tvservice", which you can use to query/change the video mode of the gpu. If you run this with the appropriate arguments, you can change modes to something suitable for your game.
Interestingly, the app is open source, so it's not impossible to imagine re-using the Broadcom APIs that it calls in your own code so that tvservice is no longer required to do the switching.
Although I believe fbset is still required afterwards to tell the console that the screen resolution has changed.

Brucey

(Posted 2014) [#90]

I've been thinking about all the drawing stuff in Max2D, and how it does all the rotation/scaling/translating.

Wouldn't it be more efficient for the GPU to do this through matrices as raw data is passed to it?
I've no idea really how shaders work though, so I don't know if, when you push some vectors, you can also push the current rotate/scale/transform values too?

:o)

juankprada

(Posted 2014) [#91]

As far as I know there are a couple of ways to do that. One would be matrix manipulation of the model, but I think (I am very ignorant here) batch rendering would then depend on the model matrix too, otherwise all objects in the batch would be rotated/scaled/translated the same. Another would be to compute the vertex positions from rotation and scaling before they are sent to the GPU (I use the second approach).

Derron

(Posted 2014) [#92]

A short "research" on this topic said manipulating the vertex positions is way slower than manipulating the matrix of the batch renderer.

But I think all of them are faster if they use some kind of "batching" (and the app utilizes it ).

bye
Ron

zzz	(Posted 2014) [#93]

I agree with juankprada on this one. It would introduce another flush state (or a ton more of data to move) which would probably break the auto-batcher more or less. At least in every scenario I can think of where I would benefit from the batching as it is now.

The shaders can be supplied data either on a per-primitive (i.e. per-vertex) or a per-drawcall basis. Even though the four verts in a quad use the same transform values, they would have to be sent to the GPU once for every vertex. Do it on a per-drawcall basis instead and you must flush the batcher every time the transform values change.

Think of for example a particle system using quads, in the current mod code it will most likely be rendered with a single drawcall, but if you use individual rotation or scaling etc (which would still use one drawcall in current code) the worst case scenario would become one drawcall per particle. Which would mean we are back at where we started performance-wise.

LT	(Posted 2014) [#94]

Passing a matrix per quad is pretty expensive. Particle systems avoid it, if they can. In a 2d system, sending three values like position x and y and rotation r is not so bad. Add a fourth for distance, if you like.

Like zzz said, it has to be done for every vertex, which is what makes it so inefficient. This gets crazy fast with geometry shaders, alas...

Derron

(Posted 2014) [#95]

When having a look how other frameworks handle it:
https://github.com/libgdx/libgdx/blob/master/gdx/src/com/badlogic/gdx/graphics/g2d/SpriteBatch.java

(they also have a CpuSpriteBatch.java)

It seems they just use CPU-Matrix for transformations.

According to some forums, DX10 GPUs might have problems with geometry shaders (to the point of making them slower than the "old approach").

As the common case is NOT a particle system, I would prefer some kind of manual "batchSprite" object and options.
The default usage will batch as many things until a state change (then flushing and waiting for the next thing) but the advanced usage might then be to call a custom command ("EnableGPUTransformationMode(true)") - so it is up to the user if he wants to use a mode which might be slower for many texture bindings/draw calls.

bye
Ron

juankprada

(Posted 2014) [#96]

LibGDX doesn't use a matrix to rotate/scale/translate sprites. If you take a look at line 223 you will see the method that actually adds a texture and vertices to the batch. You will notice there that rotation is done per vertex and the vertices are sent to the GPU already "rotated" (line 270), "scaled" (line 243) and "translated" (lines 235 and 236), but without matrix manipulation. The model matrix is always the identity matrix. I know that because I replicated their SpriteBatch in Java with a different OpenGL wrapper (JOGL instead of LWJGL).
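
The per-vertex approach described here can be sketched as follows, in Python for illustration. This is not LibGDX's code; the function name, parameter names and the order of operations (offset by handle, scale, rotate, translate) are assumptions chosen to resemble how Max2D-style drawing usually behaves.

```python
import math


def transform_quad(x, y, w, h, angle_deg, scale_x, scale_y, handle_x=0.0, handle_y=0.0):
    """Return the four corners of a quad after scale, rotation and translation."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    corners = []
    for lx, ly in ((0, 0), (w, 0), (w, h), (0, h)):
        # Local position relative to the handle, then scaled.
        px = (lx - handle_x) * scale_x
        py = (ly - handle_y) * scale_y
        # Rotate, then translate to the draw position.
        corners.append((x + px * c - py * s, y + px * s + py * c))
    return corners


# A 2x1 quad drawn at (10, 20), unrotated and then rotated by 90 degrees.
print(transform_quad(10, 20, 2, 1, 0, 1, 1))
print(transform_quad(10, 20, 2, 1, 90, 1, 1))
```

Because the corners are already in their final positions, quads with different rotations and scales can share one batch and one identity model matrix.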

Derron

(Posted 2014) [#97]

https://github.com/libgdx/libgdx/blob/master/gdx/src/com/badlogic/gdx/graphics/g2d/CpuSpriteBatch.java

Isn't CpuSpriteBatch working with a matrix?

I did not check which variant is used when.

Bye
Ron

Brucey

(Posted 2014) [#98]

This is why it's better for people who know how stuff works to come up with ideas :-)

So, as it stands, are we happy with the way the shader-based module is currently working?

Is there anything obvious, that if it were to be implemented, would provide an order-of-magnitude improvement to renderings?

I'm personally happy with the state of things, as it's rendering stuff correctly on previously unavailable platforms (eg. OpenGL ES 2.0 targets)

Derron

(Posted 2014) [#99]

Especially the mobile targets will show performance bottlenecks.

"Graphics intense" (read "many sprites") apps might be problematic - in that case we should think about additional "helper" classes - to enforce specific caching behaviour etc. I did not check if "basic caching" is done already - as it was suggested before - so there is a potential spot to optimize without much needed "intervention" on your side.

bye
Ron

zzz	(Posted 2014) [#100]

It's probably pretty good to go.

[edit]

The one thing that might be worth looking into is the flush/batch drawing code (ie where the batcher code decides if it needs to draw or not). Most of it is my own code, but I never put much thought into it as it was just for testing. Could probably be a bit more lean :)

We're also uploading the projection matrix on every drawcall (unless I had a brainfart while peeking through the code). It's pretty big, and only needs to be uploaded when we switch shader programs or change resolution.

Wouldn't expect an order-of-magnitude performance improvement from any of it, but it should still be worth looking into.
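
One way to avoid the redundant uploads is a small cache that remembers the last program and matrix sent. The Python below is only a sketch of that idea: the class name is made up, and `upload` stands in for the real uniform upload call (glUniformMatrix4fv in GL).

```python
class ProjectionCache:
    """Skips redundant uniform uploads; `upload` stands in for the real GL call."""

    def __init__(self, upload):
        self.upload = upload
        self.last = None  # (program, matrix) most recently uploaded

    def apply(self, program, matrix):
        state = (program, tuple(matrix))
        if state != self.last:  # program switched or resolution changed
            self.upload(program, matrix)
            self.last = state


uploads = []
cache = ProjectionCache(lambda prog, m: uploads.append(prog))
identity = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1]

for _ in range(100):      # 100 draw calls, same program and matrix
    cache.apply(1, identity)
cache.apply(2, identity)  # switching shader program forces an upload

print(len(uploads))
```

This version is deliberately conservative: it re-uploads whenever the program changes, even though uniforms are stored per program in GL.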

LT	(Posted 2014) [#101]

Other than zzz's suggestion of limiting the passing of the projection matrix, I don't know how else to make it faster and maintain the command set. Keep in mind that we are setting array data on every DrawThis() call. A better system would be to create persistent objects that only modify the array when they are changed, but that would require a different set of functions.