Fast graphics... O.o

BinaryBurst(Posted 2011) [#1]
I was wondering whether we could make graphics a lot faster in Blitz by taking a shortcut around the library functions, like this.
The DrawImage() function looks like this:

Function DrawImage( image:TImage,x#,y#,frame=0 )
	Local x0#=-image.handle_x,x1#=x0+image.width
	Local y0#=-image.handle_y,y1#=y0+image.height
	Local iframe:TImageFrame=image.Frame(frame)
	If iframe iframe.Draw x0,y0,x1,y1,x+gc.origin_x,y+gc.origin_y,0,0,image.width,image.height
End Function


Well... if we remove the 'Private' and 'Public' lines from the "BlitzMax\mod\brl.mod\max2d.mod\max2d.bmx" file in the modules folder, the gc variable becomes accessible:

Private 'cut

Global gc:TMax2DGraphics

Function UpdateTransform()
	Local s#=Sin(gc.tform_rot)
	Local c#=Cos(gc.tform_rot)
	gc.tform_ix= c*gc.tform_scale_x
	gc.tform_iy=-s*gc.tform_scale_y
	gc.tform_jx= s*gc.tform_scale_x
	gc.tform_jy= c*gc.tform_scale_y
	_max2dDriver.SetTransform gc.tform_ix,gc.tform_iy,gc.tform_jx,gc.tform_jy
	SetCollisions2DTransform gc.tform_ix,gc.tform_iy,gc.tform_jx,gc.tform_jy
End Function

Public 'cut


We can then rewrite the call.
Instead of this:
     code...
     Cls
     DrawImage(image,x,y)
     Flip
     code...

we get this:
     code...
     Cls
     image.Frame(0).Draw( -image.handle_x, -image.handle_y, -image.handle_x+image.width, -image.handle_y+image.height, x+gc.origin_x, y+gc.origin_y, 0, 0, image.width, image.height )
     Flip
     code...


Can somebody test this? I think image draws should be about 17 times faster (because you're reducing the 7 instructions to just 1, plus you get rid of the DrawImage() function call, which is equivalent to about 10 instructions).
Thanks :D

Last edited 2011


Brucey(Posted 2011) [#2]
Max2D is a generic module which takes the pain out of you having to manage transformations yourself. Hence all the code in DrawImage, and you only having to use x, y to draw your image.

Of course, if you want to customise things, you can make things much faster.
In fact, you could speed it up further by calling the GL/DX routines directly, thus skipping another wrapping layer.


BinaryBurst(Posted 2011) [#3]
I know the purpose of it... :D But I was just wondering...


BinaryBurst(Posted 2011) [#4]
And if you could give us the code with gl/dx, we would be infinitely grateful. :DDDD ( Don't want to insult anyone's work to make blitz simple )

Not to mention you could write an optimizing program that replaces every DrawImage() call in your code with the shortcut...

Last edited 2011


matibee(Posted 2011) [#5]
<ot>The words "tree", "barking up" and "wrong" spring to mind. :)</ot>

Anyway, you already have the specific max code for directx7, dx9 and opengl in the respective mod/brl.mod/ folders.


BinaryBurst(Posted 2011) [#6]
I just wanted to know if it really works. (It works in theory.)
I just want to know how much faster it would be. That's all :)


Brucey(Posted 2011) [#7]
Well, how much faster was it for you?


matibee(Posted 2011) [#8]
I don't think removing a few floating point operations will help much, as the bottleneck will be pushing individual polygons onto the graphics card for each draw call.

There was a mod that batched primitives but I'm not sure what happened to it. Anyone remember?


Jesse(Posted 2011) [#9]
Having access to the private variables serves me well, not really for the speed of the graphics but for other uses.
This will give you full access to it:
Local mygc:TMax2DGraphics = TMax2DGraphics.Current()


Keep in mind that if Mark decides to modify any of the low-level stuff, you might have to modify your program, or the change could render it useless.
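
A minimal sketch of how that handle could be combined with the direct frame draw from post #1, without editing max2d.bmx. The file name image.png and the position values are assumptions for illustration, and the ten-argument Draw call simply mirrors the DrawImage source quoted at the top of the thread, so it may not match other BlitzMax versions.

```blitzmax
SuperStrict

Graphics 640,480

' "image.png" is a placeholder file name, not something from this thread.
Local image:TImage = LoadImage("image.png")
If Not image Then RuntimeError "Could not load image.png"

' Public accessor for the otherwise private gc global in max2d.bmx.
Local gc:TMax2DGraphics = TMax2DGraphics.Current()

' Fetch the frame once instead of on every draw.
Local frame:TImageFrame = image.Frame(0)
If Not frame Then RuntimeError "Image has no frame 0"

' Corners of the quad relative to the image handle, as in DrawImage.
Local x0:Float = -image.handle_x
Local x1:Float = x0 + image.width
Local y0:Float = -image.handle_y
Local y1:Float = y0 + image.height

' Arbitrary example position.
Local x:Float = 100
Local y:Float = 100

While Not KeyHit(KEY_ESCAPE)
	Cls
	' Same arguments DrawImage passes, minus the wrapper call.
	frame.Draw x0, y0, x1, y1, x + gc.origin_x, y + gc.origin_y, 0, 0, image.width, image.height
	Flip
Wend
```

The same caveat applies here: this relies on module internals, so a future change to Max2D could break it.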

Last edited 2011


BinaryBurst(Posted 2011) [#10]
@matibee

try these two programs:

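' Program 1: increment inline (a is auto-declared as a local Int in non-strict mode)
t=MilliSecs()
For i=1 To 10000000
   a:+1
Next
Print MilliSecs()-t


' Program 2: increment through a function call
Global a=0
t=MilliSecs()
For i=1 To 10000000
   f()
Next
Print MilliSecs()-t

Function f()
   a:+1
EndFunction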


I get these results: 232ms and 1471ms
There seems to be a difference :)...

Last edited 2011


Brucey(Posted 2011) [#11]
I get 3 and 31.

I don't think it is significant.


BinaryBurst(Posted 2011) [#12]
Well consider this:
The thing that uses the most CPU time is the graphics, and in Blitz that means the DrawImage() command. So if all the DrawImage() calls combined consumed 50% of CPU time, and each call were at least 4 times as fast, you could either draw 4 times more images or make the game run smoother. :)
Currently I can draw 5,000 plots on the screen at 60 fps, so if I can make that 10 times better that would mean 50,000 plots on the screen which, in my opinion, is huge! (only if it works..)


BinaryBurst(Posted 2011) [#13]
@Brucey

That's huge!!! 10 times less :O :) (Hmmm... 60fps=16ms => 32ms=30fps though i doubt anyone renders 10,000,000 images =) )

Last edited 2011


Brucey(Posted 2011) [#14]
This is in non-debug mode.

I also used SuperStrict mode which doesn't make the code any faster, but it does make it less prone to basic user-error.
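
For reference, a SuperStrict version of the two loops might look like the sketch below. This is an assumption about how such a test could be written, not the exact code used for the numbers above. Both loops increment the same global, so the only difference between them is the function call.

```blitzmax
SuperStrict

Global a:Int = 0

' The function under test: one increment per call.
Function f()
	a :+ 1
End Function

' Loop 1: increment inline.
Local t:Int = MilliSecs()
For Local i:Int = 1 To 10000000
	a :+ 1
Next
Print "inline: " + (MilliSecs() - t) + " ms"

' Loop 2: increment through a function call.
t = MilliSecs()
For Local i:Int = 1 To 10000000
	f()
Next
Print "call: " + (MilliSecs() - t) + " ms"
```

Build it with debug turned off before comparing timings, since debug builds add overhead to every call.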


Jesse(Posted 2011) [#15]
No, drawing is not going to be 4 times as fast. Try it with a typical image, say 32x32, and you will probably save something along the lines of a few milliseconds. That's because in this case the GPU will still do most of the work, which is where the bottleneck actually is.

Last edited 2011


ImaginaryHuman(Posted 2011) [#16]
You'd be better off getting gains from using things like vertex arrays, display lists, etc


Armitage 1982(Posted 2011) [#17]
"You'd be better off getting gains from using things like vertex arrays, display lists, etc"

Is mojo any different in this regard?


BinaryBurst(Posted 2011) [#18]
I did the test and... I got 1ms less for 10,000 images drawn. So I get no difference. (Epic fail XD). But I get something really strange. I have a game where I can't draw more than 150 images at 60 fps and I don't know what is lagging so much. The code's running time is linear, O(n). It seems something is drastically lowering the fps.


Beaker(Posted 2011) [#19]
Have you tried:
Flip False
and/or the DX11 version of max2d?
http://www.blitzbasic.com/Community/posts.php?topic=96014


BinaryBurst(Posted 2011) [#20]
I'll give it a try. I found a very good precompiler that gives me a good idea of what's really happening in the game :D and it turned out that the graphics are the most time consuming.

Last edited 2011


BinaryBurst(Posted 2011) [#21]
Yes I did use Flip(0) with all the builds.


BinaryBurst(Posted 2011) [#22]
I get no difference with directx11.

Import SRS.D3D11Max2D
SetGraphicsDriver D3D11Max2DDriver()

im:TImage=LoadImage(LoadBank("http::eu.media.blizzard.com/sc2/media/wallpapers/wall004/wall004-large.jpg"))

Graphics 640,480
t=MilliSecs()
For i=1 To 1000
	DrawImage(im,0,0)
Next
Flip(0)
Print MilliSecs()-t
EndGraphics

SetGraphicsDriver D3D7Max2DDriver()

Graphics 640,480
t=MilliSecs()
For i=1 To 1000
	DrawImage(im,0,0)
Next
Flip(0)
Print MilliSecs()-t



Result:
1152 (D3D11)
1020 (D3D7)



BinaryBurst(Posted 2011) [#23]
GLMax2DDriver is the fastest for me:

SetGraphicsDriver D3D7Max2DDriver()
im:TImage=LoadImage(LoadBank("http::eu.media.blizzard.com/sc2/media/wallpapers/wall004/wall004-large.jpg"),MASKEDIMAGE)

Graphics 640,480
t=MilliSecs()
For i=1 To 10000
	DrawImage(im,0,0)
Next
Flip(0)
Print MilliSecs()-t
EndGraphics

SetGraphicsDriver GLMax2DDriver()

Graphics 640,480
t=MilliSecs()
For i=1 To 10000
	DrawImage(im,0,0)
Next
Flip(0)
Print MilliSecs()-t


Result:
9421 (D3D7)
116 (GL)


Last edited 2011


col(Posted 2011) [#24]
Have you tried using TBatchImage in the Dx11 driver? There are extra functions for batching plots, lines, and images giving significant speed increases.

BTW, I doubt this is the problem with your 150 image scenario.

Last edited 2011


Oddball(Posted 2011) [#25]
Instead of spending so much time trying to squeeze extra speed out of the Max2D drivers, you would be better off writing your own driver from the ground up. That way you wouldn't be bound by Max2D's command set, and you could start optimising for speed from the very beginning instead of attempting to retrofit speed optimisations into a framework that wasn't built for speed.


nitti(Posted 2012) [#26]
Hello! So I've been trying to create a much faster DrawSubImageRect routine and found this thread. My laptop doesn't do OpenGL above 1.4 and I am a total beginner at OpenGL, so it's probably a very crude solution.

The routine is quite a bit faster (between 10% and 100%, depending on the number of calls!!)

I've made a little test routine that does a couple of runs of DrawSubImageRect and of my function (called drawPreLoadedTex).

You'll need to have an image called image.png in the same directory (try 256x256 px for size and make it a tileset of 32x32 sized tiles).


SuperStrict

SetGraphicsDriver GLMax2DDriver()

Graphics 1024,768

Global  image:TImage= LoadImage("image.png", MASKEDIMAGE)
Global pixmap:TPixmap = LoadPixmap("image.png")
Global imageTex:Int = TexFromPixmap(pixmap)

Local compareAmount:Int = 50

Local results:Float[compareAmount]
Local results2:Float[compareAmount]

While Not KeyDown(key_escape)
	Cls

	For Local r:Int = 0 Until compareAmount
		results[r] = testDrawLoop(5,40,40,0)
		Flip()
	Next
	
	Print averageFloatArray(results)+" fps on average for Drawsubimagerect"
	
	For Local r:Int = 0 Until compareAmount
		results2[r] = testDrawLoop(5,40,40,1)
		Flip()
	Next
	
	Print averageFloatArray(results2)+" fps on average for drawPreLoaded"
	
	WaitKey
	End
Wend



Function averageFloatArray:Float(array:Float[])
	Local total:Float = 0
	For Local f:Float = EachIn array
		total:+f
	Next
	total = total/array.length
	Return total
EndFunction



Function initTex(tex:Int)
	' Bind the preloaded texture and set clamped, unfiltered sampling
	glEnable(GL_TEXTURE_2D)
	glBindTexture(GL_TEXTURE_2D, tex)
	glTexParameteri(GL_TEXTURE_2D,GL_TEXTURE_WRAP_S,GL_CLAMP_TO_EDGE)
	glTexParameteri(GL_TEXTURE_2D,GL_TEXTURE_WRAP_T,GL_CLAMP_TO_EDGE)
	glTexParameteri(GL_TEXTURE_2D,GL_TEXTURE_MAG_FILTER,GL_NEAREST)
	glTexParameteri(GL_TEXTURE_2D,GL_TEXTURE_MIN_FILTER,GL_NEAREST)
EndFunction

Function drawPreLoadedTex(x:Int, y:Int, w:Int, h:Int, sx:Int, sy:Int, sw:Int, sh:Int)
	Local texWidth:Int=pixmap.width
	Local texHeight:Int=pixmap.height
	
	Local sx1:Float = Float(sx)/texWidth
	Local sy1:Float = Float(sy)/texHeight
	Local sw1:Float = Float(sw)/texWidth
	Local sh1:Float = Float(sh)/texHeight
	
	glTexCoord2f(sx1,sy1)
	glVertex2f(x,y)
	glTexCoord2f(sx1+sw1,sy1)
	glVertex2f(x+w,y)
	glTexCoord2f(sx1+sw1,sy1+sh1)
	glVertex2f(x+w,y+h)
	glTexCoord2f(sx1,sy1+sh1)
	glVertex2f(x,y+h)
	
EndFunction


Function testDrawLoop:Float(runs:Int, w:Int,h:Int, drawKind:Int=0)
	Local t:Int = MilliSecs()
	
	If drawKind=0
		' Max2D path: one DrawSubImageRect call per tile
		For Local z:Int=0 Until runs
			For Local x:Int = 0 Until w
				For Local y:Int = 0 Until h
					DrawSubImageRect image,x*32,y*32,32,32,32*Rand(10),32*1,32,32
				Next
			Next	
		Next
	ElseIf drawKind=1
		' Raw GL path: bind the texture once and wrap all tiles in a single glBegin/glEnd
		initTex(imageTex)
		glBegin(GL_QUADS)
		For Local z:Int=0 Until runs
			For Local x:Int = 0 Until w
				For Local y:Int = 0 Until h
					'SetColor Rand(255),Rand(255),Rand(255)
					drawPreLoadedTex(x*32,y*32,32,32,32*Rand(10),32*1,32,32)
				Next
			Next
		Next	
		
		glEnd()
	EndIf

	' Return a Float so the rate is not truncated; clamp to 1 ms to avoid dividing by zero
	Local elapsed:Int = Max(MilliSecs()-t, 1)
	Return 1000.0/elapsed
EndFunction



Function AdjustTexSize(Width:Int Var, Height:Int Var)
	Function Pow2Size:Int(N:Int)
		Local Size:Int

		Size = 1
		While Size < N
			Size = Size Shl 1
		Wend

		Return Size
	End Function

	Width  = Pow2Size(Width)
	Height = Pow2Size(Height)
End Function

Function TexFromPixmap:Int(pixmap:TPixmap, mipmap:Int = True)
	If pixmap.format<>PF_RGBA8888 pixmap=pixmap.Convert( PF_RGBA8888 )
	Local width:Int=pixmap.width,height:Int=pixmap.height
	AdjustTexSize width,height
	If width<>pixmap.width Or height<>pixmap.height pixmap=ResizePixmap( pixmap,width,height )
	
	Local old_name:Int,old_row_len:Int
	glGetIntegerv GL_TEXTURE_BINDING_2D,Varptr old_name
	glGetIntegerv GL_UNPACK_ROW_LENGTH,Varptr old_row_len

	Local Name:Int
	glGenTextures 1,Varptr name
	glBindtexture GL_TEXTURE_2D,name
	
	Local mip_level:Int
	Repeat
		glPixelStorei GL_UNPACK_ROW_LENGTH,pixmap.pitch/BytesPerPixel[pixmap.format]
		glTexImage2D(GL_TEXTURE_2D, mip_level, GL_RGBA8, Width, Height, 0, GL_RGBA, GL_UNSIGNED_BYTE, pixmap.Pixels)
		If Not mipmap Exit
		If width=1 And height=1 Exit
		If width>1 width:/2
		If height>1 height:/2
		pixmap=ResizePixmap( pixmap,width,height )
		mip_level:+1
	Forever
	
	glBindTexture GL_TEXTURE_2D,old_name
	glPixelStorei GL_UNPACK_ROW_LENGTH,old_row_len

	Return name
End Function


the idea is this:
load a texture
glbegin
for a lot of tiles
do straight opengl calls (with the already loaded texture)
next
glend

This replaces all the glBegin/glEnd pairs you'd get by using the Max2D draw commands.
Next up I might look into display lists.

Hope it might be useful to someone.


ImaginaryHuman(Posted 2012) [#27]
Draw calls are expensive, so instead of drawing individual images you should do as few texture swaps as possible by using a sprite atlas and texture coords. You also need tiger beyond immediate mode GL commands and look at vertex arrays/display lists, otherwise you are pushing new geometry over the graphics bus every frame.
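
A rough sketch of the vertex array idea, assuming the GL Max2D driver is active and a texture has already been created. DrawQuadBatch and SetQuad are hypothetical helper names, not part of Max2D or of the test program above. Only standard OpenGL 1.1 client-array calls are used.

```blitzmax
SuperStrict

' Hypothetical helper: fills slot "index" of the arrays with one quad.
' Each quad takes 8 floats in verts (x,y per corner) and 8 in uvs (u,v per corner).
Function SetQuad(verts:Float[], uvs:Float[], index:Int, x:Float, y:Float, w:Float, h:Float, u0:Float, v0:Float, u1:Float, v1:Float)
	Local i:Int = index * 8
	' Top-left
	verts[i+0] = x
	verts[i+1] = y
	uvs[i+0] = u0
	uvs[i+1] = v0
	' Top-right
	verts[i+2] = x + w
	verts[i+3] = y
	uvs[i+2] = u1
	uvs[i+3] = v0
	' Bottom-right
	verts[i+4] = x + w
	verts[i+5] = y + h
	uvs[i+4] = u1
	uvs[i+5] = v1
	' Bottom-left
	verts[i+6] = x
	verts[i+7] = y + h
	uvs[i+6] = u0
	uvs[i+7] = v1
End Function

' Hypothetical helper: draws quadCount quads from the arrays in one call.
Function DrawQuadBatch(tex:Int, verts:Float[], uvs:Float[], quadCount:Int)
	If quadCount <= 0 Then Return

	glEnable GL_TEXTURE_2D
	glBindTexture GL_TEXTURE_2D, tex

	' Point GL at the arrays instead of issuing glVertex2f/glTexCoord2f per corner.
	glEnableClientState GL_VERTEX_ARRAY
	glEnableClientState GL_TEXTURE_COORD_ARRAY
	glVertexPointer 2, GL_FLOAT, 0, Varptr verts[0]
	glTexCoordPointer 2, GL_FLOAT, 0, Varptr uvs[0]

	' 4 vertices per quad, submitted together.
	glDrawArrays GL_QUADS, 0, quadCount * 4

	glDisableClientState GL_TEXTURE_COORD_ARRAY
	glDisableClientState GL_VERTEX_ARRAY
End Function
```

In the test program above, tex could be the value returned by TexFromPixmap, with the arrays sized at 8 floats per tile. For tiles that never move, the arrays only need to be filled once rather than every frame.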


nitti(Posted 2012) [#28]
"You also need tiger beyond immediate mode GL commands"
This reads to me as if Tiger has written some nice functions? If that's the case, how do I find them? Or am I missing something else?


slenkar(Posted 2012) [#29]
Sounds more like a spellcheck or Siri error.

It was probably

'to go beyond'