Fast graphics... O.o
| ||
I was wondering whether we can make graphics a lot faster in Blitz by taking a shortcut past the library functions, like this. The DrawImage() function looks like this:

    Function DrawImage( image:TImage,x#,y#,frame=0 )
        Local x0#=-image.handle_x,x1#=x0+image.width
        Local y0#=-image.handle_y,y1#=y0+image.height
        Local iframe:TImageFrame=image.Frame(frame)
        If iframe iframe.Draw x0,y0,x1,y1,x+gc.origin_x,y+gc.origin_y,0,0,image.width,image.height
    End Function

Well, if we remove the 'Private' and 'Public' lines from the "BlitzMax\mod\brl.mod\max2d.mod\max2d.bmx" file in the modules folder, you can access the gc variable:

    Private 'cut
    Global gc:TMax2DGraphics
    Function UpdateTransform()
        Local s#=Sin(gc.tform_rot)
        Local c#=Cos(gc.tform_rot)
        gc.tform_ix= c*gc.tform_scale_x
        gc.tform_iy=-s*gc.tform_scale_y
        gc.tform_jx= s*gc.tform_scale_x
        gc.tform_jy= c*gc.tform_scale_y
        _max2dDriver.SetTransform gc.tform_ix,gc.tform_iy,gc.tform_jx,gc.tform_jy
        SetCollisions2DTransform gc.tform_ix,gc.tform_iy,gc.tform_jx,gc.tform_jy
    End Function
    Public 'cut

We can then rewrite the call. Instead of this:

    code...
    Cls
    DrawImage image,x,y
    Flip
    code...

we get this:

    code...
    Cls
    image.Frame(frame).Draw( -image.handle_x, -image.handle_y, -image.handle_x+image.width, -image.handle_y+image.height, x+gc.origin_x, y+gc.origin_y, 0, 0, image.width, image.height )
    Flip
    code...

Can somebody test this? I think you should get 17 times faster image draws (because you're reducing the 7 instructions to just 1, plus you get rid of the DrawImage() function call, which is equivalent to 10 instructions). Thanks :D Last edited 2011 |
| ||
Max2D is a generic module which takes away the pain of having to manage transformations yourself. Hence all the code in DrawImage, and the fact that you only have to pass x, y to draw your image. Of course, if you want to customise things, you can make things much faster. In fact, you could speed it up again by calling the GL/DX routines directly, thus skipping another wrapping layer. |
| ||
I know the purpose of it... :D But I was just wondering... |
| ||
And if you could give us the code with gl/dx, we would be infinitely grateful. :DDDD ( Don't want to insult anyone's work to make blitz simple ) Not to mention you can make an optimizing program that replaces all the drawimage() from your code with the shortcut... Last edited 2011 |
| ||
<ot>The words "tree", "barking up" and "wrong" spring to mind. :)</ot> Anyway, you already have the specific max code for directx7, dx9 and opengl in the respective mod/brl.mod/ folders. |
| ||
I just wanted to know if it really works (it works theoretically) and how much faster it would be. That's all :) |
| ||
Well, how much faster was it for you? |
| ||
I don't think removing a few floating point operations will help much, as the bottleneck will be pushing individual polygons onto the graphics card for each draw call. There was a mod that batched primitives but I'm not sure what happened to it. Anyone remember? |
| ||
Having access to the private variables serves me well, not really for the speed of the graphics but for other uses. This will give you full access to it:

    Local mygc:TMax2DGraphics = TMax2DGraphics.Current()

Keep in mind that if Mark decides to modify any of the low-level stuff, you might have to modify your program, or it may stop working altogether. Last edited 2011 |
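A minimal sketch of what that access looks like in practice, assuming the stock Max2D module where `TMax2DGraphics` exposes `origin_x` and `origin_y` as fields (the same fields the DrawImage source in the first post reads). The window size and origin values are arbitrary. The public `GetOrigin` call is shown next to it for comparison, since it returns the same information without depending on module internals.

```blitzmax
SuperStrict

Graphics 640,480
SetOrigin 100,50

' Public Max2D API: survives changes to the module internals.
Local ox:Float, oy:Float
GetOrigin ox, oy
Print "GetOrigin: " + ox + "," + oy

' Direct access to the current context object, as suggested above.
' Assumption: the field names match the max2d.bmx quoted in this thread.
Local mygc:TMax2DGraphics = TMax2DGraphics.Current()
Print "gc fields: " + mygc.origin_x + "," + mygc.origin_y
```

Both lines should report the same origin; the second form is the one that may break if the module changes.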
| ||
@matibee try these two programs:

    t=MilliSecs()
    For i=1 To 10000000
        a:+1
    Next
    Print MilliSecs()-t

    Global a=0
    t=MilliSecs()
    For i=1 To 10000000
        f()
    Next
    Print MilliSecs()-t

    Function f()
        a:+1
    EndFunction

I get these results: 232ms and 1471ms. There seems to be a difference :)... Last edited 2011 |
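A hedged variant of that benchmark, written as one SuperStrict program so both loops increment the same Global (the first program above increments an implicit local, the second a global, so they are not quite measuring the same thing). The names `LOOPS`, `counter` and `Bump` are made up for this sketch. Timings depend heavily on whether debug mode is on, so build it in non-debug mode before drawing conclusions.

```blitzmax
SuperStrict

Const LOOPS:Int = 10000000
Global counter:Int = 0

' Hypothetical helper: the same increment, but behind a function call.
Function Bump()
	counter :+ 1
End Function

' Inline increment of the global.
Local t:Int = MilliSecs()
For Local i:Int = 1 To LOOPS
	counter :+ 1
Next
Print "inline: " + (MilliSecs() - t) + " ms"

' Same increment through a function call.
counter = 0
t = MilliSecs()
For Local i:Int = 1 To LOOPS
	Bump()
Next
Print "call:   " + (MilliSecs() - t) + " ms"
```

The difference between the two printed times is the cost of ten million calls, which can then be compared against the cost of ten million DrawImage calls.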
| ||
I get 3 and 31. I don't think it is significant. |
| ||
Well, consider this: the thing that uses the most CPU time is the graphics, and in Blitz that means the DrawImage() command. So if all the DrawImage() calls combined consumed 50% of CPU time, and DrawImage() were at least 4 times as fast, you could either draw 4 times more images or make the game run smoother. :) Currently I can draw 5,000 plots on the screen at 60 fps, so if I can make that 10 times better that would mean 50,000 plots on the screen which, in my opinion, is huge! (only if it works..) |
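The arithmetic in that post can be worked through with Amdahl's law. This is a sketch under the post's own numbers; the symbols $p$ (fraction of frame time spent in DrawImage) and $s$ (speedup of DrawImage alone) are introduced here for illustration.

```latex
\[
  \text{overall speedup} = \frac{1}{(1 - p) + \dfrac{p}{s}}
\]
% With the post's figures, p = 0.5 and s = 4:
\[
  \frac{1}{(1 - 0.5) + \dfrac{0.5}{4}} = \frac{1}{0.625} = 1.6
\]
% Even an infinitely fast DrawImage (s -> infinity) is capped at:
\[
  \lim_{s \to \infty} \frac{1}{(1 - p) + \dfrac{p}{s}} = \frac{1}{1 - p} = 2
\]
```

So under the 50% assumption, a 4x faster DrawImage gives a 1.6x faster frame. The number of images that fit in the same frame time can still rise by up to 4x, but only if the remaining 50% of the work does not grow with the image count.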
| ||
@Brucey That's huge!!! 10 times less :O :) (Hmmm... 60fps=16ms => 32ms=30fps though i doubt anyone renders 10,000,000 images =) ) Last edited 2011 |
| ||
This is in non-debug mode. I also used SuperStrict mode which doesn't make the code any faster, but it does make it less prone to basic user-error. |
| ||
No, drawing is not going to be 4 times as fast. Try it with a typical image, say 32x32, and you will probably save something along the lines of a few milliseconds. That is because in this case the GPU still does most of the work, which is where the bottleneck actually is. Last edited 2011 |
| ||
You'd be better off getting gains from using things like vertex arrays, display lists, etc |
| ||
"You'd be better off getting gains from using things like vertex arrays, display lists, etc" Is mojo different in this regard? |
| ||
I did the test and... I got 1ms less for 10,000 images drawn. So I get no difference. (Epic fail XD). But I get something really strange. I have a game where I can't draw more than 150 images at 60 fps and I don't know what is lagging so much. The code's running time is linear, O(n). It seems something is drastically lowering the fps. |
| ||
Have you tried: Flip False and/or the DX11 version of max2d? http://www.blitzbasic.com/Community/posts.php?topic=96014 |
| ||
I'll give it a try. I found a very good precompiler that gives me a good idea of what's really happening in the game :D and it turned out that the graphics are the most time consuming. Last edited 2011 |
| ||
Yes I did use Flip(0) with all the builds. |
| ||
I get no difference with DirectX 11.

    Import SRS.D3D11Max2D
    SetGraphicsDriver D3D11Max2DDriver()
    im:TImage=LoadImage(LoadBank("http::eu.media.blizzard.com/sc2/media/wallpapers/wall004/wall004-large.jpg"))
    Graphics 640,480
    t=MilliSecs()
    For i=1 To 1000
        DrawImage(im,0,0)
    Next
    Flip(0)
    Print MilliSecs()-t
    EndGraphics

    SetGraphicsDriver D3D7Max2DDriver()
    Graphics 640,480
    t=MilliSecs()
    For i=1 To 1000
        DrawImage(im,0,0)
    Next
    Flip(0)
    Print MilliSecs()-t

Result:
1152
1020 |
| ||
I find GLMax2DDriver the fastest:

    SetGraphicsDriver D3D7Max2DDriver()
    im:TImage=LoadImage(LoadBank("http::eu.media.blizzard.com/sc2/media/wallpapers/wall004/wall004-large.jpg"),MASKEDIMAGE)
    Graphics 640,480
    t=MilliSecs()
    For i=1 To 10000
        DrawImage(im,0,0)
    Next
    Flip(0)
    Print MilliSecs()-t
    EndGraphics

    SetGraphicsDriver GLMax2DDriver()
    Graphics 640,480
    t=MilliSecs()
    For i=1 To 10000
        DrawImage(im,0,0)
    Next
    Flip(0)
    Print MilliSecs()-t

Result:
9421
116
Last edited 2011 |
| ||
Have you tried using TBatchImage in the Dx11 driver? There are extra functions for batching plots, lines, and images giving significant speed increases. BTW, I doubt this is the problem with your 150 image scenario. Last edited 2011 |
| ||
Instead of spending so much time trying to squeeze extra speed out of the Max2D drivers, you would be better off writing your own driver from the ground up. That way you wouldn't be bound by Max2D's command set, and you could start optimising for speed from the very beginning instead of attempting to retrofit speed optimisations into a framework that wasn't built for speed. |
| ||
Hello! I've been trying to create a much faster DrawImageRect routine and found this thread. My laptop doesn't do OpenGL above 1.4 and I am a total beginner at OpenGL, so it's probably a very crude solution. The routine is quite a bit faster (between 10% and 100%, depending on the number of calls!). I've made a little test routine that does a couple of runs of DrawSubImageRect and of my function (called drawPreLoadedTex). You'll need an image called image.png in the same directory (try 256x256 px and make it a tileset of 32x32 tiles).

    SuperStrict

    SetGraphicsDriver GLMax2DDriver()
    Graphics 1024,768

    Global image:TImage = LoadImage("image.png", MASKEDIMAGE)
    Global pixmap:TPixmap = LoadPixmap("image.png")
    Global imageTex:Int = TexFromPixmap(pixmap)

    Local compareAmount:Int = 50
    Local results:Float[compareAmount]
    Local results2:Float[compareAmount]

    While Not KeyDown(KEY_ESCAPE)
        Cls
        For Local r:Int = 0 Until compareAmount
            results[r] = testDrawLoop(5,40,40,0)
            Flip()
        Next
        Print averageFloatArray(results)+" fps on average for DrawSubImageRect"
        For Local r:Int = 0 Until compareAmount
            results2[r] = testDrawLoop(5,40,40,1)
            Flip()
        Next
        Print averageFloatArray(results2)+" fps on average for drawPreLoaded"
        WaitKey
        End
    Wend

    Function averageFloatArray:Float(array:Float[])
        Local total:Float = 0
        For Local f:Float = EachIn array
            total:+f
        Next
        total = total/array.length
        Return total
    EndFunction

    ' Bind the texture once and set clamping/filtering before a batch of quads.
    Function initTex(tex:Int)
        glEnable(GL_TEXTURE_2D)
        glBindTexture(GL_TEXTURE_2D, tex)
        glTexParameteri(GL_TEXTURE_2D,GL_TEXTURE_WRAP_S,GL_CLAMP_TO_EDGE)
        glTexParameteri(GL_TEXTURE_2D,GL_TEXTURE_WRAP_T,GL_CLAMP_TO_EDGE)
        glTexParameteri(GL_TEXTURE_2D,GL_TEXTURE_MAG_FILTER,GL_NEAREST)
        glTexParameteri(GL_TEXTURE_2D,GL_TEXTURE_MIN_FILTER,GL_NEAREST)
    EndFunction

    ' Emit one textured quad; must be called between glBegin(GL_QUADS) and glEnd().
    Function drawPreLoadedTex(x:Int, y:Int, w:Int, h:Int, sx:Int, sy:Int, sw:Int, sh:Int)
        Local texWidth:Int=pixmap.width
        Local texHeight:Int=pixmap.height
        Local sx1:Float = Float(sx)/texWidth
        Local sy1:Float = Float(sy)/texHeight
        Local sw1:Float = Float(sw)/texWidth
        Local sh1:Float = Float(sh)/texHeight
        glTexCoord2f(sx1,sy1)
        glVertex2f(x,y)
        glTexCoord2f(sx1+sw1,sy1)
        glVertex2f(x+w,y)
        glTexCoord2f(sx1+sw1,sy1+sh1)
        glVertex2f(x+w,y+h)
        glTexCoord2f(sx1,sy1+sh1)
        glVertex2f(x,y+h)
    EndFunction

    ' Returns frames per second for one timed run.
    ' Declared as Float so the result is not truncated to a whole number.
    Function testDrawLoop:Float(runs:Int, w:Int, h:Int, drawKind:Int=0)
        Local t:Int = MilliSecs()
        If drawKind=0
            For Local z:Int=0 Until runs
                For Local x:Int = 0 Until w
                    For Local y:Int = 0 Until h
                        DrawSubImageRect image,x*32,y*32,32,32,32*Rand(10),32*1,32,32
                    Next
                Next
            Next
        ElseIf drawKind=1
            initTex(imageTex)
            glBegin(GL_QUADS)
            For Local z:Int=0 Until runs
                For Local x:Int = 0 Until w
                    For Local y:Int = 0 Until h
                        'SetColor Rand(255),Rand(255),Rand(255)
                        drawPreLoadedTex(x*32,y*32,32,32,32*Rand(10),32*1,32,32)
                    Next
                Next
            Next
            glEnd()
        EndIf
        Return( 1000.0/(MilliSecs()-t))
    EndFunction

    ' Round both dimensions up to the next power of two.
    Function AdjustTexSize(Width:Int Var, Height:Int Var)
        Function Pow2Size:Int(N:Int)
            Local Size:Int
            Size = 1
            While Size < N
                Size = Size Shl 1
            Wend
            Return Size
        End Function
        Width = Pow2Size(Width)
        Height = Pow2Size(Height)
    End Function

    ' Upload a pixmap as an OpenGL texture (with optional mipmaps) and return its name.
    Function TexFromPixmap:Int(pixmap:TPixmap, mipmap:Int = True)
        If pixmap.format<>PF_RGBA8888 pixmap=pixmap.Convert( PF_RGBA8888 )
        Local width:Int=pixmap.width,height:Int=pixmap.height
        AdjustTexSize width,height
        If width<>pixmap.width Or height<>pixmap.height pixmap=ResizePixmap( pixmap,width,height )
        Local old_name:Int,old_row_len:Int
        glGetIntegerv GL_TEXTURE_BINDING_2D,Varptr old_name
        glGetIntegerv GL_UNPACK_ROW_LENGTH,Varptr old_row_len
        Local name:Int
        glGenTextures 1,Varptr name
        glBindTexture GL_TEXTURE_2D,name
        Local mip_level:Int
        Repeat
            glPixelStorei GL_UNPACK_ROW_LENGTH,pixmap.pitch/BytesPerPixel[pixmap.format]
            glTexImage2D(GL_TEXTURE_2D, mip_level, GL_RGBA8, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, pixmap.pixels)
            If Not mipmap Exit
            If width=1 And height=1 Exit
            If width>1 width:/2
            If height>1 height:/2
            pixmap=ResizePixmap( pixmap,width,height )
            mip_level:+1
        Forever
        glBindTexture GL_TEXTURE_2D,old_name
        glPixelStorei GL_UNPACK_ROW_LENGTH,old_row_len
        Return name
    End Function

The idea is this:

    load a texture
    glBegin
    For a lot of tiles
        do straight OpenGL calls (with the already loaded texture)
    Next
    glEnd

instead of all the glBegin/glEnd pairs you'd get by using the Max2D draw commands. Next up I might look into display lists. Hope it might be useful to someone. |
| ||
Draw calls are expensive, so instead of individual images you should do as few texture swaps as possible by using a sprite atlas and texture coords. You also need tiger beyond immediate mode GL commands and look at vertex arrays/display lists, otherwise you are pushing new geometry over the graphics bus every frame. |
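A minimal sketch of the vertex array idea, assuming the legacy fixed-function OpenGL that GLMax2D runs on and the projection Max2D sets up (pixel coordinates, origin at top left). It builds the geometry for a grid of untextured quads once, then submits the whole grid with a single `glDrawArrays` call per frame. The grid constants and the names `quads` and `verts` are invented for this example; a sprite atlas version would add a matching texture coordinate array via `glTexCoordPointer`.

```blitzmax
SuperStrict

SetGraphicsDriver GLMax2DDriver()
Graphics 640,480

Const COLS:Int = 20
Const ROWS:Int = 15
Const SIZE:Int = 32

' 4 corners per quad, 2 floats (x,y) per corner.
Local quads:Int = COLS * ROWS
Local verts:Float[quads * 8]

' Build the geometry once, outside the render loop.
Local i:Int = 0
For Local ty:Int = 0 Until ROWS
	For Local tx:Int = 0 Until COLS
		Local x0:Float = tx * SIZE
		Local y0:Float = ty * SIZE
		Local x1:Float = x0 + SIZE - 2	' leave a 2 pixel gap between tiles
		Local y1:Float = y0 + SIZE - 2
		verts[i]   = x0 ; verts[i+1] = y0
		verts[i+2] = x1 ; verts[i+3] = y0
		verts[i+4] = x1 ; verts[i+5] = y1
		verts[i+6] = x0 ; verts[i+7] = y1
		i :+ 8
	Next
Next

While Not KeyDown(KEY_ESCAPE) And Not AppTerminate()
	Cls
	' Raw GL state changes bypass Max2D's own state tracking, so this
	' sketch does not mix them with Max2D draw commands.
	glDisable GL_TEXTURE_2D
	glColor3f 1.0, 1.0, 1.0
	glEnableClientState GL_VERTEX_ARRAY
	glVertexPointer 2, GL_FLOAT, 0, Byte Ptr(Varptr verts[0])
	glDrawArrays GL_QUADS, 0, quads * 4	' one call for the whole grid
	glDisableClientState GL_VERTEX_ARRAY
	Flip
Wend
```

Client-side vertex arrays like this still send the data to the card each frame, but in one call rather than one per vertex; display lists or vertex buffer objects are the usual next step for geometry that never changes.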
| ||
"You also need tiger beyond immediate mode GL commands" this reads to me like Tiger has written some nice functions ? if thats the case, how do I find them? Or am I missing something else ? |
| ||
Sounds more like a spellcheck or Siri error; it was probably 'to go beyond'.