Why is drawrect/drawline so much slower than ...

Grey Alien(Posted 2008) [#1]
DrawImage?

I was drawing some rectangles in my game and realised that the FPS was suffering badly. When I replaced them with pre-made images that I loaded in, it was considerably faster. I gained a large amount of FPS, well worth making the change.

Similarly I was drawing some small lines (8 pairs), which were impacting the FPS a lot. I replaced them with a 2 pixel tall image which I drew 8 times and it was hugely faster.

Using DX on Vista on a slightly crappy 256MB video card.


therevills(Posted 2008) [#2]
It's probably the same/similar reason why drawtext is so much slower than using bitmap fonts...

I use drawrect when I'm prototyping, but replace them later with images...

OT: GA can you check out a post I made about TMySprite, regarding animation on your forum...


plash(Posted 2008) [#3]
This hardly makes sense.. surely stuffing specific pixel data onto the buffer is more CPU intensive.

It's probably the same/similar reason why drawtext is so much slower than using bitmap fonts...
Why would that be?


Tachyon(Posted 2008) [#4]
It's probably the same/similar reason why drawtext is so much slower than using bitmap fonts...

You know, I haven't found this to be true. Maybe some bitmap font modules are better than others, but the one time I tried to implement a bitmap font engine into my game it slowed it down...I figured it was because I was drawing individual bitmaps for each letter, whereas with DrawText I am drawing entire sentences at a time. I don't know if this logic is accurate or not, but using a bitmap font added 2ms to my draw_screen loop, so I went back to DrawText.


plash(Posted 2008) [#5]
I don't know if this logic is accurate or not, but using a bitmap font added 2ms to my draw_screen loop, so I went back to DrawText.
Single surface tends to be faster :) (which is what Fontext uses.)


Brucey(Posted 2008) [#6]
I was drawing some rectangles in my game

I've mentioned it many times before...

The default drawing routines in BlitzMax are not very efficient.

This tends to happen when you 1) make something easy to use, 2) make something generic.

For example... You want to draw a Rect...
This is the initial code in DrawRect:
	_max2dDriver.DrawRect..
	gc.handle_x,gc.handle_y,..
	gc.handle_x+width,gc.handle_y+height,..
	x+gc.origin_x,y+gc.origin_y

And then we look at the gl 2d driver to see what its DrawRect looks like:
		DisableTex
		glBegin GL_QUADS
		glVertex2f x0*ix+y0*iy+tx,x0*jx+y0*jy+ty
		glVertex2f x1*ix+y0*iy+tx,x1*jx+y0*jy+ty
		glVertex2f x1*ix+y1*iy+tx,x1*jx+y1*jy+ty
		glVertex2f x0*ix+y1*iy+tx,x0*jx+y1*jy+ty
		glEnd

Is someone going to tell me that is the most efficient way possible to draw a rectangle?

Remember, the code is generic. It works for every rectangle you want to draw - ever.
But, what if you are drawing 10 rects always in the same place? Do you really need to recalc everything every time?

Anyhoo..

Same applies to all the draw routines, essentially.

What BlitzMax needs is a "Fast" Max2D version... including all those nice things tonyg wants :-)
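
To make the "recalc everything every time" point concrete, here is a minimal sketch of the per-vertex math from the driver snippet above, done once and then reused. It is not the Max2D source: TTransform, TransformRect and the cached array are hypothetical names invented for this illustration, and the transform is simply assumed to be the identity.

```blitzmax
SuperStrict

' Hypothetical stand-in for the 2x2 rotation/scale matrix (ix,iy,jx,jy)
' that the driver code above multiplies every vertex by.
Type TTransform
	Field ix:Float = 1, iy:Float = 0
	Field jx:Float = 0, jy:Float = 1
End Type

' Applies the same formula as the glVertex2f lines above to all four corners.
' Writes 8 floats (x,y for each corner) into out.
Function TransformRect(t:TTransform, x0:Float, y0:Float, x1:Float, y1:Float, tx:Float, ty:Float, out:Float[])
	out[0] = x0*t.ix + y0*t.iy + tx
	out[1] = x0*t.jx + y0*t.jy + ty
	out[2] = x1*t.ix + y0*t.iy + tx
	out[3] = x1*t.jx + y0*t.jy + ty
	out[4] = x1*t.ix + y1*t.iy + tx
	out[5] = x1*t.jx + y1*t.jy + ty
	out[6] = x0*t.ix + y1*t.iy + tx
	out[7] = x0*t.jx + y1*t.jy + ty
End Function

' Compute the corners once for a rect that never moves...
Local t:TTransform = New TTransform
Local cached:Float[] = New Float[8]
TransformRect t, 0, 0, 64, 64, 10, 20, cached

' ...then every later frame could reuse the cached values instead of
' repeating the 16 multiplies per draw call.
For Local i:Int = 0 Until 4
	Print "corner " + i + ": " + cached[i*2] + ", " + cached[i*2+1]
Next
```

With the identity transform the corners are just the rect offset by (10,20). Whether saving this arithmetic matters next to the cost of the GL/DX calls themselves is a separate question that only a measurement can answer.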


GaryV(Posted 2008) [#7]
What BlitzMax needs is a "Fast" Max2D version...
Is this a hint of a new Brucey module in the works? :p


MGE(Posted 2008) [#8]
hmm...I've done similar tests and didn't get a faster performance. Time for a new benchmark Jake? ;)


plash(Posted 2008) [#9]
@Brucey: That doesn't explain why using an image is faster.. In fact, DrawImage seems to be using even more calculation.

DrawImage:
	Local x0#=-image.handle_x,x1#=x0+image.width
	Local y0#=-image.handle_y,y1#=y0+image.height
	Local iframe:TImageFrame=image.Frame(frame)
	If iframe iframe.Draw x0,y0,x1,y1,x+gc.origin_x,y+gc.origin_y


TImageFrame.Draw:
		Assert seq=GraphicsSeq Else "Image does not exist"
		EnableTex name
		glBegin GL_QUADS
		glTexCoord2f u0,v0
		glVertex2f x0*ix+y0*iy+tx,x0*jx+y0*jy+ty
		glTexCoord2f u1,v0
		glVertex2f x1*ix+y0*iy+tx,x1*jx+y0*jy+ty
		glTexCoord2f u1,v1
		glVertex2f x1*ix+y1*iy+tx,x1*jx+y1*jy+ty
		glTexCoord2f u0,v1
		glVertex2f x0*ix+y1*iy+tx,x0*jx+y1*jy+ty
		glEnd



Brucey(Posted 2008) [#10]
That doesn't explain why using an image is faster

No idea m8 :-)

All I can see is lots and lots of calculations going on per call. Someone tell me that is efficient?


Grey Alien(Posted 2008) [#11]
All I did in my game was replace DrawRect with DrawImage (same size) and my FPS shot up (with VSync off). Same with DrawLine. I think it's worse on some video cards for some reason.

Also yes DrawText is slow. Whenever I replace it with my Bitmap Font code I always see a speed increase.


ImaginaryHuman(Posted 2008) [#12]
The difference between drawrect and drawimage is simply a matter of whether texture mapping is enabled. It could be that switching texture mapping off to draw the rectangle and switching it back on to draw images is where the speed hit occurs, whereas leaving it enabled and drawing a small image is perhaps quicker. But otherwise there shouldn't be any reason why drawing a non-textured rectangle is slower than a textured one. Drawing lines shouldn't be that slow either. Maybe it's graphics card dependent?
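
One way to test the state-switching idea is to draw the same rects and images either grouped or strictly alternating and compare the totals. The following is only a sketch: the FRAMES and COUNT values are arbitrary, the white image is built in code so no file is needed, and results will vary by driver and card.

```blitzmax
SuperStrict
Graphics 800,600,0

' Build a plain white 64x64 image in code.
Local img:TImage = CreateImage(64,64)
Local pix:TPixmap = LockImage(img)
pix.ClearPixels($FFFFFFFF)
UnlockImage img

Const FRAMES:Int = 50     ' arbitrary: frames per test
Const COUNT:Int = 1000    ' arbitrary: draws of each kind per frame

' Draw once before timing so the texture upload is not measured.
Cls
DrawImage img,0,0
Flip 0

' Case A: all rects, then all images (texturing toggles rarely).
Local t0:Int = MilliSecs()
For Local f:Int = 0 Until FRAMES
	Cls
	For Local i:Int = 0 Until COUNT
		DrawRect 0,0,64,64
	Next
	For Local i:Int = 0 Until COUNT
		DrawImage img,0,0
	Next
	Flip 0   ' no vsync; timing through Flip counts queued work
Next
Local grouped:Int = MilliSecs() - t0

' Case B: rect, image, rect, image... (texturing toggles on every draw).
t0 = MilliSecs()
For Local f:Int = 0 Until FRAMES
	Cls
	For Local i:Int = 0 Until COUNT
		DrawRect 0,0,64,64
		DrawImage img,0,0
	Next
	Flip 0
Next
Local interleaved:Int = MilliSecs() - t0

Print "grouped: " + grouped + " ms, interleaved: " + interleaved + " ms"
```

If the interleaved case comes out clearly slower, that would support the switching explanation; if the two are about equal, the cause is probably elsewhere.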


tonyg(Posted 2008) [#13]
I thought it might be that DrawImage has a surface created already in memory (after the initial set-up) but, on testing, I don't see any difference between DX DrawImage and DrawRect other than the initial DrawImage set-up.
What code are you using for your comparison?
It could be some texture switching type shenanigans but, without the code, it's all guesswork.
<edit> P.S. Yes, BlitzMaxMax would be great.


dmaz(Posted 2008) [#14]
I don't know if this logic is accurate or not, but using a bitmap font added 2ms to my draw_screen loop, so I went back to DrawText.


Did you test with a release build? I found my single surface bitmap mod to be between 50% and 150% faster (depending on the gfx driver) than the normal DrawText, but only in release. In debug it was about the same or even slightly slower.


Trader3564(Posted 2008) [#15]
I found DrawImage works rather nicely... it may cost some FPS but it works well. Also, it's sort of required when doing pixel-accurate drawing. Textures tend to blur.


Grey Alien(Posted 2008) [#16]
The other good thing about images is you can draw them at sub-pixel coords which you cannot with lines and rects.

@tonyg: In my code I literally swapped DrawRect for DrawImage and reran and checked the FPS. Same for DrawLine. I just re-tested the DrawRect on my home PC and FPS went from 573 to 483.


ImaginaryHuman(Posted 2008) [#17]
Textures and images are the same thing.

You CAN draw rectangles and lines to sub pixel coordinates, it's just that BlitzMax rounds the coordinates before sending the command to the graphics card. A rectangle is nothing more than exactly the same `quad` of geometry that is used to draw an image, the only difference with the image is that texture mapping is switched on rather than just `flat shading`.


Grey Alien(Posted 2008) [#18]
OK so I meant you cannot draw them using native commands.


tonyg(Posted 2008) [#19]
In my code I literally swapped DrawRect for DrawImage and reran and checked the FPS.

So what example code can you provide? I ran something very simple and didn't see a problem.


Grey Alien(Posted 2008) [#20]
Sorry but I can't post the code because it's embedded in something huge and I don't have time to make an example. But it's something I've seen with DrawRect, DrawLine and DrawText multiple times.


tonyg(Posted 2008) [#21]
Sorry Jake... don't have time to create an example?
This...

measures DrawImage and DrawRect time and took about 2 mins to write. I might have made some mistakes, but it doesn't suggest a problem on my system.


Derron(Posted 2008) [#22]
tony your example misses one thing ;D

flip


Measure the second time AFTER flipping, otherwise the work that was really done may not be counted.

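SuperStrict
Graphics 800,600
Local image1:TImage=LoadImage("max.png")
Local t1:Int,t2:Int,t3:Int,t4:Int,t5:Int,t6:Int

' First draw of the image: this timing includes the one-off texture set-up
t1=MilliSecs()
DrawImage image1,0,0
Flip
t2=MilliSecs()
Cls
' Repeated DrawImage calls, timed through Flip so queued work is counted
t3=MilliSecs()
For Local x:Int=0 To 999
  DrawImage image1,0,0
Next
Flip
t4=MilliSecs()
Cls
' Same number of DrawRect calls, again timed through Flip
t5=MilliSecs()
For Local x:Int=0 To 999
  DrawRect 0,0,256,256
Next
Flip
t6=MilliSecs()
' Output order: first image draw, image loop, rect loop (all in ms)
Print (t2-t1)+" " + (t4-t3) + " " + (t6-t5)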


will do the thing - and the time for DrawRect is twice as high as with DrawImage.


bye MB


Grey Alien(Posted 2008) [#23]
Sorry Jake... don't have time to create an example?
Yep seriously, I was flat out with some stuff right then. Thanks for writing an example, it's interesting to hear MichaelB's findings.


tonyg(Posted 2008) [#24]
So it's not the time to draw we're measuring but time to copy the backbuffer?
Using your example shows Drawimage 3*slower than drawrect on my machine. If I cut the loop to 9 draws then drawrect is drawimage+5ms.
For these results it could be some weird DX thing as I don't get the same results in OGL... but I don't get the same results as GA so no point speculating.
It's important that the person reporting the problem add code showing the issue as there are so many permutations of what could be happening.


Derron(Posted 2008) [#25]
Without Flip you can't compare drawimage and drawrect.

Depending on the graphics mode, the processing (data sent to the GPU and so on) can happen when running "drawimage" and sometimes only when running "flip".
So it's comparing bananas to apples on how peachy they taste.

But as you mention, including flip in the measurement also measures possible problems in the graphics pipeline.


And yes, I get the same results within my programs as GA. I also exchanged my DrawRects with SetColor + DrawImage (full white sprite) or preprocessed ones. Overall this gave me a few percent more total FPS (depending on the number of DrawRects used).

As I remember, it was more noticeable in DX7 compared to DX9 and OGL.

And it was the same on NVidia and ATI (consumer class).


PS: console: 5 30 68
(1 Image, 1000 Images, 1000 Rects)

bye MB
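
The SetColor + white sprite replacement described above could look roughly like this. It is a sketch, not anyone's actual game code: DrawRectViaImage and whiteImg are hypothetical names, the 8x8 size is arbitrary, and the helper deliberately ignores any scale that is already set.

```blitzmax
SuperStrict
Graphics 800,600,0

' Small all-white image built in code; flags=0 means no masking or filtering.
Global whiteImg:TImage = CreateImage(8,8,1,0)
Local pix:TPixmap = LockImage(whiteImg)
pix.ClearPixels($FFFFFFFF)
UnlockImage whiteImg

' Draws a w x h rectangle at x,y by stretching the white image.
' The colour comes from the current SetColor, just like DrawRect.
Function DrawRectViaImage(x:Float, y:Float, w:Float, h:Float)
	Local oldSX:Float, oldSY:Float
	GetScale oldSX, oldSY                 ' remember the caller's scale
	SetScale w / ImageWidth(whiteImg), h / ImageHeight(whiteImg)
	DrawImage whiteImg, x, y
	SetScale oldSX, oldSY                 ' restore it afterwards
End Function

While Not KeyHit(KEY_ESCAPE) And Not AppTerminate()
	Cls
	SetColor 255,0,0
	DrawRectViaImage 100,100,48,48        ' red 48x48 "rect"
	SetColor 255,255,255
	Flip
Wend
```

The extra GetScale/SetScale calls have a cost of their own, so a pre-made image at the final size (as in the first post) avoids that overhead when the rectangle size never changes.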


tonyg(Posted 2008) [#26]
.. and my results...
6 44 13
with 7600GT.
In DX mode drawrect is doing trianglestrip vs drawimage trianglefan so there shouldn't be a huge difference. Both display a quad, one textured, the other not.
I thought my results were due to GC kicking in but same results when suspended.
What I am saying is the question :
Why is drawrect/drawline so much slower than drawimage

is going to be answered with... "It depends".
It's going to depend on the code, the graphics card, card settings, Bmax version (I am 1.3.0 non-SVN), drivers etc etc.
If we see different results with same code and same Bmax level then, I guess, we're looking at non-Bmax reasons which is when we'd have to compare HW specs.


Grey Alien(Posted 2008) [#27]
I'm not sure the test code is good unless you use a 256x256 image. When I was using an image in-game it was 48x38 (so a 64x64 texture). Here's another version of the code:

SuperStrict
Graphics 800,600,0
Local image1:timage=LoadImage("64x64white.png")
Local t1:Int,t2:Int,t3:Int,t4:Int
Const LOOP_SIZE%=100
Const FLIP_AFTER%=0

While Not KeyHit(key_space)
	Cls
	DrawText "press space to start",0,0
	Flip
Wend

'Send to VRAM
Cls
DrawImage image1,0,0
Flip

t1=MilliSecs()
For Local x:Int=0 To LOOP_SIZE-1
  Cls
  DrawImage image1,0,0
  If Not FLIP_AFTER Then Flip -1
Next
If FLIP_AFTER Then Flip -1
t2=MilliSecs()

'Send to VRAM
Cls
 DrawRect 0,0,64,64
Flip

t3=MilliSecs()
For Local x:Int=0 To LOOP_SIZE-1
  Cls
  DrawRect 0,0,64,64
  If Not FLIP_AFTER Then Flip -1
Next
If FLIP_AFTER Then Flip -1
t4=MilliSecs()

Print (t2-t1)+" " + (t4-t3)


The image is a white 64x64 png.

I like to let benchmark apps boot up and prompt the user for a key press to ensure all system processes have "settled down" before running a test.

I've also made sure that the image is sent to VRAM with a test draw before the main draw. I've done the same with the DrawRect command for consistency but it's probably not needed.

With FLIP_AFTER=0 I get 1667 1665. With FLIP_AFTER=1 I get 860 859 with a loop size of 10000.

So I tried another test where I draw multiple of each, then FLIP and I do that 100 times:

SuperStrict
Graphics 800,600,0
SetBlend ALPHABLEND
Local image1:timage=LoadImage("64x64white.png")
Local t1:Int,t2:Int,t3:Int,t4:Int
Const LOOP_SIZE%=100
Const LOOP2_SIZE%=100

While Not KeyHit(key_space)
	Cls
	DrawText "press space to start",0,0
	Flip
Wend

'Send to VRAM
Cls
DrawImage image1,0,0
Flip

t1=MilliSecs()
For Local i:Int=0 To LOOP_SIZE-1
	For Local j:Int=0 To LOOP2_SIZE-1
	  Cls
	  DrawImage image1,0,0
	Next
	Flip
Next
t2=MilliSecs()

'Send to VRAM
Cls
 DrawRect 0,0,64,64
Flip

t3=MilliSecs()
For Local i:Int=0 To LOOP_SIZE-1
	For Local j:Int=0 To LOOP2_SIZE-1
	  Cls
	  DrawRect 0,0,64,64
	Next
	Flip
Next
t4=MilliSecs()

Print (t2-t1)+" " + (t4-t3)

End


I got: 1675 and 1672

Note that I've added in SetBlend AlphaBlend as that's what I was using in my game.

So I'm basically getting the SAME. Fascinating. So something else in my game/setup may have been having an effect on the overall drawing speed somehow, weird because I literally substituted one drawrect for a drawimage! This:

DrawRect c.sx,c.sy,48,48

for this:
DrawImage(AreaAffectedImage.Image,c.sx,c.sy)


and there is nearly 100FPS difference. Go figure.


Derron(Posted 2008) [#28]
Same here... your test: nearly the same running time (1676 1678), but within the application it's kind of odd to see a noticeable drop in FPS.


bye MB


tonyg(Posted 2008) [#29]
... profile it.
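
A crude way to do that without a profiler is to accumulate MilliSecs() per section over many frames. The sketch below uses placeholder drawing and an arbitrary 500-frame cap; as noted earlier in the thread, draw calls may only be queued, so some of their cost can show up under Flip instead.

```blitzmax
SuperStrict
Graphics 800,600,0

Local otherMs:Int, rectMs:Int, flipMs:Int, frames:Int

While Not KeyHit(KEY_ESCAPE) And frames < 500
	Local t0:Int = MilliSecs()
	Cls
	' ... the rest of the game's update and drawing would go here ...

	Local t1:Int = MilliSecs()
	' Section under suspicion (placeholder: 100 rects per frame)
	For Local i:Int = 0 Until 100
		DrawRect i*4, 100, 48, 48
	Next

	Local t2:Int = MilliSecs()
	Flip 0   ' no vsync, so waiting for the display does not hide the cost
	Local t3:Int = MilliSecs()

	' MilliSecs() is too coarse for one frame, so sum over all frames.
	otherMs :+ t1 - t0
	rectMs :+ t2 - t1
	flipMs :+ t3 - t2
	frames :+ 1
Wend

Print "frames: " + frames
Print "other: " + otherMs + " ms, rects: " + rectMs + " ms, flip: " + flipMs + " ms"
```

Swapping the DrawRect section for the DrawImage version and comparing all three totals, not just the middle one, should show where the extra time goes.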