Max2d performance?

GW(Posted 2005) [#1]
I really like playing with the new language features of BMax, but I've found the graphics performance to be just plain unusable [win32]. I realize that it's still in beta, but with the Mac version already released and all versions using the same codebase, I suspect it's not going to change much at release. All of the examples run well, but framerates aren't very good at all. I was considering porting my current B3D game to BMax because it would benefit greatly from the new language features, but I don't think I'll be doing that until I can find another graphics solution. In my tests, I take a large image and break it up into tiles to use as a background map, then in the rendering loop draw all of the tiles based on a moving central location. With identical code in B3D using DirectDraw I get 500+ fps; doing the same thing with pixel-perfect textured quads I get 400+ fps. The BMax implementation maxes out at 15 fps! I've tweaked and optimized for 2 days trying to squeeze some more juice out of it. I also have updated drivers and have no trouble with other OpenGL games, including Doom 3 (meaning that a crappy setup is not the culprit).
Any ideas on what options are available? Is anyone working on a DirectX-based graphics lib (Ogre maybe)? Do you think this is a Max2D issue or maybe an OpenGL issue?
I'd appreciate any comments or suggestions.


xlsior(Posted 2005) [#2]
- What make/model video adapter do you have?
- What is your computer's CPU speed?

On 'good' graphics cards OpenGL is close to the same speed as DirectX, but on lower-end ones it unfortunately performs a LOT worse.

Especially if you have one of the 'cheap' setups that use a basic integrated video card rather than a separate AGP or PCI Express card, performance is often less than stellar...

(The game I'm working on now gets between 1400 and 1600 FPS on my Athlon 2800+/Radeon 9600Pro, but between 5-8 FPS on my P600 laptop with ATI Rage Mobility)


Bot Builder(Posted 2005) [#3]
Well, if Doom 3 can run, it's prolly OK systemwise.

could perhaps be implementation or something... runs fine here.


xlsior(Posted 2005) [#4]
"Well, if Doom 3 can run, it's prolly OK systemwise."


Good point, missed that one.

Something else:

The syntax of the 'Graphics' command changed slightly: you can now also specify the refresh rate, and BlitzMax will automagically try to time its screen updates to fit it.

Maybe you're trying to specify 16bit color, but are accidentally telling it to limit your framerate to 16 FPS (16 Hz)?

Try using just Graphics 800,600 without any additional parameters, and see if that makes a difference in the speed you're getting...

Alternatively, what kind of FPS are you seeing after calling this statement:

bglSetSwapInterval(0)

(It will disable the vertical wait and give you the 'real' speed of your screen updates, regardless of the actual refresh rate of your video mode, although it doesn't seem to work on all cards.)


GW(Posted 2005) [#5]
Here's an example.
If you want, I can provide Blitz DirectDraw and B3D pixel-perfect sprite versions, implemented the same way, that beat the Max version by 50x. If it's an issue with rebuilding all the textures in VRAM every frame, then it's a Max2D thing and not a VRAM thing; I have plenty of it, and the D3D version uses it properly.

Strict
Graphics 800,600,0,0
bglSetSwapInterval(0)

Global BMW
Global BMH
Global tileW
Global tileH
Const NUMTILES = 8	'// 8 tiles per side gives the best performance
Const MOVERATE = 12
Global OX#
Global OY#

Global aMap:TImage[NUMTILES, NUMTILES]
Print "loading bigmap.."
Load_Big_map("Image.jpg")	'// use any large image, something like 2600 x 2600 pixels

While Not KeyHit(KEY_ESCAPE)
	GetOrigin(OX,OY)
	If KeyDown(KEY_RIGHT) Then SetOrigin(OX-MOVERATE ,OY)
	If KeyDown(KEY_LEFT) Then SetOrigin(OX+MOVERATE ,OY)
	If KeyDown(KEY_UP) Then SetOrigin(OX,OY+MOVERATE )
	If KeyDown(KEY_DOWN) Then SetOrigin(OX,OY-MOVERATE )

	drawmap()
	DrawText fps(),500-OX,500
	
	bglSwapBuffers
	FlushMem
Wend

Global FPS_fpstime
  
Function FPS()
	Local oldtime
	Local elapsed
	Local fps_fps
	oldtime=FPS_fpstime
	FPS_fpstime=MilliSecs()
	elapsed=FPS_fpstime-oldtime
	If Not elapsed elapsed=1
	FPS_fps=1000/elapsed
	Return FPS_FPS
End Function

'------------------------------------------------------------------
Function Load_Big_map(file$)
	Local pmBig:TPixmap = LoadPixmap(file$)
	Local X
	Local Y

	BMW = PixmapWidth(pmBig)
	BMH = PixmapHeight(pmbig)
	TileW = PixmapWidth(pmBig) / NUMTILES
	TileH = PixmapHeight(pmbig) / NUMTILES
	Print tileW + "  " + TileH
	For X = 0 To NUMTILES -1
		Print X
		For Y = 0 To NUMTILES -1
			Print " " + y
			aMap[X,Y] = LoadImage( PixmapWindow(pmBig, TileW*X,TileH*Y,TileW ,TileH),-1 )
		Next
	Next		
End Function
'------------------------------------------------------------------
Function DrawMap()
	Local X
	Local Y
	For X = 0 To NUMTILES -1
		For Y = 0 To NUMTILES -1
			'If (X*TileW > OX) And (Y*TileH > OY) Then
				DrawImage(aMap[X,Y],X*TileW,Y*TileH)
			'End If    		
		Next
	Next
End Function




Difference(Posted 2005) [#6]
I get 1000 on the above code.

Radeon 9600XT 128 MB, Win XP Pro


LarsG(Posted 2005) [#7]
I get between 500 and 1000 FPS on my laptop
(ATI Radeon 9600 Mobility (M10), 64 MB)


xlsior(Posted 2005) [#8]
In windowed mode it alternates between 500 and 1000 (mostly 1000), in full screen mode it's a steady 1000. ATI Radeon 9600Pro, 128MB


(Note: On some video cards, there is a pretty major speed penalty for running 3D programs in a window.

If you are seeing slow speeds, try switching to full screen mode as well, and see if that makes any difference. In my own game, I get 1600 FPS in full screen mode, while the same code in windowed mode gives me about 960 FPS.)
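
One possible reason the readings above sit at exactly 500 or 1000: the FPS() function in the posted code divides 1000 by the whole number of milliseconds since the previous frame, so a 1 ms frame reports 1000 and a 2 ms frame reports 500, with nothing in between. Below is a minimal sketch of an alternative that counts frames over a full second instead. The names AverageFPS, frameCount, lastTime and currentFPS are made up for illustration and are not part of BlitzMax.

```blitzmax
Strict

Global frameCount:Int                  ' frames drawn since the last update
Global lastTime:Int = MilliSecs()      ' timer value at the last update
Global currentFPS:Int                  ' frame count for the last full second

' Call once per frame. Returns the number of frames drawn during
' the most recently completed one-second window (0 until the first
' window has finished).
Function AverageFPS:Int()
	frameCount :+ 1
	Local now:Int = MilliSecs()
	If now - lastTime >= 1000 Then
		currentFPS = frameCount
		frameCount = 0
		lastTime = now
	End If
	Return currentFPS
End Function
```

To try it, call AverageFPS() in place of fps() in the DrawText line of the main loop. The value only changes once per second, which makes slow and fast machines easier to compare.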


teamonkey(Posted 2005) [#9]
You've run out of memory - almost certainly. You have to remember that your images will be rounded up in size to the nearest power of two.

If you've got a 2600x2600-pixel image and are splitting it up into 8x8=64 blocks, each block is 325x325 pixels. These blocks are rounded up in size to the next power of two: 512x512. In 32-bit graphics modes, each 512x512 texture takes up 1MB of VRAM, but only 412k of that is tile data. If you run out of VRAM, the texture data has to be shunted from system RAM to VRAM each cycle, which is very slow!

I reckon that if you split your image into blocks of a fixed size (say 64 or 128 pixels wide) you'll see a big increase in performance.
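
The arithmetic above can be checked with a short sketch. It assumes, as the post describes, that each tile is padded up to the next power of two and stored at 4 bytes per pixel; NextPow2 is a helper written for this example, not a BlitzMax built-in, and the 2600-pixel image size is the example figure from the posted code.

```blitzmax
Strict

' Hypothetical helper: round n up to the next power of two.
Function NextPow2:Int(n:Int)
	Local p:Int = 1
	While p < n
		p :* 2
	Wend
	Return p
End Function

Const IMAGE_SIZE:Int = 2600      ' source image is 2600 x 2600 pixels
Const NUMTILES:Int = 8           ' tiles per side, as in the posted code
Const BYTES_PER_PIXEL:Int = 4    ' 32-bit colour

Local tileSize:Int = IMAGE_SIZE / NUMTILES                     ' 325
Local texSize:Int = NextPow2(tileSize)                         ' 512
Local usedBytes:Int = tileSize * tileSize * BYTES_PER_PIXEL    ' 422500
Local texBytes:Int = texSize * texSize * BYTES_PER_PIXEL       ' 1048576
Local totalBytes:Int = texBytes * NUMTILES * NUMTILES          ' 67108864

Print "Tile size:         " + tileSize + " x " + tileSize
Print "Padded texture:    " + texSize + " x " + texSize
Print "Tile data (KB):    " + (usedBytes / 1024)
Print "Texture size (KB): " + (texBytes / 1024)
Print "All tiles (MB):    " + (totalBytes / (1024 * 1024))
```

Changing NUMTILES to 10 gives 260-pixel tiles, which still pad to 512, so more tiles per side is not automatically cheaper. Tile sizes that are already powers of two (64, 128, 256) waste nothing.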


xlsior(Posted 2005) [#10]
That does sound pretty likely. It would take up ~64 MB of video RAM. My machine has 128MB and can run this at a pretty decent speed, but if you have 64MB or less this code is bound to run into speed issues.

(What happens if you force a 16 bit video mode? Wouldn't that cut the memory requirements in half as well?)


LarsG(Posted 2005) [#11]
I've got a 64 MB gfx card, and it ran just fine on mine (see above).
Note that I used a 2400x2400, 24-bit JPG picture.
Gonna try with a slightly larger image (and/or change the bit depth).


teamonkey(Posted 2005) [#12]
Dropping down to 16-bit should help, yes.

It might not store all of those textures in VRAM at one time, but the more memory you have in your video card the faster it will go. In fullscreen mode you've got the front and back buffers as well as texture data (1.8MB each at 800x600x32). In windowed mode you've also got a desktop taking up VRAM.

Having said that, it runs OK on my 32MB Radeon 9000M. Not great, but faster than 15fps in windowed mode.
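
For a rough feel for the buffer figures above, here is a small sketch that works out the size of one 800x600 buffer at 32-bit and 16-bit depth. BufferBytes is a name invented for this example, and the sketch assumes a buffer is simply width x height x bytes per pixel with no padding or extra depth buffer, which real drivers may not honour.

```blitzmax
Strict

' Hypothetical helper: bytes needed for one colour buffer,
' assuming tightly packed pixels and no driver padding.
Function BufferBytes:Int(width:Int, height:Int, bitsPerPixel:Int)
	Return width * height * (bitsPerPixel / 8)
End Function

Local depths:Int[] = [32, 16]

For Local depth:Int = EachIn depths
	Local oneBuffer:Int = BufferBytes(800, 600, depth)
	' Front buffer plus back buffer
	Local bothBuffers:Int = oneBuffer * 2
	Print "Depth " + depth + ": one buffer = " + (oneBuffer / 1024) + " KB, front + back = " + (bothBuffers / 1024) + " KB"
Next
```

At 32-bit one buffer comes to 1,920,000 bytes, which is the roughly 1.8MB quoted above; at 16-bit it is half that. Whether textures also shrink when the display depth drops depends on the format the driver picks, so treat the halving as an estimate.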


GW(Posted 2005) [#13]
Thanks for the feedback. If those of you who tested it are getting good framerates, then I'll continue to look for the issue here with my setup.
Is there a command similar to B3D's AvailVidMem() for BMax?


teamonkey(Posted 2005) [#14]
No, there isn't an equivalent in OpenGL because it manages its textures in a different way.

What's your graphics card?


GW(Posted 2005) [#15]
I use a GeForce GTS/32MB and a Ti4200/128MB. I swap them out frequently to help gauge performance as I'm building my game.


teamonkey(Posted 2005) [#16]
Hmmm. The GTS might have problems, being an older card with a slower AGP bus. The Ti4200 should have no problem though.


AdrianT(Posted 2005) [#17]
My old GF2 GTS has very poor performance in BMax. In the old rockout BMax demo I got less than 5 FPS with all the settings on. GeForce 3 has no trouble at all.