Optimization issues... What's with BlitzMax?!

Trader3564(Posted 2008) [#1]
Greetings,

I don't mean to spam my game around the forum, but yeah, I'm working on it :-)

So, I ran into a new issue. When I use DX7 on a 2.4GHz Celeron laptop with 64MB Intel onboard graphics, with Flip 1, 60Hz, 800x600, using the DrawImage command about 650 times to build the screen and then adding the HUD, I get a horrible 50FPS, and it is still dropping.
When I remove collisions I gain 5 FPS, when I remove the HUD I gain 5 FPS, and removing basically any scrolling-correction code gains me 1 or 2 FPS. What I'm saying is: the way my RPG map is drawn, which consists of 19*17 tiles, of which 2 layers are effectively being drawn (various tiles over various layers), takes quite a performance hit. The map is read from a TBank via a TMap, where each tile is 1 byte. I use functions such as Mod, Floor, Ceil and While/Wend to run through the tiles and draw the map, which is grid-based in memory but scrolls seamlessly on the screen, pixel by pixel... Because of the scrolling I cannot render it once (e.g. pre-render the map using pixmaps or a TImage; I tried that, it's horribly slow). Please keep in mind that although the map looks as if it were one piece, it is made up of zones, each 19*17 in size... Any ideas?

I'm kind of surprised: wasn't BlitzMax supposed to OWN graphics-wise? Where's the hardware acceleration?
I've seen other 2D engines using hardware acceleration doing a lot better. Or is it my laptop? If so, is it the MHz, the limited memory (64MB) or the Intel chipset? What about Macs?! They have 32MB by default, right? This game needs to run on older systems and cross-platform. Not to mention I still need to add "the game". What you see here is just the map and 2 oversized PNGs that make up the HUD, which at the moment is static.

-Edit-:: Turning off PaintShop Pro, which was running in the background, so that I had a "clean" Windows XP Pro running, gained me 5FPS. So it runs at around 55 FPS, with all settings as described above.

-Edit2-:: Switching to windowed mode leaves me with 30FPS instead of 55. However, pressing CTRL+SHIFT+D to toggle the collision layer's visibility (causing the engine to draw around 300 extra tiles) DIDN'T make any difference, which confuses me. If the tiles were the performance hit, this action should have lowered the FPS. But I noticed that when I go fullscreen and do this, I do indeed lose 5 FPS. Turning it off again gains me back 5 FPS.

You can test yourself using the compiled version with configurator:
http://www.fantasaar.com/download/fantasaar%20alpha%200.0.8b.zip

-Edit3-:: Amazing, even switching between 16bit and 32bit makes a difference in FPS. When I set it to 16bit, I gain 5FPS, so it runs at 55FPS.

(UPS and FPS are definitely different; however, in THIS screenshot they happened to be the same. FPS could just as well have been 58 or 44.)




Sledge(Posted 2008) [#2]
How are you plastering that text up there? Also, why a TBank and not a 2d array of bytes?


Trader3564(Posted 2008) [#3]
The text is printed using the DrawText command.
The text in the dialog is added with PaintShop Pro to see how it would look. :)


Sledge(Posted 2008) [#4]
Half of me wants to tell you to drop DrawText and use a bitmap font, the other half thinks that true-types are so nice 'n' straightforward to use that it's worth the probable trade-off in speed. Especially as it's an RPG and could easily get away with being 30FPS.


Trader3564(Posted 2008) [#5]
So, using a bitmap font would actually speed things up? In that case I totally want to use bitmap fonts. However, please remember that in this demo no text was used except the debugging text at the top left. So whatever I do, I've got to lose. But generally speaking, a bitmap font is faster? What do you recommend? (I'm keeping an eye on the new Fontmaker for Blide...)


D4NM4N(Posted 2008) [#6]
Where's the hardware acceleration?

I thought BlitzMax 2D was actually using 3D hardware?
If your card doesn't support hardware OGL 3D properly then it will be slow. Older (and some newer) Intel GPUs are usually the most guilty of this.


Who was John Galt?(Posted 2008) [#7]
When I was using B3D, B+, I found drawing a lot of tiles really slowed things down. Fill rate wasn't a problem, but splitting things up into a lot of small tiles was. This may be what you're seeing. I'm assuming you only draw the onscreen tiles?


Trader3564(Posted 2008) [#8]
I found drawing a lot of tiles really slowed things down. Fill rate wasn't a problem, but splitting things up into a lot of small tiles was.

Sounds very interesting... I don't fully get it though. You mean that drawing them isn't the problem, but aligning them (the math)? Or did you mean the use of an AnimImage? Let me know in detail! Thanks :)

Yes, I only draw what is on the screen, lol. Plus 1 extra row and column, though, which are shifted to allow seamless scrolling of tiles.

@D4NM4N, how do I check if this laptop supports OGL 3D? I happen to be able to run with OpenGL drivers here, so I assume that is the best test? It means I support it, no?


Dreamora(Posted 2008) [#9]
Your laptop does not even support DX7 Direct3D hardware acceleration properly (TnL for example).
64MB is at best a GMA900 -> no OpenGL acceleration (Microsoft OpenGL 1.1 emulation);
at worst it's something older that does not have any 3D acceleration at all.

BlitzMax does all of its 2D through 3D, with not a single call to a 2D API (unless you only use pixmaps), so no proper 3D card = no proper 3D performance.


Who was John Galt?(Posted 2008) [#10]
/\ That will be your problem, then.

Yeah, I meant lots of individual calls to DrawImage were bad.


Trader3564(Posted 2008) [#11]
Thanks for all the replies! Really encouraging. So it seems to be more of a hardware issue. I also learned that it is better to avoid using 650+ DrawImage commands, lol. I will see if it is possible to somehow pre-render zones. But I cannot use pixmaps, as that would be a complete disaster in this case, although it doesn't hurt to try that again.

P.S. I included diagnostics of my hardware.



popcade(Posted 2008) [#12]
If your game uses standard ASCII characters, it's better to use BITMAP text, which typically renders faster than BMax's FreeType routine (although that's still faster than Blitz3D).

FONText is completely free.
http://www.blitzbasic.com/Community/posts.php?topic=65909

If you're using Unicode characters like Chinese/Japanese, plain DrawText is the better solution, as it's hard to implement a Unicode-friendly solution yourself.
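
For the ASCII case, a bitmap-font routine can be sketched roughly as below. This is only an illustration, not FONText's method: the file name "font.png", the 16x16 glyph size and the layout (ASCII 32 to 127 in order on the sheet) are all assumptions, and DrawBitmapText is a hypothetical helper name.

```blitzmax
SuperStrict

' Assumptions: "font.png" is a hypothetical sheet of 16x16 glyphs that holds
' ASCII 32..127 in order, left to right, top to bottom.
Const GLYPH_W:Int = 16
Const GLYPH_H:Int = 16
Const FIRST_CHAR:Int = 32
Const GLYPH_COUNT:Int = 96

' Hypothetical helper: draws each character of 'text' as one frame of 'font'.
Function DrawBitmapText(font:TImage, text:String, x:Int, y:Int)
	For Local i:Int = 0 Until text.length
		Local frame:Int = text[i] - FIRST_CHAR
		' Skip characters the sheet does not contain.
		If frame >= 0 And frame < GLYPH_COUNT Then DrawImage font, x + i * GLYPH_W, y, frame
	Next
End Function

Graphics 800, 600

Local font:TImage = LoadAnimImage("font.png", GLYPH_W, GLYPH_H, 0, GLYPH_COUNT)
If font = Null Then RuntimeError "font.png not found"

While Not KeyHit(KEY_ESCAPE)
	Cls
	DrawBitmapText font, "FPS: 55", 8, 8
	Flip
Wend
End
```

Fixed-width glyphs keep the position math trivial; a proportional font would need a per-glyph width table on top of this.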


Czar Flavius(Posted 2008) [#13]
Don't test any performance except with debug mode off - that will show you how it will REALLY run as an end product.


Trader3564(Posted 2008) [#14]
Debugging is off; the debugging flag in the game is my own debugging. I don't like compiling with Blitz debugging as it's terribly slow. My own debugging basically just shows some variables and hidden objects.

@yoko, but bitmap fonts give you full possibilities to edit and change the fonts, so why use DrawText anyway? And you can just as well render a font to a bitmap font.


Bremer(Posted 2008) [#15]
I do not know how your images are loaded. But if you are using one image per tile or a BMax AnimImage, then it will create one texture per tile, and state changes take a lot of time. So if you do not already use a system where all tiles are on the same image (texture) and you draw using parts of that texture, then you could potentially gain a good deal of frames by changing to one.
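
A minimal sketch of the single-texture approach, under some assumptions: it relies on DrawSubImageRect, which ships with later BlitzMax releases and may not be in the version used in this thread; "tilesheet.png" and the DrawTile helper are hypothetical names; and the sheet is assumed to hold 32x32 tiles in rows.

```blitzmax
SuperStrict

' Assumption: "tilesheet.png" is a hypothetical sheet of 32x32 tiles.
Const TILE_SIZE:Int = 32

' Hypothetical helper: draws tile number 'index' from the sheet at x, y.
Function DrawTile(sheet:TImage, tilesPerRow:Int, index:Int, x:Int, y:Int)
	' Source rectangle of the tile inside the sheet.
	Local sx:Int = (index Mod tilesPerRow) * TILE_SIZE
	Local sy:Int = (index / tilesPerRow) * TILE_SIZE
	DrawSubImageRect sheet, x, y, TILE_SIZE, TILE_SIZE, sx, sy, TILE_SIZE, TILE_SIZE
End Function

Graphics 800, 600

' Load the sheet as ONE image, so every tile is drawn from one texture.
Local sheet:TImage = LoadImage("tilesheet.png", 0)
If sheet = Null Then RuntimeError "tilesheet.png not found"
Local tilesPerRow:Int = ImageWidth(sheet) / TILE_SIZE

While Not KeyHit(KEY_ESCAPE)
	Cls
	DrawTile sheet, tilesPerRow, 5, 100, 100
	Flip
Wend
End
```

Whether this beats LoadAnimImage on a given card is something to measure rather than assume, since it only removes texture switches and not the per-tile draw calls.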


Trader3564(Posted 2008) [#16]
I'm using a BMax AnimImage. Isn't that the best solution? What's with the textures? Isn't that handled by the AnimImage?


Sledge(Posted 2008) [#17]
He means to manage the graphics manually via OpenGL, I think.


Eikon(Posted 2008) [#18]
Don't mess with OpenGL unless you want it to get even slower on the integrated graphics card. Focus on optimization in DX. You could try loading all the tiles as a single image and drawing from the proper section with a custom DrawImageRect command (many are around), but from my experience trying to replace LoadAnimImage never produced a big enough boost in speed to warrant going to the trouble. The fact is that rendering double tile layers or more at 800x600 is going to get slow in any version of Blitz.


altitudems(Posted 2008) [#19]
Maybe you already thought of all this, but just in case.

1. Only draw what is visible on screen.

2. If a certain object is off screen, only update it if absolutely necessary.

3. Make sure that you are reusing (or pointing to) existing tile objects/images instead of creating objects every frame. Ideally you want to have the same tile object drawn wherever its ID is referenced in the map data. You don't want to store a distinct tile object for each cell in the map.

4. Make sure you separate logic from rendering so you can see where the bottlenecks are. Try to isolate even further if needed.

5. Depending on your design you could make tiles in mixed sizes. EX: one at 16x16, another at 256x256. So instead of drawing an image 256 times you can just draw it once. You don't have to make every tile the same size.

6. I'm not sure if AnimImage creates individual textures for each frame or just changes UVs. If it's the former, look into a custom technique. I have something that might help if you need it.
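
Point 4 above can be sketched as a simple timing harness. UpdateGame and RenderGame are hypothetical stand-ins for the game's own code, and MilliSecs() is coarse, so treat the numbers as rough indications only.

```blitzmax
SuperStrict

' Hypothetical stand-ins for the game's own logic and drawing code.
Function UpdateGame()
End Function

Function RenderGame()
End Function

Graphics 800, 600

Local logicMs:Int = 0
Local renderMs:Int = 0

While Not KeyHit(KEY_ESCAPE)
	Local t0:Int = MilliSecs()
	UpdateGame()
	Local t1:Int = MilliSecs()

	Cls
	RenderGame()
	' Shows the timings measured during the previous frame.
	DrawText "logic: " + logicMs + " ms   render: " + renderMs + " ms", 8, 8
	Flip 0 ' no vsync wait, so the wait does not inflate the render time
	Local t2:Int = MilliSecs()

	logicMs = t1 - t0
	renderMs = t2 - t1
Wend
End
```

If one of the two numbers dominates, the same pattern can be repeated inside that half (per layer, HUD, collisions) to narrow it down.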


ImaginaryHuman(Posted 2008) [#20]
Your fill rate doesn't seem to be too bad since when you switch to the overlay or whatever it doesn't drop much fps, but it's probably that you are switching textures a lot as well. Do you have all your tiles on a big tilesheet and are you drawing them as imagerects, or do you have each as a single image?


Trader3564(Posted 2008) [#21]
I have a tileset:TImage = LoadAnimImage("bigtileset.png", 32, 32, 0, numberoftiles)

For Z... 9
	For X... 19
		For Y... 17
			If tileindex <> 0 Then DrawImage(tileset, tileindex/frame index, etc...) ' only draw when there is actually something to draw
		Next
	Next
Next


ImaginaryHuman(Posted 2008) [#22]
Umm. Each time you switch to a new tile there has to be texture swapping and stuff. You should try to put as many tiles as possible onto as few images as possible and then just draw parts of those images - and make sure you are grouping together the tiles that are drawn from the same `current` image first before switching to another image. That will help somewhat, but maybe not if you are on a purely software-emulated GL implementation.


skidracer(Posted 2008) [#23]
Are you using SetBlend SOLIDBLEND before you start drawing your background blocks and setting it to alpha for the second foreground pass? In the second pass are you skipping the call to draw for blocks that are completely transparent?
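
As a rough sketch of what those two questions describe (the names tileset, backLayer and frontLayer, the flat array layout, and "index 0 means empty" are assumptions, not the poster's actual code):

```blitzmax
SuperStrict

' Assumed map size and tile size, taken from the figures given in the thread.
Const MAP_W:Int = 19
Const MAP_H:Int = 17
Const TILE:Int = 32

' Hypothetical helper: draws an opaque background pass, then an alpha pass.
Function DrawLayers(tileset:TImage, backLayer:Int[], frontLayer:Int[])
	' Pass 1: background blocks are fully opaque, so no blending is needed.
	SetBlend SOLIDBLEND
	For Local i:Int = 0 Until MAP_W * MAP_H
		DrawImage tileset, (i Mod MAP_W) * TILE, (i / MAP_W) * TILE, backLayer[i]
	Next

	' Pass 2: foreground blocks with alpha; the blend mode is set once per pass.
	SetBlend ALPHABLEND
	For Local i:Int = 0 Until MAP_W * MAP_H
		' Index 0 is assumed to mean "completely transparent": skip the draw call.
		If frontLayer[i] = 0 Then Continue
		DrawImage tileset, (i Mod MAP_W) * TILE, (i / MAP_W) * TILE, frontLayer[i]
	Next
End Function
```

The blend mode changes only twice per frame here; how much that is worth depends on the hardware, so it is best checked with an FPS counter.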


Tachyon(Posted 2008) [#24]
My game is somewhat similar to yours. Unhindered I get 100-200fps, but I lock that down to 60. I'm not sure what I'm doing differently than you. I use lots of DrawText (all text on screen is DrawText) and I draw 5 layers to my scene, plus the GUI on top.

I know that OpenGL is just plain faster. You may want to default to that and if the Graphics mode can't be created, fall back to DirectX.


DavidDC(Posted 2008) [#25]
Back in the day we used dirty rects. They a dirty word these days? ;-)


Trader3564(Posted 2008) [#26]
@Tachyon, thanks for sharing. Your game doesn't run that well on this laptop either. How can I check the FPS in your game? I know about the power of OpenGL, yet I want to have it run decently on DX7 too.

@skidracer, I think that is at least 1 thing still to do. The tiles are transparent, so I have to use an alpha mask, but I will use SOLIDBLEND for the background.


popcade(Posted 2008) [#27]
In response to:
@yoko, but bitmap fonts give you full possibilities to edit and change the fonts, so why use DrawText anyway? And you can just as well render a font to a bitmap font.

No specific reason, I just have to display Chinese and Japanese. If there's a fully working and easier solution, I'll use it.

I don't use fancy FX or gradients on the text because it's an RPG.


Trader3564(Posted 2008) [#28]
May I report back that ALPHABLEND is faster than MASKBLEND!!! I gained 3 FPS, if not 5 FPS, just by getting rid of EVERY SetBlend command and setting it once to ALPHABLEND for the entire game. I was surprised.

-edit-
I'm even more surprised that ALPHABLEND runs even 2 FPS faster than SOLIDBLEND!


Dreamora(Posted 2008) [#29]
That's mainly because you remove the need to switch states. As your GPU has no full hardware TnL, it suffers from childish stuff that even GeForce 1 / 2 cards (4 years older) had no problem with, which is a sad thing but a fact as well.
The fact that your CPU is a bad joke (mobile Celeron = most likely weaker than the last P3... your CPU compares to a 1GHz Pentium M at best) just makes the situation worse.


Ross C(Posted 2008) [#30]
Isn't it faster, with tilemaps, to draw everything to the one image, and only draw that image, only updating the master image when something changes? That's how it was in 2D.


Trader3564(Posted 2008) [#31]
Yeah, but when you scroll, each pixel changes :)


nawi(Posted 2008) [#32]
You can still save the current screen to an image, scroll that pixel by pixel and draw the other tiles every frame. When the scrolling motion is complete for one tile-width, update the main image.


Trader3564(Posted 2008) [#33]
I tried this, using pixmaps. But it's VERY slow... it's like adding lead to the engine.


Ross C(Posted 2008) [#34]
Hmmm, but if your tiles are for instance 32x32, then you would need to scroll 32 pixels before a changeover. When the changeover occurs, you simply redraw the main image back onto itself, offset (at the top, for instance) by 32 pixels, to make room for the new tiles. That way you're only drawing:

The main image,
And the tiles along the top.

But I'm afraid I don't have BMax, so I may be way off about speed.


Retimer(Posted 2008) [#35]
After checking it out I noticed a couple things:

It doesn't seem like the performance gets any better when I walked to the corner of the map (where less tiles should be rendered).

-Only render tiles that are used/on screen
-If you are using multiple layers rather than ordered rendering (to have the top of the tree show in front of the sprite), make sure the rendering loop for them is optimized. Possibly related to your animated tile system as well.
-I noticed a dark effect on blocked tiles as well; are these supposed to be part of a developing shadow system? whatever they are, they might be slowing things down.
-You aren't rendering black tiles for empty space either correct?

In windowed mode (i use maxgui), I noticed a better difference in using the new win32maxguiex with render speed in canvas. And yeah, using pixmaps for something like this is freakin horrible, been there done that.
Ross's idea would work, but the lag (if you have any) would show when you do move, which kind of defeats the purpose; and if everything was moving it would slow things down slightly more because of having to check which areas have been updated and setting values for the areas that have.

Aside from that, I found the game extremely smooth on my end. End users with low-end PCs are going to lag regardless of how much you optimize. As long as the game is still playable, I wouldn't halt progress because of this unless someone with a PCI-E graphics card is claiming <50fps lol.


Trader3564(Posted 2008) [#36]
lol


Bremer(Posted 2008) [#37]
Using the idea from Ross with only updating when moving beyond the 32 pixels, I made a test of how to shift pixmaps around. With that method you only have to shift the image and then update the sides when moving beyond the 32 pixels.

This is purely an experiment in the graphical issue and I have not taken into consideration any game logic.

Graphics 800,600

Local pmap:TPixmap = ConvertPixmap(LoadPixmap("img.png"), PF_RGBA8888) ' the shift functions assume 4 bytes per pixel
Local image:TImage = LoadImage("img.png",0)

Local time:Int = 0
Local utime:Int = 0

Local fps:Int = 0
Local fpstmp:Int = 0
Local update:Int = MilliSecs()

While Not KeyHit(KEY_ESCAPE)

	Cls
	DrawImage(image,0,0)
	DrawText("Current FPS: "+fps,8,520)
	DrawText("Last image update time in millisecs: "+utime,8,540)
	Flip False
	
	If KeyHit(KEY_LEFT) Then
		time = MilliSecs()
		image=shiftPmapLeft(pmap,32)
		utime = MilliSecs()-time
	End If
	If KeyHit(KEY_RIGHT) Then
		time = MilliSecs()
		image=shiftPmapRight(pmap,32)
		utime = MilliSecs()-time
	End If
	If KeyHit(KEY_UP) Then
		time = MilliSecs()
		image=shiftPmapUp(pmap,32)
		utime = MilliSecs()-time
	End If
	If KeyHit(KEY_DOWN) Then
		time = MilliSecs()
		image=shiftPmapDown(pmap,32)
		utime = MilliSecs()-time
	End If

	fpstmp :+ 1
	If MilliSecs() > update+1000 Then
		update = MilliSecs()
		fps = fpstmp
		fpstmp = 0
	End If

Wend
End

Function shiftPmapLeft:TImage(pix:TPixmap,shift:Int)
	Local pixPtr:Byte Ptr = PixmapPixelPtr(pix)
	Local pitch:Int = PixmapWidth(pix)*4
	Local moveSize:Int = (PixmapWidth(pix)-shift)*4
	Local offset:Int = shift*4
	For Local y:Int = 0 To PixmapHeight(pix)-1
		MemMove(pixPtr,pixPtr+offset,moveSize) ' MemMove, since source and destination overlap within the row
		pixPtr :+ pitch
	Next
	Return LoadImage(pix,0)
End Function

Function shiftPmapRight:TImage(pix:TPixmap,shift:Int)
	Local pixPtr:Byte Ptr = PixmapPixelPtr(pix)
	Local pitch:Int = PixmapWidth(pix)*4
	Local moveSize:Int = (PixmapWidth(pix)-shift)*4
	Local offset:Int = shift*4
	For Local y:Int = 0 To PixmapHeight(pix)-1
		MemMove(pixPtr+offset,pixPtr,moveSize) ' MemMove, since source and destination overlap within the row
		pixPtr :+ pitch
	Next
	Return LoadImage(pix,0)
End Function

Function shiftPmapUp:TImage(pix:TPixmap,shift:Int)
	Local pixPtr:Byte Ptr = PixmapPixelPtr(pix)
	Local pitch:Int = PixmapWidth(pix)*4
	For Local y:Int = 0 To PixmapHeight(pix)-1-shift
		MemCopy(pixPtr,pixPtr+pitch*shift,pitch)
		pixPtr :+ pitch
	Next
	Return LoadImage(pix,0)
End Function

Function shiftPmapDown:TImage(pix:TPixmap,shift:Int)
	Local pixPtr:Byte Ptr = PixmapPixelPtr(pix)
	pixPtr :+ ((PixmapWidth(pix)*4)*(PixmapHeight(pix)-1))
	Local pitch:Int = PixmapWidth(pix)*4
	For Local y:Int = 0 To PixmapHeight(pix)-1-shift
		MemCopy(pixPtr,pixPtr-(pitch*shift),pitch)
		pixPtr :- pitch
	Next
	Return LoadImage(pix,0)
End Function


You will have to supply your own 512x512 pixel image for this test. But it shows the idea and a possible technique for it. On my computer I am getting a lot of FPS as only one image is drawn, and shifting a 512x512 pixel pixmap takes about 1 millisecond or less. So I doubt that will be the slowdown.

I did not have time to include code to update the sides of the pixmap, but you can paste pixmaps into each other, see documentation under the TPixmap type.


Trader3564(Posted 2008) [#38]
OK, this works; however, doing this in real time (KeyDown()) makes it drop to 30FPS. And this is with only a single image. Not to mention it still has to be pre-rendered.
Also, it won't work because everything is merged down into a single image, so you're missing the layers :-)


Bremer(Posted 2008) [#39]
If you do nothing else in the loop than shifting the image as often as possible, then the fps is bound to drop a lot. But in a real situation, you wouldn't be updating more than 30 or 60 times a second, so it would still be plenty fast. Eg. on my computer shifting the image 30 times per second has the loop running at 360+fps.

But I see the issue of the layers, especially if the character has to be able to walk behind trees and such, then this won't work that well. It really only works well if the character is always on the top.

You would always only be drawing one side of the map onto the image when it scrolls beyond the 32 pixels, so update time would never be much. But I don't think this is a solution usable for your game since you require the character to walk between layers. It was also more of a test to see if the suggestion from Ross would work at all.

And for other types of scrolling games it might be just the thing, but probably not for yours.


MGE(Posted 2008) [#40]
Goldstar, first off HATS off to you for developing on low end systems. A lot of coders here do the opposite and get freaked out that their BMax game runs dog slow and choppy on integrated boards. ;)

1) I'll say it again, and again and again. For all the graphics you're rendering, layers, etc, etc, a 2D multi layered/sprite world running at 50fps is GREAT! I'd go as far to say even 20FPS would be GREAT!

2) Bitmap fonts are just sprites. Make some letters like the ones you're using for tiles and sprites, and just draw them on the screen using a routine (65=A, etc.). It will be faster than using the dog slow DrawText command. There are some custom routines floating around to draw bitmap text. Do a search.

3) Why are you worried so much about such a fast frame rate for this type of game? It would still play fine if it ran at 15fps! Seriously!! If you were coding a fast action space shooter, I'd say aim for 60fps minimum, but games like this typically do run at slower frame rates on the PC.

4) Of course 16bit is faster than 32bit. Half the video memory is being thrown around.

5) Are you using Cls every loop? You don't have to if you're drawing the entire screen every loop. ;)

6) BlitzMax is NOT a rendering speed demon. Don't let anyone tell you it is. It's actually fairly slow compared to DX8/DX9 engines. If you're looking for raw rendering speed, there are better solutions out there. The biggest reason to use BMax, in my opinion, is portability and the fact that the language itself is close to perfection. ;)

7) I just ran your demo on my dev system. It's running from 48-52 frames per second. Give it a rest, it's fine. Now go finish your game! ;)


Trader3564(Posted 2008) [#41]
OK :) I take that for granted then.

*starts on the networking*


Ross C(Posted 2008) [#42]
Bring back real 2d :oP

I certainly look forward to the challenges of BMax when I finally get it.


nawi(Posted 2008) [#43]
By the way, optimisation is generally done in small steps rather than major ones, and it is pretty much impossible to tell what to optimize without seeing the code. I used to code a demo or two with old BlitzBasic, and you can easily double or triple your FPS if you care enough. General tips:
-Don't calculate values that you can precalculate. For example, before a loop, don't test (While t < func(x)); instead use (s = func(x); While t < s), etc.
-Precalculate mathematical functions like sin, cos, tan to a table.
-Using multi-dimensional arrays was dog-slow in old Blitz, and probably is in BlitzMax too. Use a single-dimensional array (or possibly memory banks, but I haven't tested their speed in BlitzMax).
-Using a function in a loop makes the code more readable, but you will lose some speed. (Often optimization is balancing between readability and speed. In this case prefer speed!)
-Don't use floats if you can use integers instead.
-Use a single loop instead of multiple nested loops where possible. Calculate the index for reading from an array etc. by hand.
-Etc.
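
A small sketch of three of these tips together (lookup table, single-dimensional array, one flat loop). The names sinTable and map are illustrative, and the 19x17 size is borrowed from the map described earlier in the thread; whether any of this is measurably faster in BlitzMax would need testing.

```blitzmax
SuperStrict

' Precalculated sine table, one entry per degree (BlitzMax's Sin takes degrees).
Global sinTable:Float[360]
For Local a:Int = 0 Until 360
	sinTable[a] = Float(Sin(a))
Next

' A 19x17 map stored in a single-dimensional array.
Const MAP_W:Int = 19
Const MAP_H:Int = 17
Global map:Int[MAP_W * MAP_H]

' One flat loop instead of two nested ones; x and y are derived by hand.
For Local i:Int = 0 Until MAP_W * MAP_H
	Local x:Int = i Mod MAP_W
	Local y:Int = i / MAP_W
	map[i] = x + y ' placeholder content
Next

' Reading a cell by hand-calculated index: y * width + x.
Print "sin(90) from table: " + sinTable[90]
Print "tile at (3,2): " + map[2 * MAP_W + 3]
```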


Perturbatio(Posted 2008) [#44]
-Don't use floats if you can use integers instead.

Don't use floats OR bytes if you can use integers instead.


Dreamora(Posted 2008) [#45]
Don't use anything but Ints if you don't have a serious reason for it (ie you need them or you want to send it over the net)


tonyg(Posted 2008) [#46]
Do all you can to prevent GC running often or having a lot to do when it runs. It might be worth running GC manually so it can be scheduled better.
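
Running the collector manually could look roughly like this. It is a sketch only: where in the frame to collect, and whether it helps at all, depends on how much garbage the game creates.

```blitzmax
SuperStrict

Graphics 800, 600

' Mode 2 = manual: nothing is collected until GCCollect() is called.
GCSetMode 2

While Not KeyHit(KEY_ESCAPE)
	Cls
	DrawText "GC memory in use: " + GCMemAlloced(), 8, 8
	Flip

	' Collect once per frame at a known point, after the frame has been shown.
	GCCollect()
Wend
End
```

Avoiding per-frame allocations in the first place (new strings, arrays and objects inside the draw loop) leaves the collector less to do whichever mode is used.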


Trader3564(Posted 2008) [#47]
What's wrong with using bytes instead of ints? In terms of transport, bytes are cheaper.


tonyg(Posted 2008) [#48]
It stems from this


Dreamora(Posted 2008) [#49]
There is none, performance-wise.