Can you help with High sprite count optimisations?

Arowx(Posted 2009) [#1]
Hi, I found that my Ludum Dare entry (source + Windows exe in zip), which quickly generates a lot of sprites, ran into major slowdown when the 'Wall of Doom' bricks multiplied to over 1000!

I put in a timer for frame and logic [top right of screen] and it looks like the drawing just slows down rapidly once you're past 1000-2000 bricks.

Are there any optimisations that I can add that would allow a screen of thousands of sprites at 30-60fps?

Currently I am just using SetScale and DrawImage for each sprite object with ALPHABLEND.

I'm also using the default BlitzMax collision system, which is tricky to grasp but works well!

This is going to be my entry into the Community Framework Compo so I'd like to make something very fast and cool!


GW(Posted 2009) [#2]
Do you have to use Bmax?
I hate to admit it, but there are techniques you can use in Blitz3D that can raise your sprite cap by x10.


ImaginaryHuman(Posted 2009) [#3]
The community competition is for BlitzMax code only.

Merx, BlitzMax's collision stuff is per-pixel, and that can get quite time-consuming when you have a lot of collisions. You might look at optimizing your collision detection routines, perhaps by using a grid or some other kind of spatial partitioning.
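
A minimal sketch of the grid idea, not anything from the game's source. The cell size, the field dimensions and the name CellIndex are all assumptions made up for illustration; the point is only that each brick maps to one cell, so it needs comparing with bricks in the same and neighbouring cells rather than with every brick on screen.

```blitzmax
SuperStrict

' Hypothetical values: pick a cell at least as big as the largest brick.
Const CELL_SIZE:Int = 64
Const GRID_W:Int = 16      ' assumed play field of 16 x 12 cells (1024 x 768 pixels)
Const GRID_H:Int = 12

' Map a position to the index of the grid cell that contains it,
' or -1 if the position lies outside the play field.
Function CellIndex:Int(x:Float, y:Float)
	Local cx:Int = Int(Floor(x / CELL_SIZE))
	Local cy:Int = Int(Floor(y / CELL_SIZE))
	If cx < 0 Or cx >= GRID_W Or cy < 0 Or cy >= GRID_H Then Return -1
	Return cy * GRID_W + cx
End Function
```

Each frame you would drop every brick into a list for its cell, then run the expensive collision test only against bricks in that cell and the eight cells around it.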

Beyond that you'll need to look to putting multiple sprites onto a single (or few) image/texture so that you don't have to keep swapping textures.


Brucey(Posted 2009) [#4]
"Do you have to use Bmax?"

Haw... I can guarantee that well written BlitzMax code will be faster than Blitz3D.


Arowx(Posted 2009) [#5]
@ImaginaryHuman - Yes, and because the collision detection is pixel based I currently don't even restrict the DrawImage calls to sprites that are onscreen!

Cheers for that, so fast collision detection...

The blocks do tend to group together. Could I tile the image over rectangular areas of them? Can BlitzMax do that, or will I just need larger pre-rendered graphics?

What about the new DX9 driver, is it stable enough and is it faster?

I also do a lot of scaling as the bricks grow, would it be faster to pre-render the scaling?

What about state changes e.g. scaling, colours, alpha would sorting and batching these produce faster results?

What about threading, is it worth looking into threading to divide and conquer e.g. Logic, Graphics, Sound, Input, AI?


GW(Posted 2009) [#6]
"Haw... I can guarantee that well written BlitzMax code will be faster than Blitz3D."

I totally agree.
But using jims SpriteMaster ss library will beat the Bmax drawing commands by a lot.
I don't know of an equivalent method for Bmax.


Jesse(Posted 2009) [#7]
I believe you would get a significant speed improvement if you used pre-rendered tiles and added your own rectangle collision. I've read some threads on how inefficient ImagesCollide is for basic collision. You have the advantage that all you are using is box and circle collision, which are about the simplest and fastest collision tests for games. Even if you decide to leave the scaling to the hardware, if you can calculate the corners of the scaled image yourself you can get the speed increase just by writing your own collision function. Most optimized collision routines do a distance check first:
If Abs(x2 - x1) > n Then Return ' n is usually the radius from the center of the object/image
If Abs(y2 - y1) > n Then Return ' to the farthest edge/corner of the object/image


Once close enough a final collision check is performed.
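
As a rough sketch of that early-out applied to scaled boxes: the function name and the centre plus half-size representation are assumptions of mine, not Jesse's code. For axis-aligned boxes the two per-axis rejections happen to be the whole test, so no per-pixel check is needed afterwards.

```blitzmax
SuperStrict

' Hypothetical helper. Each box is given by its centre (x, y) and its
' half-width / half-height, e.g. ImageWidth(img) * scale / 2 for a scaled sprite.
Function BoxesCollide:Int(x1:Float, y1:Float, hw1:Float, hh1:Float, x2:Float, y2:Float, hw2:Float, hh2:Float)
	' Reject as soon as the centres are too far apart on either axis
	If Abs(x2 - x1) > hw1 + hw2 Then Return False
	If Abs(y2 - y1) > hh1 + hh2 Then Return False
	' Overlapping on both axes means the boxes touch or intersect
	Return True
End Function
```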


slenkar(Posted 2009) [#8]
Yes, I don't think ImagesCollide does a distance check, so you do have to do it yourself.


Pete Rigz(Posted 2009) [#9]
DrawImageRect is what you want for tiling the same image over an area, which looks like it'd help a lot in your case. Nice game concept btw!
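
One thing worth checking before relying on it: as far as I can tell from the Max2D docs, DrawImageRect stretches a single copy of the image to fit the rectangle rather than repeating it. If repetition is what's wanted, a sketch along these lines using TileImage inside a temporary viewport may be closer. TileRect is a made-up helper name, and note that TileImage appears to ignore the current scale and rotation.

```blitzmax
SuperStrict

' Hypothetical helper: fill a rectangle with repeats of one image.
Function TileRect(image:TImage, x:Int, y:Int, w:Int, h:Int)
	Local ox:Int, oy:Int, ow:Int, oh:Int
	GetViewport ox, oy, ow, oh      ' remember the current viewport
	SetViewport x, y, w, h          ' clip drawing to the target rectangle
	TileImage image, x, y           ' repeat the image, aligned to the rectangle's corner
	SetViewport ox, oy, ow, oh      ' restore the previous viewport
End Function
```

This replaces many DrawImage calls with one call per group of bricks, at the cost of a viewport change per group, so it would need benchmarking against the per-sprite version.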

I'd be interested to know why b3d would be faster than bmax, seeing as it shouldn't be too difficult to replicate what b3d does in bmax. But I doubt it'd be any different speed-wise, other than bmax being a more efficient compiler. I know you could create your own single surface system in b3d, as the native sprite command would create its own surface for each sprite, which wasn't very efficient. But even with single surface you're still pumping vertices to the gfx card every frame, which is basically the same thing that throttles bmax.


Arowx(Posted 2009) [#10]
@Jesse excellent, that should work a treat as all the sprites have a game position and calculate their graphics position!

@Pete Rigz DrawImageRect sounds just ideal!


MGE(Posted 2009) [#11]
Is your code using proper fixed-rate logic or delta timing? Going from 60-30-20fps is no big deal and is very common when there is lots of rendering. It's really only noticeable when the coder does not have a proper timing routine.
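
For anyone unsure what fixed logic looks like in practice, here is a minimal sketch of one common way to do it, an accumulator that converts elapsed milliseconds into a whole number of logic steps. The 16 ms step and the names StepsDue, UpdateGame and DrawGame are assumptions for illustration only.

```blitzmax
SuperStrict

Const LOGIC_MS:Int = 16   ' assumed logic step, roughly 60 updates per second

' Add the elapsed time to the accumulator and return how many whole
' logic steps are now due; the remainder stays in the accumulator.
Function StepsDue:Int(elapsedMs:Int, accumulator:Int Var)
	accumulator :+ elapsedMs
	Local steps:Int = accumulator / LOGIC_MS
	accumulator :- steps * LOGIC_MS
	Return steps
End Function

' Assumed use in the main loop (UpdateGame and DrawGame are hypothetical):
'   Local now:Int = MilliSecs()
'   Local n:Int = StepsDue(now - lastTime, acc)
'   lastTime = now
'   For Local i:Int = 1 To n
'       UpdateGame()
'   Next
'   DrawGame() ; Flip
```

With this arrangement the game logic runs at the same speed whether the renderer manages 60, 30 or 20 fps; only the smoothness of the drawing changes.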

Scaling takes a hit, rotation takes a hit. Granted on more modern hardware the hit is very small. And yes, pre rendered images will help, but not that much to be honest. I wouldn't do it personally.

Remember to draw all images that don't have alpha with the proper blend mode (SOLIDBLEND rather than ALPHABLEND). You'll gain some speed there.

State changes are a small hit. It might be worth grouping all alpha, all additive blending, etc, but again..I wouldn't do it.
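
If you do want to try the grouping anyway, a sketch of it could look like the following. TSprite and DrawBatch are hypothetical names, and it assumes the game keeps opaque and blended sprites in separate lists; the Max2D calls used (SetBlend, SetScale, DrawImage) are the standard ones.

```blitzmax
SuperStrict

' Hypothetical sprite record with only the fields this sketch needs.
Type TSprite
	Field image:TImage
	Field x:Float, y:Float
	Field scale:Float = 1.0
End Type

' Draw every sprite in a list with one blend mode, set once per batch
' instead of once per sprite.
Function DrawBatch(sprites:TList, blend:Int)
	SetBlend blend
	For Local s:TSprite = EachIn sprites
		SetScale s.scale, s.scale
		DrawImage s.image, s.x, s.y
	Next
	SetScale 1, 1   ' leave the scale in a known state for later drawing
End Function

' Assumed usage inside the render loop:
'   DrawBatch opaqueSprites, SOLIDBLEND   ' bricks with no transparency
'   DrawBatch alphaSprites, ALPHABLEND    ' anything that needs blending
```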

Depending on the driver, dx9 will be a lot faster. But I would still default to dx7.

My engine uses a single surface rendering system in BlitzMax. It's a lot faster on older cards, not worth the hassle on newer cards. At least that's my opinion.

I've never used the BlitzMax built-in collision checking. I've always used my own software rect and circle collisions, which work well for 99% of the games most peeps will ever code and are blazingly fast. When using rect or circle, make your hit zone slightly smaller than the sprite so the sprites visibly overlap before a collision registers. Works best for shooters. ;)
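
A minimal sketch of a circle check with a shrunken hit zone, comparing squared distances so no square root is needed. The 0.8 shrink factor and the function name are my own assumptions, not values from MGE's engine.

```blitzmax
SuperStrict

Const HIT_SHRINK:Float = 0.8   ' assumed: hit zone is 80% of the visible radius

' Circles are given by centre and visible radius; the radii are shrunk
' before testing, so sprites overlap slightly before a hit registers.
Function CirclesCollide:Int(x1:Float, y1:Float, r1:Float, x2:Float, y2:Float, r2:Float)
	Local dx:Float = x2 - x1
	Local dy:Float = y2 - y1
	Local reach:Float = (r1 + r2) * HIT_SHRINK
	Return dx * dx + dy * dy <= reach * reach
End Function
```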

Do you have to collision check every loop? Sometimes you can spread out the load over 1-3 frames and "get away with it". A trick used more than peeps realize. ;)
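
One way to spread the load is round-robin by sprite index, sketched below. The spread of 3 frames and the name CheckThisFrame are assumptions for illustration.

```blitzmax
SuperStrict

Const SPREAD:Int = 3   ' assumed: each sprite is collision-checked once every 3 frames

' True if the sprite at this index is due for a collision check on this frame.
Function CheckThisFrame:Int(index:Int, frame:Int)
	Return (index + frame) Mod SPREAD = 0
End Function
```

With 3000 bricks this tests roughly 1000 per frame, and every brick still gets checked once within any three consecutive frames. Fast-moving objects can slip through with this scheme, so it probably suits the slow-growing bricks better than the player or bullets.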


Arowx(Posted 2009) [#12]
@MGE excellent thanks for that...

So to summarise, I am going to try and optimize the game by adding:
1. Collision detection logic
2. Fixed Rate or Delta Timing
3. Tiled Images on grouped sprites
4. Test pre-rendered tiles versus scaled

No one has mentioned threads yet. Is anyone using threads within a game to take advantage of all those CPU cores?

I'll repost as I go with benchmarks and downloads!

So benchmark mode for game needed... I'll be back


Brucey(Posted 2009) [#13]
You can't thread graphics rendering. Well, as far as I know.

Although you could of course have a single render thread, with other stuff going on in other threads.


ImaginaryHuman(Posted 2009) [#14]
One possible advantage of doing scaling of sprites in realtime rather than pre-rendered is that if your "original image" (with no zooming) is, say, 64x64 pixels, and you want to zoom it UP to 128x128, you are a) using less texture memory, b) utilizing the cache better because you're accessing the same pixels from the texture quite often, and c) not having to read lots of pixels which effectively contain the same color values as their immediate neighbors due to the scale operation. However, if you're scaling DOWN, then pre-rendered zoom frames may be faster.
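
To put point a) in numbers, here is a quick calculation assuming 32-bit RGBA textures with no mipmaps or padding. That storage format is my assumption; the driver may well store textures differently.

```blitzmax
SuperStrict

' Hypothetical helper; assumes 4 bytes per pixel and no mipmaps or padding.
Function TextureBytes:Int(width:Int, height:Int)
	Return width * height * 4
End Function

Print "64x64: " + TextureBytes(64, 64) + " bytes"       ' prints 64x64: 16384 bytes
Print "128x128: " + TextureBytes(128, 128) + " bytes"   ' prints 128x128: 65536 bytes
```

So each pre-rendered 2x frame costs four times the memory of the original, and that is per zoom level.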