DrawImage Optimisation

tafty(Posted 2006) [#1]
Hello All,

I'm coming close to finishing my first BlitzMax project and have started testing it on a few friends' PCs. They've experienced a bit of slowdown, particularly when there's a significant number of images being drawn.

What I'm wondering is does BlitzMax optimise the order in which images are drawn using DrawImage? Or would it improve performance to implement my own SpriteEngine type that stores the images to be drawn during the program loop, orders them by z-order, image name (i.e. texture), y-order and then draws them prior to the call to Flip?
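
To make the idea concrete, here is a rough sketch of the kind of draw queue described above. It is written in Python rather than BlitzMax purely to illustrate the ordering, and `SpriteQueue`, `DrawCommand`, `count_image_switches` and the `draw` callback are made-up names for this example, not BlitzMax APIs:

```python
from dataclasses import dataclass


@dataclass
class DrawCommand:
    z: int        # layer; lower values are drawn first
    image: str    # image (texture) name
    x: float
    y: float


class SpriteQueue:
    """Collects draw requests during the loop and draws them sorted (hypothetical)."""

    def __init__(self):
        self.commands = []

    def add(self, image, x, y, z=0):
        self.commands.append(DrawCommand(z, image, x, y))

    def flush(self, draw):
        # Sort by z-order, then image name, then y, as proposed in the post.
        self.commands.sort(key=lambda c: (c.z, c.image, c.y))
        for c in self.commands:
            draw(c.image, c.x, c.y)   # stand-in for the real drawing call
        self.commands.clear()


def count_image_switches(images):
    """Number of times consecutive draws use a different image."""
    return sum(1 for a, b in zip(images, images[1:]) if a != b)
```

The queue would be filled during the program loop and flushed once just before Flip. `count_image_switches` is only there to compare how often the image changes with and without sorting.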

Or would all the above be massively overcomplicating the matter, and should I just accept that I need to reduce the number of particle images in my explosions?!

Cheers,

Gareth.


tonyg(Posted 2006) [#2]
What makes you think it is the image drawing that is slowing the process down?
There are lots of possible optimisations but they'd be redundant if you're already following them.
As a test, what have you done to reduce the number of particles and what were the results?


Grisu(Posted 2006) [#3]
Some source code to look at is usually helpful... :)


FlameDuck(Posted 2006) [#4]
Tell them to upgrade their drivers.


tafty(Posted 2006) [#5]
@tonyg:
Thank you, that's a very good point. I really need to benchmark various parts of the logic too. This is basic stuff and I feel a little silly for not having tried it already, but I've been programming in a bit of a void on this so your comment really is very helpful!

During some of the explosions there is also a release of a game object (TAntiFluff) that would increase the number of collision detections occurring, so this could also contribute. I'll check this out later.

@grisu:
I'm at work right now and my source code is at home but a general overview of my processing would be as follows:

My TGame type has a render() method, called after all game logic is complete, that calls the render() methods on each of the currently active TGameState types. There is actually only one TGameState type active during gameplay - the game itself - called TGameStatePlay.

TGameStatePlay calls the render() methods of the following objects (with further objects called by these render() methods indented):

TParticleManager:
-->TStarfield (three layers, about 60 stars in total): Iterates through a TList of TImages
-->TExhaust (the player ship's exhaust): Iterates through a TList of TImages
-->TExplosion(s): Iterates through a TList of TImages
TFluffoid(s):
-->Iterates through a TList of TFluffoid with one associated TImage
TAntiFluff:
-->Iterates through a TList of TAntiFluff with one associated TImage
TPlayer:
-->Iterates through a TList of TFluff (a bullet class; up to 5 bullets allowed each with one associated TImage)
-->The player's TImage
THUD
-->Draws the score, etc

Do you know what? The above has been an excellent exercise because laying it out like that has made me realise that I'm more or less sorting by the image name already due to the fact that each object type is iterated through sequentially. There are differences in colour, rotation and scale though. So chances are that when I get to try some benchmarking tonight I'll find a delay in the logic cycle...

Many thanks - I'll let you know how I get on!

@FlameDuck:
The battle cry of developers everywhere - I'll be sure to do that too!

However, I am still interested in a generalised answer to my original question: has anyone found positive performance impact by strictly sorting calls to DrawImage? Perhaps using the criteria I originally specified or some other criteria? Maybe even taking into account changes in rotation, scaling and colour (which would make some sense as this would reduce the number of programmatic calls to the rendering pipeline)?


Dreamora(Posted 2006) [#6]
The impact is small unless you are changing images. BM only sets new states if they have changed (and even when it does, batching normally gains you less than simply using regular SetColor etc. without any batching code)
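
As a rough illustration of what "only sets new states if they changed" means, here is a minimal Python sketch. It is not BlitzMax's actual driver code: `StateCache` and the `apply` callback are hypothetical names standing in for whatever really talks to the graphics API.

```python
_UNSET = object()   # sentinel so that None can still be a real state value


class StateCache:
    """Forwards a state change only when the value actually differs (hypothetical)."""

    def __init__(self, apply):
        self.apply = apply      # stand-in for the real (expensive) state call
        self.current = {}

    def set(self, name, value):
        if self.current.get(name, _UNSET) == value:
            return False        # redundant: nothing is sent
        self.current[name] = value
        self.apply(name, value)
        return True
```

With a filter like this in place, drawing many sprites in a row with the same colour, scale and rotation costs one state change rather than one per sprite, which is why extra sorting on those criteria tends to gain little.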

It is more likely that they have no real 3D card (Intel, SiS, S3) or that they have a low-bandwidth card like a Radeon 9000 or an ATI IGP ...
Another potential factor is the driver you used. For that reason I would add an option at the beginning that asks whether the player wants to use OpenGL or DirectX. Because DX7 is rather old, NVidia users might get better performance with OpenGL ...


tonyg(Posted 2006) [#7]
This might help.
In addition, a couple of years ago, it was found to be quicker to draw objects sharing the same image in groups. Unfortunately that post was lost.
You can have thousands of rotated, scaled, coloured, alpha'd images displayed at the same time.
The rules change a bit when you get thousands of particles as well. It depends on their size and on whether you delete/create them or pool them.
Check a current thread on pooling objects as it might help. The same pool is valid for particles.


Grey Alien(Posted 2006) [#8]
I benchmark logic separately from drawing. In Bmax the drawing seems very fast on my setup unless I chuck out tons of particles, but I think the delay is more down to the processing required to loop through, process and send each particle image to the graphics card rather than the actual drawing. In the past, with non-accelerated graphics, the delay *was* caused by drawing, but that's rare now imho.
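
A minimal sketch of timing the two phases separately, in Python for illustration only. `SectionTimer` is a made-up helper; in BlitzMax itself the clock would presumably be MilliSecs() rather than `time.perf_counter`.

```python
import time


class SectionTimer:
    """Accumulates wall-clock time per named section of the main loop (hypothetical)."""

    def __init__(self):
        self.totals = {}   # section name -> accumulated seconds
        self.counts = {}   # section name -> number of measurements

    def measure(self, name, func, *args):
        start = time.perf_counter()
        result = func(*args)
        elapsed = time.perf_counter() - start
        self.totals[name] = self.totals.get(name, 0.0) + elapsed
        self.counts[name] = self.counts.get(name, 0) + 1
        return result

    def average_ms(self, name):
        return 1000.0 * self.totals[name] / self.counts[name]
```

Each frame you would call something like `timer.measure("logic", update_game)` and `timer.measure("render", draw_game)` (both function names are placeholders), then compare `average_ms("logic")` with `average_ms("render")` after a few hundred frames.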


tafty(Posted 2006) [#9]
Well it turns out that when you explode everything on screen and create 1500 new particles plus assorted detritus all at once it leads to a noticeable dip in performance - who'd-a-thunk-it!?

But following the implementation of some object pooling and cutting out the unnecessary collision detection that was taking place it's now behaving much better. I may yet try and reduce the number of particles too.
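
For anyone curious, the pooling idea looks roughly like this. It is a Python sketch rather than my actual BlitzMax code, and `Particle`, `ParticlePool` and the fixed lifetime of 1.0 are assumptions made just for the example:

```python
class Particle:
    def __init__(self):
        self.reset(0.0, 0.0)

    def reset(self, x, y):
        self.x = x
        self.y = y
        self.life = 1.0   # assumed lifetime in seconds, for illustration


class ParticlePool:
    """Pre-allocates particles once and recycles them instead of creating new ones."""

    def __init__(self, size):
        self.free = [Particle() for _ in range(size)]
        self.active = []

    def spawn(self, x, y):
        if not self.free:
            return None           # pool exhausted: skip instead of allocating
        p = self.free.pop()
        p.reset(x, y)
        self.active.append(p)
        return p

    def update(self, dt):
        i = 0
        while i < len(self.active):
            p = self.active[i]
            p.life -= dt
            if p.life <= 0:
                # Swap-remove: overwrite with the last item, then shrink the list.
                self.active[i] = self.active[-1]
                self.active.pop()
                self.free.append(p)
            else:
                i += 1
```

The pool size also acts as a hard cap on the particle count, so a screen-wide explosion can never create more objects than were allocated up front.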

Just a couple of tweaks to do and I'll have a version to post soon.

Many thanks to all that took the time to reply!


Dreamora(Posted 2006) [#10]
On collision: there is one important thing on this topic: only put receive collisions on those objects that REALLY need it. In most cases there is no good reason, for example, why bullets should receive collisions. They only send one when they hit something. In that scenario, most likely only the player and the targets have receive.
If you have walls on which the projectiles explode: let the walls receive the collision. They are static, you know exactly how many of them there are, and you can often optimise it quite far by using the rect command instead of the image-based one!
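
A rough sketch of the sender/receiver split with plain rectangle tests, in Python for illustration. This is not the BlitzMax collision API: `Rect`, `rects_overlap` and `find_hits` are hypothetical names, and the boxes are assumed to be axis-aligned.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    x: float
    y: float
    w: float
    h: float


def rects_overlap(a, b):
    """Axis-aligned rectangle test; rectangles that only touch do not count."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)


def find_hits(senders, receivers):
    """Test each sender (e.g. bullet) against the receivers only.

    Senders are never tested against each other, which is where the saving comes from.
    Returns (sender_index, receiver_index) pairs.
    """
    hits = []
    for s_index, s in enumerate(senders):
        for r_index, r in enumerate(receivers):
            if rects_overlap(s, r):
                hits.append((s_index, r_index))
    return hits
```

With 5 bullets and 10 walls this is at most 50 cheap rectangle tests per frame, and the bullets never have to check each other.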