Surface count slowdown...

big10p(Posted 2003) [#1]
I read somewhere that slowdown from using high surface counts is (mainly) because the GFX card has to keep using different textures to render the various surfaces. So, if I want to render a load of un-textured entities, will the surface count slowdown be largely reduced?

If not, what other factors cause a high surface count to slow down the FPS?

Thanks in advance!

-big10p


Sunteam Software(Posted 2003) [#2]
In the interests of replying, I don't actually know, might be worth you doing some tests yourself and posting results.


Rob Farley(Posted 2003) [#3]
I could be wrong, but even if an object doesn't have a texture it will still have at least 1 surface. If the object has a few materials on it to give it different colours then it will have multiple surfaces and therefore texturing would be more efficient.

If you've got a bunch of objects using the same texture, however, you can AddMesh them all together to make a single-surface entity. Of course, you've got the poly count/surface count trade-off there, and all the objects are stuck together.


Rob(Posted 2003) [#4]
You need to group as many meshes as possible by texture.

Blitz will need a separate surface to display a separate brush setting/alpha/shininess/blend/fx or texture.


Anthony Flack(Posted 2003) [#5]
That's actually a very good question though - if surface slowdown is primarily caused by state changes on the video card, caused by the need to switch textures, what happens if you're not using textures?

And is drawing consecutive objects that use the same texture any faster than drawing different ones? And if the answer to both of the above is no, then is this a possible avenue for future optimisation?


simonh(Posted 2003) [#6]
Testing is the only answer.


big10p(Posted 2003) [#7]
OK, I'll try and do some testing tomorrow. Thanks for replying!


GNS(Posted 2003) [#8]
I coded a little test that creates 2000 (untextured) cubes and positions them at random places on the screen (always within the camera's view, however). TrisRendered() shows 24,012 polys being rendered. The framerate is a mere 16-18 FPS.

Now, if the assumption thus far has been that surface count slowdown has been because of texture switches what would account for the slowdown in this situation? Fill rate?


big10p(Posted 2003) [#9]
I'm not sure. Do you think 16-18 FPS is slow? I don't know what spec your PC is so I can't tell.

I would be interested to know what FPS you get if you had a variety of textures which you randomly assign to the cubes. Also, what speed you'd get if you plonked all the untextured cubes into a single surface mesh and displayed that. You could view the scene from a high distance in order to rule out fill rate as a culprit, I guess (or have I completely misunderstood the meaning of fill rate all this time? possibly :) ).


Gabriel(Posted 2003) [#10]
When I was testing this with quads, I found that them being untextured actually made them slower than the textured variety. Didn't make a lot of sense to me, but hey, that's computers for ya.

So yeah, it's still surface dependent. Or my tests indicated such, in any case. I don't know how videocards work, but I'm guessing it still sees a plain colour as a texture.


sswift(Posted 2003) [#11]
"Now, if the assumption thus far has been that surface count slowdown has been because of texture switches what would account for the slowdown in this situation? Fill rate?"

No, surfaces. Even though they're not textured differently, Blitz still treats the surfaces as different.

The only optimization Mark could do is to check to see if the last surface rendered has the same properties (texture, color, alpha, shininess) as the current surface being rendered.

This may provide a large speedup in your demo, but in most games it would not provide a big speedup because entities can't be sorted by material, they have to be sorted by distance. But there are certain cases, like particle systems, where this might provide a big speedup IF the user knows about it and what to avoid doing (like changing the color or alpha of the particles... only animating the texture).

I have written to Mark about this, so maybe the next version of Blitz will have something like this, but he never replies to my emails anymore so I don't know what his thoughts on this are. I wouldn't count on the optimization appearing in the next update though.
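As a rough illustration of the check described above (a toy model, not Blitz3D's actual renderer), the Python sketch below counts how many material switches a draw order would need if a switch is skipped whenever the next surface matches the previous one. The `Material` tuple, the texture file names, and `count_state_changes` are hypothetical names invented for this example.

```python
from typing import NamedTuple


class Material(NamedTuple):
    # Hypothetical stand-in for the per-surface properties named in the post.
    texture: str
    color: tuple
    alpha: float
    shininess: float


def count_state_changes(draw_order):
    """Count material switches, skipping a switch when the next
    surface has the same properties as the one drawn before it."""
    changes = 0
    last = None
    for material in draw_order:
        if material != last:
            changes += 1
            last = material
    return changes


grass = Material("grass.png", (255, 255, 255), 1.0, 0.0)
rock = Material("rock.png", (255, 255, 255), 1.0, 0.0)

# Worst case: materials alternate, e.g. when drawing by distance.
interleaved = [grass, rock, grass, rock, grass, rock]
# Best case: the same surfaces grouped by material.
by_material = sorted(interleaved)

print(count_state_changes(interleaved))
print(count_state_changes(by_material))
```

In this model the check only pays off when matching surfaces happen to be drawn back to back, which is why a distance-sorted scene would benefit less than a particle system sharing one material.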


GNS(Posted 2003) [#12]
big10p: I ran the test on my P3 1 GHz with a GF4 MX 440. If I combine the cubes into one mesh with one surface (using AddMesh) the framerate jumps up to 65-68 FPS. Quite a big improvement.

sswift: I was under the assumption surfaces were slow because of the underlying texture switching going on?
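For the framerates quoted in this thread, it can help to convert FPS to milliseconds per frame, since the two are easy to confuse. The Python sketch below does only that arithmetic; `frame_time_ms` is a helper name made up for this example.

```python
def frame_time_ms(fps):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


# Figures reported in the thread: separate cubes vs. one merged surface.
separate_fps = (16, 18)
merged_fps = (65, 68)

for label, (low, high) in (("separate", separate_fps), ("merged", merged_fps)):
    print(f"{label}: {frame_time_ms(high):.1f}-{frame_time_ms(low):.1f} ms per frame")

# Range of the speedup implied by those figures.
min_speedup = merged_fps[0] / separate_fps[1]
max_speedup = merged_fps[1] / separate_fps[0]
print(f"speedup: {min_speedup:.2f}x to {max_speedup:.2f}x")
```

So 16-18 FPS is roughly 56-63 ms per frame, and the merged mesh brings that down to roughly 15 ms.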


big10p(Posted 2003) [#13]
GNS: sorry, for some reason I read your 16-18 FPS as 16-18 milliseconds - it's very late. :)

Anyway, it seems from what people are saying that the surface count slowdown isn't just texture related. Oh well.


sswift(Posted 2003) [#14]
"sswift: I was under the assumption surfaces were slow because of the underlying texture switching going on?"

They ARE. But the "texture switching" code is called for each surface whether or not you actually change textures for each surface.


sswift(Posted 2003) [#15]
Oh, and "alpha switching", "color switching" and "shininess switching" are also done per surface regardless of whether those change.

At least, that's my limited understanding of how things work from what I've learned from talking to Mark about it.


GNS(Posted 2003) [#16]
I wonder if the texture switching code has to be called regardless of surface state? If it was possible to somehow only update surfaces whose properties have changed some performance could be gained for static objects.


marksibly(Posted 2003) [#17]
Hi,

Blitz checks for redundant state changes, so if you set the same texture twice in a row the hit is minimal (ie: Blitz's fault!).

But transform changes (ie: different entities) also cause a big hit, so the more entities you have - even if they share the same texture - the bigger the hit.
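A minimal way to picture the two costs mentioned here is to count them separately. The Python sketch below is a toy model under stated assumptions, not Blitz3D's internals: it assumes one transform change per new entity and one texture change per non-redundant texture switch. `count_changes` and the texture name are hypothetical.

```python
def count_changes(draw_list):
    """draw_list: (entity_id, texture_name) pairs in draw order.
    Returns (transform_changes, texture_changes), where a texture
    change is skipped if the texture matches the previous one."""
    transform_changes = 0
    texture_changes = 0
    last_entity = object()   # sentinel that matches no real entity id
    last_texture = object()  # sentinel that matches no real texture
    for entity_id, texture in draw_list:
        if entity_id != last_entity:
            transform_changes += 1
            last_entity = entity_id
        if texture != last_texture:
            texture_changes += 1
            last_texture = texture
    return transform_changes, texture_changes


# 2000 separate entities that all share one texture.
separate = [(i, "crate.png") for i in range(2000)]
# The same geometry merged into a single entity.
merged = [(0, "crate.png")]

print(count_changes(separate))
print(count_changes(merged))
```

In this model the texture check removes all but one texture change for the separate cubes, yet the transform count still scales with the number of entities.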


sswift(Posted 2003) [#18]
There's a TRANSFORM change too?

So I guess what you're saying is you already thought of the optimization I suggested a long time ago, but it doesn't give much speedup between entities because there's a transform change to worry about as well as a texture change.

Damn.



But Mark... Something doesn't make sense about what you've just said.

Above there's a demo which takes some spheres, all textured the same, and either uses addmesh to create a bunch of surfaces in a single mesh, or leaves them all as individual meshes. And those who have tested that code claim that the results are the same for both.

This would seem to indicate that the speed hit between surfaces is the same as the speed hit between entities, even when using the same texture.

So I have to wonder:

1. Where is this texture hit optimization you speak of, because that should be speeding up the addmesh example.

2. Where is this entity transform hit, because that should be slowing down the example with individual entities.


When you take both of these together, they indicate that the addmesh example should be a lot faster than the separate entities. But it is not.

Why?


sswift(Posted 2003) [#19]
The test I speak of is here:

http://www.blitzbasic.com/bbs/posts.php?topic=23182


GNS(Posted 2003) [#20]
Some slightly off-topic things I've found:

In my test I used AddMesh. CountSurfaces() revealed that there was only 1 surface in the final mesh. This seems to say that AddMesh only creates new surfaces if the properties of individual entities differ from each other (from my tests, things like different UV coordinates, etc. cause new surfaces to be created) but that's to be expected. There wouldn't be much point to AddMesh, that I can see, if it simply created a new surface for each individual entity. The speed hit would be the same, as instead of having 2000 individual meshes, each with 1 surface, you would have 1 mesh with 2000 surfaces.
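The behaviour observed above can be pictured as grouping by properties. The Python sketch below models only that observation (one merged surface per distinct set of properties); it is not a description of how AddMesh is implemented, and `group_by_brush` and the brush tuples are hypothetical.

```python
from collections import defaultdict


def group_by_brush(entities):
    """entities: (name, brush) pairs, where brush is a hashable tuple
    of surface properties. Returns one group per distinct brush, i.e.
    one merged surface per group in this model."""
    groups = defaultdict(list)
    for name, brush in entities:
        groups[brush].append(name)
    return dict(groups)


# Assumed brush layout for the example: (texture, color, alpha).
plain = (None, (255, 255, 255), 1.0)
red = (None, (255, 0, 0), 1.0)

cubes = [(f"cube{i}", plain) for i in range(2000)]
print(len(group_by_brush(cubes)))

# One cube with different properties forces a second surface.
mixed = cubes + [("red_cube", red)]
print(len(group_by_brush(mixed)))
```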

Also, AddMesh doesn't seem to create a new surface once the vertex limit (65536 was it?) is reached. It simply gives an 'Illegal Memory Address' at RenderWorld.

What I wonder now is how do other games/other engines work around the speed hit caused by transform changes? Tribes 2, for example, features maps with hundreds of tree meshes, buildings, infinite terrains, etc. and still manages to pull off an acceptable framerate. Obviously objects can be combined into single surface entities but with the transform change causing such a large speed hit surely there have to be some tricks or workarounds being used?
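One way to stay clear of that limit would be to plan the merges in batches before calling AddMesh. The Python sketch below only does the bookkeeping. It assumes the 65536 figure guessed at above and 24 vertices per cube, both of which are assumptions rather than confirmed values, and `plan_batches` is a made-up helper name.

```python
def plan_batches(vertex_counts, max_vertices=65536):
    """Split meshes into batches whose total vertex count stays at or
    below max_vertices. Returns the number of meshes in each batch."""
    batches = []
    current_meshes = 0
    current_vertices = 0
    for count in vertex_counts:
        if count > max_vertices:
            raise ValueError("a single mesh exceeds the vertex limit")
        # Start a new batch if this mesh would push the total over the limit.
        if current_meshes and current_vertices + count > max_vertices:
            batches.append(current_meshes)
            current_meshes = 0
            current_vertices = 0
        current_meshes += 1
        current_vertices += count
    if current_meshes:
        batches.append(current_meshes)
    return batches


VERTS_PER_CUBE = 24  # assumption: 6 faces x 4 vertices

print(plan_batches([VERTS_PER_CUBE] * 2000))
print(plan_batches([VERTS_PER_CUBE] * 3000))
```

Under these assumptions the 2000 cubes from the test fit in one merged surface, while 3000 cubes would need a second one.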