Single surface VS CopyEntity - the big fight live

Skitchy

(Posted 2004) [#1]

Which is faster?
Everybody seems to think that a single surface particle system is faster than using native Blitz sprites. Well, yes, and no.
AFAIK, using CopyEntity() means that only 1 copy of the mesh/texture is actually used and all subsequent copies are just instances of that entity.
There's a LOT of extra maths behind an SS system, and it got me thinking about whether it's *really* worth the hassle.
Individual CopyEntity() beings can have their alpha/color altered, and I don't think it increases the surface count. Theoretically, if you are only using 1 texture for your particles, this should result in one surface being used anyway, and the extra speed would be gained by not having to:
- rebuild the SS mesh every frame
- send it to the GFX card (only 1 plane needs to be sent)
- sort out the well known z-order problems

To check out the speed, try creating 3000 planes using CopyEntity() - works fine(?)

Let me know if I'm overlooking something blindingly obvious, but I used such a system in my last game and it worked fine. The only noticeable slowdown was when there was a lot of overdraw (fill rate), which you get with an SS system as well.

N

(Posted 2004) [#2]

There's also another thing involved in the concept of single surface particle systems vs. sprite particle systems.

In single surface particle systems, you never really have to call any functions beyond those needed to draw the polygons. In sprite-based particle systems you have to use the Position/Move/Translate/RotateEntity (and such) functions to move the entity about. Now, this is the part that might get most people, but that's one of the really big hits on speed. Imagine calling those functions about 8000 times per frame: it's a real hit on the FPS. In Lotus R2 I removed the use of those functions where I could without writing hideously long math functions to get values, and thanks to that it runs very fast. Prior to removing those functions, it capped out at Lotus R1's previous particle 'max' (it's not really a max, it's just a way of describing this concept): 2000 particles.

The graphics side of R2 is also completely optimized compared to R1. No more surface-per-emitter stuff, now everything is in, at minimum, zero surfaces per texture. Now, each texture can contain one or more tiles which are also particle textures, or it can be an animated particle texture wherein you can animate the particles over their lifespan.

So, it really can be worth the hassle of writing all the math and such, because in the end you can have quadruple the particles in existence and being updated at any time. (You can also freeze emitters and their particles in Lotus R1 and R2, which means that if you're far enough away from an emitter, or it's in another world zone, you can freeze it and it won't be updated at all, but the particles will still exist.)

Jeremy Alessi

(Posted 2004) [#3]

I agree with Skitchy. I did a test with Rob's system vs. copied sprites.

430 sprites and the standard method (non-single surface) was faster by about 4X. This was also updating the sprites and not updating the single surface system. If you don't use CopyEntity() then they are really slow, but otherwise it's usually ok.

dmaz

(Posted 2004) [#4]

Jeremy, are you saying that 430 sprites using CopyEntity is faster than 430 particles from Noel's R2 system or, for that matter, his R1?

or

Are you just saying that 430 copied entities are faster than non copied? If that's the case, hadn't that been established a long time ago?

John Pickford

(Posted 2004) [#5]

A single surface with 430 sprites NOT updated is simply a mesh with 860 triangles. There's no way that's slower than 430 separate objects.

When you update the particles then, depending on the CPU usage, there could be a big hit, but the single surface has to render faster than 430 separate objects unless something is very wrong.
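
To make the arithmetic concrete, here is a minimal sketch of how 430 two-triangle quads collapse into one 860-triangle batch. It is written in Python as a language-neutral illustration, not as Blitz3D code; `build_quad_batch` and its parameters are hypothetical names, and the quads are axis-aligned rather than camera-facing to keep the example short.

```python
def build_quad_batch(centers, half_size=0.5):
    """Pack one quad per centre into a single shared vertex/triangle list."""
    vertices = []
    triangles = []
    for cx, cy, cz in centers:
        base = len(vertices)  # index of this quad's first vertex
        vertices.extend([
            (cx - half_size, cy + half_size, cz),  # top-left
            (cx + half_size, cy + half_size, cz),  # top-right
            (cx + half_size, cy - half_size, cz),  # bottom-right
            (cx - half_size, cy - half_size, cz),  # bottom-left
        ])
        # Two triangles per quad, indexing into the shared vertex list.
        triangles.append((base, base + 1, base + 2))
        triangles.append((base, base + 2, base + 3))
    return vertices, triangles


centers = [(float(i), 0.0, 10.0) for i in range(430)]
vertices, triangles = build_quad_batch(centers)
print(len(vertices), len(triangles))  # 1720 860
```

The whole batch is one draw submission, whereas 430 separate entities each carry their own per-object overhead, which is the point being argued here.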

Bouncer

(Posted 2004) [#6]

Sprites are MUCH slower... test and you'll see...

JoshK

(Posted 2004) [#7]

At low poly counts, instances are faster than single surface, and easier to work with. At high polycounts (>40000) single surfaces are much faster.

Bouncer

(Posted 2004) [#8]

No they're not... even a small number of sprites is SLOW compared to single surface...

Bouncer

(Posted 2004) [#9]

Just did some testing...

2000 particles...

STATIC SPRITES
(no movement, no nothing... just render)
= 85fps

SIMPLE SINGLE SURFACE SYSTEM
(with position updates)
= 220fps

so there's your difference... just displaying sprites is MUCH slower than my whole SS particle system with position updates.

see for yourself...

download
http://www.kotiposti.net/naama/compare.zip (<30kb)

compile TEST_SINGLESURFACE and TEST_SPRITES

Jeremy Alessi

(Posted 2004) [#10]

I'd say Halo is right ... maybe if I had 2000 particles the single surface would be faster but with only 430 the copied sprites are faster. I got something stupid like 13 fps with 430 single surface sprites and I still get like 50 fps with just 430 separate copied Blitz sprites. I'll have to do some more tests ... but it doesn't seem worth it, unless I'm going to use some ridiculous number of particles.

sswift

(Posted 2004) [#11]

"maybe if I had 2000 particles the single surface would be faster but with only 430 the Copied sprites are faster. I got some stupid like 13 fps with 430 (non-updating) single surface sprites"

As has been said before, 430 non updated single surface sprites means a static mesh with 860 polygons. There is no way a single 860 polygon object was running at 13fps on your system, while 430 two polygon objects were running faster. You have the same number of polygons in both, yet one has multiple surfaces. You obviously did something VERY VERY wrong in your test. Blitz would be unusable if one 860 polygon object dropped the framerate to 13fps.

Hell, each character in that Aerial Antics game probably has more polygons than that. And your terrain definitely does.

Skitchy

(Posted 2004) [#12]

@Bouncer - I think you've conclusively proved that SS is faster with that test. I even tried swapping the sprite for a standard 2 poly mesh but the results are identical.

The question was just theoretical, because I can't see the reason *why* it should be faster to rebuild a mesh every frame - but apparently it is :)

Jeremy Alessi

(Posted 2004) [#13]

Yes there was something wrong.

JoshK

(Posted 2004) [#14]

I don't know about sprites, I was talking about meshes.

I wouldn't use a single sprite in your game. I even merge my flares into single surface systems.

Jeremy Alessi

(Posted 2004) [#15]

Oh, and disregarding any speed issues, vertex alpha causes major sort issues that I don't see with regular sprites. How do you get around that one?

Defoc8

(Posted 2004) [#16]

A simple test isn't really a great test..you cannot see
the code generated by the compiler - the generic sprite
code is most likely more complex with or without any
fancy operations..it's generic...and that's where the problem
lies....I have little doubt that custom systems are faster..

One more thing..the data is always uploaded to the card..
even if only a single quad is stored, that quad must be
sent over and over and over...unless shaders control the
vertex data...the buffers are locked + the vertex data is
altered + the buffers are unlocked..of course there are
hardware point sprites, but I don't think Blitz makes use of
these..

Anyway..a lot of this is speculation..only Mr Sibly knows
what the generated code is doing.....

Jeremy Alessi

(Posted 2004) [#17]

Still, the Vertex Alpha problem ... what's the workaround for that?

big10p

(Posted 2004) [#18]

I'm sure I remember Mark (or someone in the know) saying that blitz sprites are just separate quad meshes that are rebuilt from scratch at RenderWorld, anyway.

Matty

(Posted 2004) [#19]

Sprites do have their uses though.

In my Napoleonic wargame I used a single surface system for the soldiers and all particle effects, albeit a single surface for each different class of soldier, which allowed me to have over 1000 soldiers on each side on my lowly machine. Because the soldiers were always displayed upright and only rotated about their vertical axis, they were very easy to code in this instance.

However, I made a simple space battle scene which used
CopyEntity exclusively along with sprites, and found that
while I couldn't have hundreds of ships I could still have
quite a few. The real gain was that it was far easier to
program, as I didn't have to write my own functions for
things like MoveEntity, TFormPoint/TFormVector, and
making sure that each quad always faced the camera.
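
For a sense of what "making sure that each quad always faced the camera" involves in a hand-rolled system, here is a minimal Python sketch (language-neutral, not Blitz3D code). It assumes you can obtain the camera's unit right and up vectors; `billboard_corners` and its parameter names are hypothetical.

```python
def billboard_corners(center, cam_right, cam_up, half_size):
    """Return the four corners of a quad that faces the camera.

    cam_right and cam_up are assumed to be unit vectors taken from the
    camera's orientation; the quad lies in the plane they span.
    """
    cx, cy, cz = center
    rx, ry, rz = (half_size * c for c in cam_right)
    ux, uy, uz = (half_size * c for c in cam_up)
    return [
        (cx - rx + ux, cy - ry + uy, cz - rz + uz),  # top-left
        (cx + rx + ux, cy + ry + uy, cz + rz + uz),  # top-right
        (cx + rx - ux, cy + ry - uy, cz + rz - uz),  # bottom-right
        (cx - rx - ux, cy - ry - uy, cz - rz - uz),  # bottom-left
    ]


# Camera looking down +Z with no roll: right is +X, up is +Y.
corners = billboard_corners((0.0, 0.0, 5.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), 1.0)
print(corners)
```

This has to be redone for every particle whenever the camera turns, which is part of the extra maths a single surface system takes on and built-in sprites handle for you.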

Rob

(Posted 2004) [#20]

Single surface is way faster when anything less than 2,000 polys...

There are many ways to do single surface.

Debug mode drastically slows down typical single surface demos. Check your debug mode...

Jeremy Alessi

(Posted 2004) [#21]

Ah debug ... that was it then. But what about the vertex alpha sort issues ... did anyone else experience those?

Rob

(Posted 2004) [#22]

Vertex alpha is sortable, but usually you will just have different systems using different meshes. One mesh per emitter is sufficient. This usually sidesteps alpha issues.

You can sort by building your meshes in order but it will be complex code.

Beaker

(Posted 2004) [#23]

Jeremy - here is one solution to your sort problems:
http://www.blitzbasic.com/codearcs/codearcs.php?code=850

sswift

(Posted 2004) [#24]

Use one mesh per emitter and each frame clear the triangles in the mesh and create new ones in the right order. Or, keep the triangles attached to the same vertices, but change the vertex color/alpha/uv coordinates every frame for each particle. As the ENTIRE mesh gets uploaded to the 3D card every time you change even one value, you're not likely to get much savings by trying to avoid updating certain data in the mesh.

N

(Posted 2004) [#25]

"Vertex alpha causes major sort issues that I don't see with regular sprites. How do you get around that one?"

Add them to the surface in order of either the inverse Z-distance (this is usually faster but has some graphical side effects that go with it; minor, but they're there), or the inverse squared distance. It goes like this: add closest first, farthest last. Or maybe it was the other way around; I haven't had to write a single surface system in a while, so it left my mind.
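
A minimal Python sketch of the two sort keys mentioned here (language-neutral, not Blitz3D code). It assumes triangles are drawn in the order they were added to the surface and that ordinary alpha blending is in use, in which case painter's order means farthest first. The function names are hypothetical.

```python
def back_to_front_by_distance(particles, camera):
    """Sort positions farthest-first by squared distance to the camera."""
    def dist_sq(p):
        return sum((a - b) ** 2 for a, b in zip(p, camera))
    return sorted(particles, key=dist_sq, reverse=True)


def back_to_front_by_depth(particles, camera, forward):
    """Sort positions farthest-first by depth along the view direction.

    forward is assumed to be the camera's unit view vector.
    """
    def depth(p):
        return sum((a - b) * f for a, b, f in zip(p, camera, forward))
    return sorted(particles, key=depth, reverse=True)


camera = (0.0, 0.0, 0.0)
forward = (0.0, 0.0, 1.0)
a = (0.0, 0.0, 10.0)   # straight ahead
b = (20.0, 0.0, 9.0)   # far off to the side, slightly shallower

print(back_to_front_by_distance([a, b], camera))       # b first, then a
print(back_to_front_by_depth([a, b], camera, forward))  # a first, then b
```

The two keys can disagree for particles far from the view axis, as `a` and `b` show, which is one plausible source of the minor graphical side effects mentioned above. Squared distance avoids a square root while giving the same order as true distance.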

Jeremy Alessi

(Posted 2004) [#26]

It's not only vertex sort issues with itself ... it's issues with other alpha'd objects. It actually came up because I was using it in Leadfoot GT and the fence was behind the particles but the particles appeared behind the fence.

sswift

(Posted 2004) [#27]

"It's not only vertex sort issues with itself ... it's issues with other alpha'd objects. It actually came up because I was using it in Leadfoot GT and the fence was behind the particles but the particles appeared behind the fence."

If the center of the mesh containing the particles is nearer to the camera than the fence, then they will be rendered in front of the fence. Otherwise, they will be rendered behind it. If some particles are in front and some behind they'll all render on one side or the other depending on where the center of the mesh is.
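
The effect can be sketched in a few lines of Python (language-neutral, not Blitz3D code), assuming transparent objects are ordered purely by the distance from the camera to each object's centre, as described above. The scene values and the `draw_order` helper are hypothetical.

```python
import math


def draw_order(camera, object_centers):
    """Return object names farthest-first by centre distance to the camera."""
    return sorted(
        object_centers,
        key=lambda name: math.dist(camera, object_centers[name]),
        reverse=True,
    )


camera = (0.0, 0.0, -50.0)
object_centers = {
    "fence": (0.0, 0.0, -30.0),        # 20 units from the camera
    "particle_mesh": (0.0, 0.0, 0.0),  # mesh origin left at the world origin
}
# The particles themselves might sit at z = -40, in front of the fence,
# but only the mesh centre takes part in the sort.
print(draw_order(camera, object_centers))  # ['particle_mesh', 'fence']
```

Because the particle mesh's centre is farther away than the fence, the whole mesh is drawn first and the fence is blended over it, so every particle appears behind the fence regardless of where its vertices actually are.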

This is not the fault of Blitz, it's a problem with how 3D cards render transparent things. It'll be fixed in a few years, but for now, you're stuck with using tricks to try to avoid the problem.

If you have water and your particles are rendering behind it when they are over it, and you don't need the particles to render properly underwater, then you could place the center of the mesh containing the water plane really far underground, and just have the surface of the water above the ground. For a fence though, you'll have to use masking instead of alpha. Masked objects are treated as opaque because there are no partially transparent pixels.

Jeremy Alessi

(Posted 2004) [#28]

Nah, the particles were in front of the fence ... the center of the mesh ... should have been. Hmmm... maybe the mesh is just at 0,0,0. We had this problem in AA though with chunks way out in the desert. Adrian used Vertex Alpha to blend them into the distance and they would appear in front of our alpha PNG fences even though they were a mile behind. Anyway ... it looks stupid and is kinda unavoidable in certain situations.

Ross C

(Posted 2004) [#29]

When moving your emitter centre, move the mesh of particles to the emitter centre using the PositionMesh command. And remember VertexX() and such return the co-ords relative to the mesh's co-ords.
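
One way to read this advice, sketched in Python (language-neutral, not Blitz3D code): vertex coordinates are stored relative to the mesh origin, so moving the origin to the emitter means shifting the stored vertices the opposite way if the particles are to stay where they are in the world. The sketch ignores rotation and scale, and `to_world` and `recenter` are hypothetical helpers.

```python
def to_world(mesh_origin, local_vertex):
    """World position of a vertex stored relative to the mesh origin."""
    return tuple(o + v for o, v in zip(mesh_origin, local_vertex))


def recenter(mesh_origin, local_vertices, new_origin):
    """Move the mesh origin to new_origin without moving vertices in the world."""
    shift = tuple(n - o for n, o in zip(new_origin, mesh_origin))
    new_local = [
        tuple(v - s for v, s in zip(vertex, shift))
        for vertex in local_vertices
    ]
    return new_origin, new_local


origin = (0.0, 0.0, 0.0)
local = [(10.0, 0.0, 40.0), (11.0, 1.0, 40.0)]
emitter = (10.0, 0.0, 40.0)

new_origin, new_local = recenter(origin, local, emitter)
print(new_local)  # [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
print([to_world(new_origin, v) for v in new_local] ==
      [to_world(origin, v) for v in local])  # True
```

With the origin sitting at the emitter, any sort that uses the mesh centre is working from a position that actually represents where the particles are.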