Clearsurface + AddVertex slower than VertexCoords


sswift(Posted 2004) [#1]
Someone recently suggested that clearing a surface and re-adding the vertices, rather than simply moving the vertices around, would speed up a terrain engine.

I thought this was unlikely, but I tested it anyway and found that, as I suspected, it is not the case.

While ClearSurface + AddVertex is not dramatically slower than VertexCoords, it is nonetheless slower. And as VertexCoords is really slow to begin with, it's clearly undesirable to use ClearSurface instead. Moving vertices is so slow that updating a mere 64K of them per frame, even with the faster of the two methods, will drop your framerate like a rock. You may lose 10fps or more per 64K vertices. Update 10 terrain meshes and do nothing else (not even render) and you might be looking at 12fps.

Regardless of whether UpdateWorld is in the loop or not, the times remain the same. This indicates to me that Blitz is not caching the changes, but rather is immediately updating the 3D mesh. Otherwise, one would assume that the 3D model would be cached in RAM until such time as UpdateWorld is called. Then again, perhaps the mesh is not updated at that time, but rather only when RenderWorld is called. I have not tested RenderWorld yet. Even if I did, if the mesh is not visible to the camera, I suppose it's possible Blitz would not send the data to the card.

If Blitz isn't caching the data, I'd like to know why not. And if possible, it would be nice to have something similar to LockBuffer for 3D meshes, so that you can queue up a bunch of changes and then unlock the mesh to make those changes take effect.

Here's my test code. If you get different results it would be nice to know.

Graphics3D 640, 480, 32, 2


Mesh    = CreateMesh()
Surface = CreateSurface(Mesh)

For Loop = 0 To 65535
	AddVertex(Surface,0,0,0)
Next

For Loop = 0 To 65535
	AddTriangle(Surface, 0, 1, 2)
Next




; Number of times to run test.
TestLoops = 100


 


; -----------------------------------------------------
; Test 1: move the existing vertices with VertexCoords.
; Measured: 757 milliseconds

StartTime = MilliSecs()


For Loop = 1 To TestLoops  

	For Loop2 = 0 To 65535
		VertexCoords Surface, Loop2, Loop2, Loop2, Loop2  
	Next
	
	UpdateWorld 
	
Next

EndTime = MilliSecs()
TotalTime1 = EndTime - StartTime





; -----------------------------------------------------
; Test 2: clear the vertices (keeping the triangles) and re-add them with AddVertex.
; Measured: 927 milliseconds

StartTime = MilliSecs()

For Loop = 1 To TestLoops

	ClearSurface Surface, True, False

	For Loop2 = 0 To 65535
		AddVertex(Surface, Loop2, Loop2, Loop2) 
	Next
	
	UpdateWorld
	
Next

EndTime = MilliSecs()
TotalTime2 = EndTime - StartTime

; -----------------------------------------------------





Print "VertexCoords: " + Str$(TotalTime1) + " milliseconds."
Print "ClearSurface + AddVertex: " + Str$(TotalTime2) + " milliseconds."
WaitKey()

End
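
For what it's worth, the two totals quoted in the comments above can be turned into a per-pass cost and a rough framerate ceiling. This is only a sketch of the arithmetic: the 757 and 927 figures come from one machine, and the Meshes count of 10 is the hypothetical case mentioned in the post, so treat the output as a ballpark.

```blitzbasic
; Rough arithmetic on the measured totals (one machine only).
TestLoops = 100
Meshes    = 10    ; hypothetical: number of 64K-vertex meshes updated per frame

; Cost of one full pass over 65536 vertices, in milliseconds.
MsPerPass1# = 757.0 / TestLoops    ; VertexCoords
MsPerPass2# = 927.0 / TestLoops    ; ClearSurface + AddVertex

; Framerate ceiling if the vertex updates were the only work done each frame.
Fps1# = 1000.0 / (MsPerPass1 * Meshes)
Fps2# = 1000.0 / (MsPerPass2 * Meshes)

Print "VertexCoords: " + MsPerPass1 + " ms per pass, about " + Fps1 + " fps for " + Meshes + " meshes"
Print "ClearSurface + AddVertex: " + MsPerPass2 + " ms per pass, about " + Fps2 + " fps for " + Meshes + " meshes"
WaitKey()
End
```

With these numbers that works out to about 7.6 ms versus 9.3 ms per pass, and roughly 13 fps versus 11 fps for ten meshes, which is in line with the 12fps estimate above.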



Shambler(Posted 2004) [#2]
I know in DirectX I used to

Lock Vertex Buffer ( i.e. a bunch of verts making up a mesh )
Twiddle around with the verts
Unlock Vertex Buffer
Render

You are probably correct in saying that Blitz locks/unlocks the buffer for each vertex change, which would make it much, much slower. Locking a vertex buffer was a sure-fire way of stalling the GPU.

Something like LockMesh() and then a 'VertexCoordsFast' function would be great.

On average your test is 15% faster using VertexCoords.


Rob(Posted 2004) [#3]
There is however, a question of efficiency.

Consider a single surface particle system.

By clearing the frame each time, you maximise efficiency, and in this case you could recoup the 15% speed loss.


sswift(Posted 2004) [#4]
I don't understand what you mean.


Abomination(Posted 2004) [#5]
(That's another fine mesh you've gotten me in to!)

But what are we timing?
I suppose the CPU is not waiting for the GPU to do its thing (or is it?), so we are timing the routines VertexCoords and AddVertex?
And if it's the GPU that slows things down, then that should be the _pu to test.
I don't know if we can, though.


jhocking(Posted 2004) [#6]
"By clearing the frame each time, you maximise efficiency,.."
"I don't understand what you mean."

I don't understand either. "The frame?" What frame?


mrtricks(Posted 2004) [#7]
I made a multitasking system for my program. It makes a list of all the vertices to change and changes them a bit at a time, as many as it can do within, say, 5 milliseconds. That way it takes about a quarter of a second to make a change, but the frame rate doesn't judder. However, that's for a building destruction system, not an LOD or particle system.
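
A time-sliced queue like the one described above might look roughly like this. It is a minimal sketch and not mrtricks' actual code: the VertChange type and the QueueVertexChange and ProcessVertexQueue functions are hypothetical names made up for illustration.

```blitzbasic
; Hypothetical sketch of a time-sliced vertex update queue.
; VertChange, QueueVertexChange and ProcessVertexQueue are illustrative names.

Type VertChange
	Field Surface        ; surface that owns the vertex
	Field Index          ; vertex index within that surface
	Field X#, Y#, Z#     ; new position
End Type

; Queue a vertex move instead of applying it immediately.
Function QueueVertexChange(Surface, Index, X#, Y#, Z#)
	Local c.VertChange = New VertChange
	c\Surface = Surface
	c\Index   = Index
	c\X = X : c\Y = Y : c\Z = Z
End Function

; Apply queued changes, oldest first, until BudgetMs milliseconds have passed
; or the queue is empty. Returns the number of changes applied this call.
Function ProcessVertexQueue(BudgetMs)
	Local StartTime = MilliSecs()
	Local Applied = 0
	Local c.VertChange = First VertChange
	While c <> Null
		VertexCoords c\Surface, c\Index, c\X, c\Y, c\Z
		Delete c
		Applied = Applied + 1
		If MilliSecs() - StartTime >= BudgetMs Then Exit
		c = First VertChange
	Wend
	Return Applied
End Function
```

In the main loop you would call something like ProcessVertexQueue(5) once per frame before RenderWorld. Checking MilliSecs() after every single vertex has a cost of its own, so a real version would probably check the clock only every few hundred vertices.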


Rob(Posted 2004) [#8]
Sorry let me clarify.

By frame, I meant what you see in the game world. What's on screen.

If you set all your limits initially and prebuild them before using them (single surface particle systems) then you will bottleneck in a real game situation. Players move from location to location, things explode. Demands are very dynamic, and that is where building stuff from scratch actually benefits you.

If you had a fixed number that you had to modify, you would need code to hide unused triangles at 65535 units away from you - and this code could take up as much as 15% of your time. Maybe more.


Eole(Posted 2004) [#9]
Sswift, do you want to make a dynamic terrain LOD? I have spent a lot of time on terrain algorithms (ROAM, geomipmapping, quadtree, etc.), but with recent graphics cards the brute force method is actually making a comeback.


sswift(Posted 2004) [#10]
"If you had a fixed number that you had to modify, you would need code to hide unused triangles at 65535 units away from you - and this code could take up as much as 15% of your time. Maybe more."


I see what you're saying. But I did not mean to imply that one should create a bunch of meshes at the start of the game for each type of particle and then let them sit there. And even if you did, you could hide them.

What I'd be more inclined to do is create a mesh when an emitter is created, and as particles are emitted, add vertices and triangles to it as needed, reusing those which are no longer needed. You'll reach a point of equilibrium for most effects where, say, a max of 200 particles are seen at once, and no more, at which point they fall off again. When the emitter stops emitting, you delete the mesh.

I think that would work just fine. Of course I'm simplifying things a bit. Another optimization is to re-use those triangles which have moved offscreen, or too far from the camera, as well. Then if you have snow falling all over the level, only those flakes near you will actually be using polygons. The rest of the particles will just be simulated data.
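
The grow-and-reuse idea above could be sketched along these lines. This is an illustration under assumptions, not code from any released particle system: the Particle type and the SpawnParticle and KillParticle functions are hypothetical, the quads are flat in the XY plane rather than camera-facing, and dead quads are hidden by collapsing them to a single point.

```blitzbasic
; Hypothetical sketch: one quad per particle, added on demand and recycled.

Type Particle
	Field Surface      ; surface this particle's quad lives in
	Field FirstVert    ; index of the first of the quad's 4 vertices
	Field Alive
End Type

; Reuse a dead quad on this surface if there is one, otherwise add a new quad.
Function SpawnParticle.Particle(Surface, X#, Y#, Z#, Size#)
	Local p.Particle
	For p = Each Particle
		If p\Alive = False And p\Surface = Surface Then Exit
	Next
	If p = Null
		; Nothing to recycle, so grow the surface by one quad.
		p = New Particle
		p\Surface   = Surface
		p\FirstVert = AddVertex(Surface, 0, 0, 0, 0, 0)
		AddVertex Surface, 0, 0, 0, 1, 0
		AddVertex Surface, 0, 0, 0, 1, 1
		AddVertex Surface, 0, 0, 0, 0, 1
		AddTriangle Surface, p\FirstVert, p\FirstVert + 1, p\FirstVert + 2
		AddTriangle Surface, p\FirstVert, p\FirstVert + 2, p\FirstVert + 3
	EndIf
	p\Alive = True
	; Position the quad (not billboarded in this sketch).
	VertexCoords Surface, p\FirstVert,     X - Size, Y + Size, Z
	VertexCoords Surface, p\FirstVert + 1, X + Size, Y + Size, Z
	VertexCoords Surface, p\FirstVert + 2, X + Size, Y - Size, Z
	VertexCoords Surface, p\FirstVert + 3, X - Size, Y - Size, Z
	Return p
End Function

; Collapse the quad to a point so it covers no area, and mark it reusable.
Function KillParticle(p.Particle)
	Local i
	For i = 0 To 3
		VertexCoords p\Surface, p\FirstVert + i, 0, 0, 0
	Next
	p\Alive = False
End Function
```

The linear search for a dead particle is the obvious weak spot. A real system would more likely keep a separate list or stack of free quads so that spawning doesn't scan every particle.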


sswift(Posted 2004) [#11]
"Sswift, do you want to make a dynamic terrain LOD? I have spent a lot of time on terrain algorithms (ROAM, geomipmapping, quadtree, etc.), but with recent graphics cards the brute force method is actually making a comeback."


No... I was talking about something EpicBoy was doing. I just wanted to know how fast it was because I did try something like this when developing my current terrain system to dynamically weld tiles together and it was much too slow. I thought maybe there was a faster way than what I was doing, but apparently not.

My current terrain system btw, does do LOD, but it uses a set of static meshes to do it.

I saw the pics of your terrain, and they look nice. You did what I considered doing at one time: used ROAM to optimize the terrain rather than to update it in realtime. I thought that would help with the performance of my terrain system. But because my terrain was divided up into a grid of entities, the sheer number of surfaces I had was much more of a bottleneck than the number of polygons. So I never bothered optimizing further. Besides, collisions were really slow when dealing with tons of entities with tons of polygons, so I needed to be able to interpolate the player's height instead of using collisions. Also, the changes in level of detail messed up collisions.

LOD for that reason is problematic, but if you want long view distances there's really no alternative to using LOD. But you shouldn't use dynamic LOD. That's clearly too slow as this test shows. Just use static meshes.
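
Interpolating the player's height from the heightmap, instead of colliding with the terrain mesh, could be done with something like the sketch below. It assumes a regular grid with one unit between samples and uses plain bilinear interpolation; the Height array, GRID_SIZE and TerrainHeight are hypothetical names, and the result only approximates the actual triangle surface, since each grid cell is really split into two triangles.

```blitzbasic
; Hypothetical sketch: bilinear height lookup on a regular heightmap grid.
; Assumes one world unit between samples; scale X and Z first if yours differs.

Const GRID_SIZE = 64                  ; number of cells along each side
Dim Height#(GRID_SIZE, GRID_SIZE)     ; height samples, indices 0..GRID_SIZE

Function TerrainHeight#(X#, Z#)
	; Clamp the position to the grid.
	If X < 0 Then X = 0
	If Z < 0 Then Z = 0
	If X > GRID_SIZE Then X = GRID_SIZE
	If Z > GRID_SIZE Then Z = GRID_SIZE

	; Cell containing the point (kept inside the last cell at the far edge).
	Local IX = Floor(X)
	Local IZ = Floor(Z)
	If IX >= GRID_SIZE Then IX = GRID_SIZE - 1
	If IZ >= GRID_SIZE Then IZ = GRID_SIZE - 1

	; Fractional position inside the cell, 0..1 on each axis.
	Local FX# = X - IX
	Local FZ# = Z - IZ

	; Interpolate along X on both edges of the cell, then along Z between them.
	Local H0# = Height(IX, IZ)     + (Height(IX + 1, IZ)     - Height(IX, IZ))     * FX
	Local H1# = Height(IX, IZ + 1) + (Height(IX + 1, IZ + 1) - Height(IX, IZ + 1)) * FX
	Return H0 + (H1 - H0) * FZ
End Function
```

Each frame you would then place the player with something like PositionEntity Player, X, TerrainHeight(X, Z), Z, where Player is whatever entity handle you use. The lookup costs the same regardless of which LOD mesh is currently displayed.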


Gabriel(Posted 2004) [#12]
If you mean my comments about clearing and rebuilding being faster, I wasn't referring to a terrain engine, I was referring to a general purpose single surface particle and entity system I wrote a year ago. I did mention to EpicBoy that he MAY find it faster to clear and rebuild, and indeed he did find it slightly faster. I did also mention that your mileage would vary considerably according to what you were doing and possibly your system and Blitz version.

I agree with your observation that Blitz seems to change vertex positions immediately rather than do them all together, which would be massively faster for this sort of thing. If vertices don't get updated in animated meshes until you call UpdateWorld, it seems logical to me that dynamic mesh deformation ought to work the same way. At the moment, it appears not to be.


Eole(Posted 2004) [#13]
OK, I tried to update it in real time but it's too slow, so I decided to make a static terrain system with good geomipmapping (more detail is added where more detail is needed).

I created my own algorithm to generate the terrain. I based it on a quadtree (so the heightmap size must be 2^n + 1) and a geomipmapping error calculation to determine where detail is needed.

Actually I work on the editor, and on the slope lighting system.

Go to www.vterrain.org


Rob(Posted 2004) [#14]
"What I'd be more inclined to do is create a mesh when an emitter is created, and as particles are emitted, add vertices and triangles to it as needed, reusing those which are no longer needed. You'll reach a point of equilibrium for most effects where, say, a max of 200 particles are seen at once, and no more, at which point they fall off again. When the emitter stops emitting, you delete the mesh."


That's exactly what the single surface particle systems in my sig and cubemap demo do. They only add triangles up to what you're using, and recycle.

It's still a bit messy and the convenience of fire and forget tempts me a lot more than 15%


Trixx(Posted 2004) [#15]
Sibixsus, the ClearSurface/AddVertex method can in no way be faster than the VertexCoords method. I have spent more than 100 hours optimizing my particle system (which, by the way, will be released soon) and there is a 20-30% difference in performance between them. As sswift said, the best option is to combine the two methods. And, very important: if particles change color (or alpha) over their life, then you must call VertexColor in addition to AddVertex every frame, and VertexColor is very "expensive" too. With VertexCoords, you can choose to change colors/alpha only when needed!


sswift(Posted 2004) [#16]
"With vertexcoords, you can choose to change colors/alpha only when needed!"

It occurs to me that if the alpha is tied to the color, which is to say it changes at the same rate (but need not have the same minimum and maximum values), then one might be able to optimize things a bit more if one could quickly find free vertices which are known to have previously been set to the values one wants, so they don't have to be updated. Of course you would still have to update some.

I suspect this would actually be slower because of all the extra processing, and I wouldn't want to bother with it myself. I did try something similar with 2D blitting: blitting only those pixels that have changed. A "difference buffer", I called it. But it was actually slower than just blitting normally.


big10p(Posted 2004) [#17]
I certainly find using VertexCoords to be faster. For instance, today I was working with a mesh that contains 10000 separate quads, every vert of which had to be repositioned every frame. Using VertexCoords I got 38fps, but rebuilding the mesh from scratch I got 22fps.


Rob(Posted 2004) [#18]
It would be nice if a Blitz update had a small optimisation for VertexColor and VertexCoords, as they're now in common use.

Knowing Mark though - it's fast already!


Eole(Posted 2004) [#19]
Please read this :-)

http://www.gamedev.net/reference/articles/article1842.asp


Ross C(Posted 2004) [#20]
I'd imagine that you would check to see if an emitter is offscreen. If so, ClearSurface and rebuild. It only happens once for every emitter that goes onscreen or offscreen. And you don't want to be rendering the whole mesh when only 10% of it is onscreen, or whatever.