more on memory pools

Defoc8(Posted 2006) [#1]
OK, this may well be common knowledge, but it's interesting
anyway. It would seem that released objects are cached,
so that subsequent 'New' allocations simply return
an object from the cache. How well this works for complex
objects I don't know, but this simple sample demonstrates
the BlitzMax memory management in action. Well, sort of anyway.

Type v3df
    Field x:Float, y:Float, z:Float
EndType

Local vecs1:v3df[150000]

For Local z:Int=0 To 1
    Print "~n~n~n PASS "+z
    Local t:Int=MilliSecs()
    ' allocate 150000 objects
    For Local n:Int=0 To 150000-1
        vecs1[n]=New v3df
    Next
    ' drop every reference again
    For Local n:Int=0 To 150000-1
        vecs1[n]=Null
    Next
    t=MilliSecs()-t
    Print "~n time taken = "+t
Next



It may well be that if the objects in the pool are inactive for
a given time, they are released. In fact, I would imagine this
is how it works. Regardless, this does at least
highlight the fact that you may not need to implement
a custom system.

Forcing GCCollect() will trash this: the cache will be flushed. Keep this in mind.

A little more info from BRL on how the memory manager
works would be quite helpful, so if you're about, Mr Sibly or
Mr Armstrong, perhaps you could enlighten us :p ;)


tonyg(Posted 2006) [#2]
If you time the create and null separately, then the time saved in the second pass is in the 'null' stage.
There is a slight saving in the create, but most of it is in the 'nullify'.
Type v3df
    Field x:Float, y:Float, z:Float
EndType

Local vecs1:v3df[150000]
'GCSetMode 2

For Local z:Int = 0 To 1
    Print "~n~n~n PASS "+z
    Local t1:Int=MilliSecs()
    ' create stage
    For Local n:Int=0 To 150000-1
        vecs1[n]=New v3df
    Next
    Local t2:Int=MilliSecs()
    ' nullify stage
    For Local n:Int=0 To 150000-1
        vecs1[n]=Null
    Next
    Local t3:Int = MilliSecs()
    Print "~n create time taken = " + (t2 - t1)
    Print "~n nullify time taken = " + (t3 - t2)
    'GCCollect()
Next


Is it possible that auto-GC runs during the first pass but not the second?
If you use GCSetMode 2 then the results are reversed. If you then call GCCollect (i.e. do what we think auto-GC is doing) then the original results return.
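
To make that experiment easier to repeat, here is a rough sketch of the same test with the collector in manual mode and the collection timed as its own stage. The names TVec3, COUNT and vecs are made up for this example, and it assumes GCSetMode 2 means "manual" as described above. Timings will differ between machines and BlitzMax versions, so no results are claimed.

```blitzmax
SuperStrict

Type TVec3
    Field x:Float, y:Float, z:Float
End Type

Const COUNT:Int = 150000
Local vecs:TVec3[] = New TVec3[COUNT]

GCSetMode 2   ' manual mode: collection only happens when GCCollect is called

For Local pass:Int = 0 To 1
    Local t1:Int = MilliSecs()
    For Local n:Int = 0 Until COUNT
        vecs[n] = New TVec3
    Next
    Local t2:Int = MilliSecs()
    For Local n:Int = 0 Until COUNT
        vecs[n] = Null
    Next
    Local t3:Int = MilliSecs()
    GCCollect()   ' explicit collection, timed separately
    Local t4:Int = MilliSecs()
    ' the report string is built after all the timestamps are taken
    Print "pass " + pass + ": create=" + (t2 - t1) + " nullify=" + (t3 - t2) + " collect=" + (t4 - t3)
Next
```

Splitting out the collect stage should show whether the cost moves from the 'nullify' loop into the explicit GCCollect call.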


DStastny(Posted 2006) [#3]
I think you miss the point of custom pools.

If you look at the source of the garbage collector, it's actually quite simple how it works. Yes, it pools, but not as fast as a custom pool. It pools by the size of memory requested for the object. If you look at the example I provided, it's quite apparent in that test that a custom pool is significantly faster than the default GC memory pool.

The garbage collector is invoked by allocations (bbGCAlloc). The default behavior is to sweep automatically after a set number of allocations. The rate is currently set to 500 allocations, so if you allocate 500 objects, on the 501st object it will sweep to see if any can be released.

The reason a custom pool is faster is that no allocations take place. I think the part you are missing in your example is that you are still newing 150000 objects, which will invoke the GC. Also, you are calling Print with a computed string:
Print "~n~n~n PASS "+z

This too will invoke the garbage collector, invalidating your test. Strings are immutable objects, so any concatenation creates new objects. New objects invoke allocation, which may then trigger a garbage collector sweep.

The idea of pools is to preallocate what you need before you go into a loop that uses the objects.

The point of creating your own pools is to minimize allocations, which in turn minimizes the chance of a sweep. Unfortunately, there is no way to prevent sweeping entirely if you use objects like TList and foreach loops, as invoking those commands will create objects.

The key is to minimize allocations, not sweeps. Allocations are what is slow, because they may cause a sweep. A custom pool is the fastest way to deal with that, especially if you have lots of tiny objects that you are creating and destroying.
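
As a rough illustration of the preallocation idea, here is a minimal fixed-size pool. It is only a sketch and not the example referred to earlier in this post: TVec3, TVec3Pool, Obtain and Recycle are hypothetical names invented for it. Every object is created once in Create, and after that the pool only moves references around.

```blitzmax
SuperStrict

Type TVec3
    Field x:Float, y:Float, z:Float
End Type

' Hypothetical fixed-size pool: all objects are created up front.
Type TVec3Pool
    Field items:TVec3[]     ' slots 0..available-1 hold the unused objects
    Field available:Int     ' how many objects can still be handed out

    Function Create:TVec3Pool(size:Int)
        Local p:TVec3Pool = New TVec3Pool
        p.items = New TVec3[size]
        For Local i:Int = 0 Until size
            p.items[i] = New TVec3
        Next
        p.available = size
        Return p
    End Function

    ' Hand out a pooled object, or Null when the pool is exhausted.
    Method Obtain:TVec3()
        If available = 0 Then Return Null
        available :- 1
        Local v:TVec3 = items[available]
        items[available] = Null
        Return v
    End Method

    ' Give an object back so a later Obtain can reuse it.
    Method Recycle(v:TVec3)
        If v = Null Or available >= items.length Then Return
        items[available] = v
        available :+ 1
    End Method
End Type

' Usage: fill the pool before the main loop, then reuse inside it.
Local pool:TVec3Pool = TVec3Pool.Create(1000)
Local v:TVec3 = pool.Obtain()
v.x = 1.0
pool.Recycle(v)
```

Note that a recycled object keeps its old field values, so the caller has to reset them. Returning Null on exhaustion is just one possible design; growing the pool would be another.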

BlitzMax is very fast, and Mark's allocation routines are fast. However, there is a lot of code being executed in there, and if you are trying to squeeze every bit of performance out of a tight loop, you don't want to execute that.

The garbage collector is not magic; it is invoked by new'ing objects. The part most people don't realize is that calling a BlitzMax command may be new'ing objects and causing the sweeps.

If you write a simple For loop and count something, the GC will never be invoked.
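
One way to check these claims yourself is to read GCMemAlloced() before and after a piece of code. The sketch below compares a plain counting loop with a loop that builds strings. It assumes GCMemAlloced() returns an Int, as in classic BlitzMax, and that GCSetMode 2 stops automatic collection between the two readings. The figures depend on the BlitzMax version, so none are claimed here.

```blitzmax
SuperStrict

GCSetMode 2   ' manual mode, so no automatic collection between readings

' 1) A plain counting loop: only Int arithmetic on locals.
Local before:Int = GCMemAlloced()
Local count:Int = 0
For Local i:Int = 1 To 100000
    count :+ 1
Next
Local loopBytes:Int = GCMemAlloced() - before

' 2) String concatenation: every "+" builds a new String object.
before = GCMemAlloced()
Local s:String = ""
For Local i:Int = 1 To 1000
    s = "PASS " + i
Next
Local stringBytes:Int = GCMemAlloced() - before

' Report after both measurements, since Print with "+" allocates too.
Print "counting loop: " + loopBytes + " bytes"
Print "string loop:   " + stringBytes + " bytes"
```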

Doug Stastny


Defoc8(Posted 2006) [#4]
Budman, I'm not getting at you. I simply didn't know how
the memory manager in BlitzMax worked. I performed my
own tests too, with a custom pooling system, which was
better in some cases and worse in others, though to be
fair I probably arsed something up.. ;)

I was just trying to point out that BlitzMax seems to handle
things reasonably well without it.

I was wondering why the 2nd pass takes so little time to
allocate compared to the first pass, that's all.

Sorry if I offended you.
That wasn't my intention.


DStastny(Posted 2006) [#5]
I was not offended, sorry if I came across as if I was. Maybe I need to read my text before I post :).

I was trying to clarify what it does and does not do. I agree BlitzMax does a really good job of minimizing the performance hit, and creating a pool can be a PIA, since you actually have to write more code. However, it really can improve your main loops in game code dramatically.

I actually like these discussions on maximizing performance. BlitzMax is fast, but when you need tight code you sometimes have to look for alternatives to optimize.

My example was Vectors since doing math with them creates lots of little objects in the main loop.
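
To show what "lots of little objects" means here, the sketch below puts an allocating vector add next to one that writes into an object the caller already owns. TVec3, Plus and PlusInto are hypothetical names made up for this illustration, not code from the earlier example.

```blitzmax
SuperStrict

Type TVec3
    Field x:Float, y:Float, z:Float

    Function Create:TVec3(x:Float, y:Float, z:Float)
        Local v:TVec3 = New TVec3
        v.x = x
        v.y = y
        v.z = z
        Return v
    End Function

    ' Allocating version: every call creates a new TVec3.
    Method Plus:TVec3(o:TVec3)
        Return TVec3.Create(x + o.x, y + o.y, z + o.z)
    End Method

    ' Non-allocating version: the result goes into an existing object.
    Method PlusInto(o:TVec3, dest:TVec3)
        dest.x = x + o.x
        dest.y = y + o.y
        dest.z = z + o.z
    End Method
End Type

Local a:TVec3 = TVec3.Create(1, 2, 3)
Local b:TVec3 = TVec3.Create(4, 5, 6)

Local fresh:TVec3 = a.Plus(b)       ' one allocation per call

Local scratch:TVec3 = New TVec3     ' allocated once, outside the main loop
For Local frame:Int = 1 To 1000
    a.PlusInto(b, scratch)          ' no allocation inside the loop
Next
```

The second form is less pleasant to write, but the loop body no longer calls New at all.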

Now if we could minimize the performance hit of the method calls, that would be HUGE! Unfortunately that requires inlining.

So no worries, let's discuss performance and extreme optimizations :)

Doug Stastny


Defoc8(Posted 2006) [#6]
You can get around the inlining problem, but you'd have to
customise the editor by adding macro support.
Using a macro-like function system would in effect produce
inlined code, but writing or modding such a system might be
quite tricky. Maybe someone will create an editor patch
allowing the rest of us to macro our code to death..

The only downside is that it would be non-standard, so
passing the code to another developer would require them
to have the same editor/macro system.

Inline methods would be cleaner.


DStastny(Posted 2006) [#7]
A true preprocessor would help, but I agree native inlining would be huge! Looking at how BlitzMax creates the module headers (the .i files), it should be possible for Mark to put methods marked as inline into the .i file for inline resolution. Then the tough issue is making it work with the debugger.

It's just frustrating to have a little method to do a vector add and incur all that overhead of the context switch of an asm call routine. It makes complex data types in BlitzMax essentially useless, as you're better off just coding them out all over the place.

Another thing that would help is actually making Vector2, Vector3, Vector4 and matrices integral data types. It would be very cool to define vectors like this.

[code]
Local v1:Vector3(1,2,3)
Local v2:Vector3(4,5,6)
Local v3:Vector3
v3=v1+v2
[/code]

That would so rock. If you needed a vertex buffer, you could create an array of Vector3s backed by a contiguous memory buffer. Who knows, maybe he has something up his sleeve for the 3D module.
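
Until something like that exists, one workaround is to pack the components into a single Float array, three floats per vertex. This is only a sketch of that idea: VERTS, STRIDE, SetVert and AddVerts are hypothetical names, and it relies on a Float array storing its elements contiguously.

```blitzmax
SuperStrict

Const VERTS:Int = 4
Const STRIDE:Int = 3    ' x, y, z per vertex

' One allocation holds every vertex; there are no per-vertex objects.
Local buf:Float[] = New Float[VERTS * STRIDE]

' Write vertex i into the packed buffer.
Function SetVert(buf:Float[], i:Int, x:Float, y:Float, z:Float)
    buf[i * STRIDE]     = x
    buf[i * STRIDE + 1] = y
    buf[i * STRIDE + 2] = z
End Function

' dest = a + b, where all three are vertex indices into the same buffer.
Function AddVerts(buf:Float[], dest:Int, a:Int, b:Int)
    For Local c:Int = 0 Until STRIDE
        buf[dest * STRIDE + c] = buf[a * STRIDE + c] + buf[b * STRIDE + c]
    Next
End Function

SetVert(buf, 0, 1, 2, 3)
SetVert(buf, 1, 4, 5, 6)
AddVerts(buf, 2, 0, 1)    ' vertex 2 becomes (5, 7, 9)
```

The price is readability: the code deals in indices rather than vector objects, which is exactly what a built-in Vector3 type would avoid.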

Doug Stastny