Real program Optimization Question Global Vr Local


H&K(Posted 2010) [#1]
Let's assume I have lots of methods / functions that take this form:
Function Create:Atype(stuff)

        Local Temp:Atype = New Atype

        stuff

        Return Temp

End Function

Now in the type there are several functions / methods that do this: a create function, an addition method, etc.

It is quite possible to rewrite them all as:
Type Atype

        Global Temp:Atype

        Function Create:Atype(stuff)

            Temp:Atype = New Atype

            stuff

            Return Temp

        End Function
End Type


So in the first case we have (and I don’t really know how BMax handles this) a local (pointer?) / instance of the type (possibly in a CPU register?), whereas in the second we have a “single” instance of Temp for the whole type, which I assume would benefit from better / not needed garbage collection, but may suffer from some inner-workings problem.

So in a real-world program, would declaring commonly used temps once as globals be faster than redeclaring them in each method / function? Would it also be faster for Ints and the like?

(Note: I don’t think me doing a simple benchmark routine will give me any valuable data, except which is faster in a simple benchmark routine)

Last edited 2010


Who was John Galt?(Posted 2010) [#2]
Not being funny, but if you are concerned with this level of optimisation, Blitz is the wrong language for you. C will put you closer to the hardware.


H&K(Posted 2010) [#3]
It's not really a "concerned with this level of optimisation" thing; it's a trivial matter to write the templates (in Blide) for the types and the methods in either form.

And ATM there is no reason to prefer one way over the other. However, if someone says "Ah, the global way is loads worse/better", then it would take me maybe five minutes to write the right template.

I'm not suggesting that I'm going to go back and optimize pre-written stuff, just that when I go "new type from template", I will have chosen the better one.

I agree with you that there is a school of opinion that says "Finished is better than optimised", but in this case it's a trivial matter to choose either of the forms from the get-go. So if someone can tell me why one would be faster, then I can pick the right one, either all the time or each time.

However, if you are saying "Because BMax is a high-level language, you don't bother with what's the fastest way to do something", then I disagree.

(This has come up a few times before: the PC is really fast, so a badly written program will run fast, so don't bother optimising.) But even the people who agree with that would add the caveat: if you can, pick the best method at the start.

Last edited 2010


ziggy(Posted 2010) [#4]
Temp:Atype = New Atype 
This drops the reference to the previous instance and allocates a new one, so the GC has the same amount of work to do. Nothing is really reused.
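
A minimal sketch of this point, assuming a hypothetical `value` field and a function named `Create` (neither is from the original post). It only illustrates that every call to `New` yields a distinct object, whether the reference is kept in a global or a local:

```blitzmax
SuperStrict

Type Atype
    Global Temp:Atype
    Field value:Int              ' hypothetical field, for illustration only

    Function Create:Atype(v:Int)
        Temp = New Atype         ' a fresh allocation on every call; the global is only a reference
        Temp.value = v
        Return Temp
    End Function
End Type

Local a:Atype = Atype.Create(1)
Local b:Atype = Atype.Create(2)

' Comparing object variables compares references, not contents.
If a <> b Then Print "two calls, two distinct objects"
Print "a.value = " + a.value
Print "b.value = " + b.value
```

The global `Temp` ends up pointing at whichever object was created last; the earlier object stays alive only as long as something else (here `a`) still refers to it.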


Jesse(Posted 2010) [#5]
Aside from what ziggy said, and going by the way compilers work, local variables are "usually" stored in registers, while globals live in memory: the global's address has to be worked out first, and values are then read and written indirectly through it. Something like this:
for global:
    mov ebx,myvar ' this loads the address of myvar into ebx
    add [ebx],eax ' this adds the value in register eax to the value stored at myvar's address

The address of myvar is not a fixed number and must be calculated at runtime, since the program can be loaded at different locations in RAM depending on available memory.

for local:
    add ebx,eax  ' in this case ebx itself is used as the variable and eax is added to it

Moving data from one register to another is a lot faster than moving data between a register and a memory address.

Of course nothing is that simple. The main problem is that there are only a limited number of registers, and if you are using more local variables than there are registers "available/free", then the variables start to be pushed to and popped from the stack, and most of the advantage of using locals goes down the drain. In any case, locals will be faster than globals for processing.
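
For anyone who wants to try the comparison anyway, here is a rough timing sketch. The function names, the `gTotal` global and the iteration count are all made up for illustration, and, as noted earlier in the thread, a loop like this only tells you which form is faster in this particular loop on this particular machine:

```blitzmax
SuperStrict

Global gTotal:Int                       ' hypothetical global accumulator

' Accumulates into the global on every pass.
Function SumGlobal:Int(n:Int)
    gTotal = 0
    For Local i:Int = 0 Until n
        gTotal :+ i
    Next
    Return gTotal
End Function

' Accumulates into a local, which the compiler may be able to keep in a register.
Function SumLocal:Int(n:Int)
    Local total:Int = 0
    For Local i:Int = 0 Until n
        total :+ i
    Next
    Return total
End Function

Const ITERATIONS:Int = 50000000         ' arbitrary; the Int sum simply wraps around

Local t:Int = MilliSecs()
SumGlobal(ITERATIONS)
Print "global: " + (MilliSecs() - t) + " ms"

t = MilliSecs()
SumLocal(ITERATIONS)
Print "local:  " + (MilliSecs() - t) + " ms"
```

Whether a difference shows up at all depends on the compiler version and build settings, so treat the numbers as a hint rather than a verdict.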

Last edited 2010


ImaginaryHuman(Posted 2010) [#6]
Yeah you have to create a new instance anyway. However, storing that instance in a local and returning it will be faster than storing it in a global and returning it.


ima747(Posted 2010) [#7]
Out of curiosity what if it was:

Type Atype

        Global Temp:Atype = New Atype

        Function Create:Atype(stuff)

            stuff_involving_Temp

            Return Temp

        End Function
End Type


There's only 1 allocation, and no collection; however, it's still global (this also assumes that Atype can be re-used and doesn't need to be re-initialized... maybe it's just a function holder or some other utility etc.). I would assume this largely has to do with how complex Atype is, in that its complexity (number of fields, etc.) will determine how much work it takes to allocate, and also to collect, whereas the global vs. local question is just a matter of how long it takes to address...
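
A small sketch of what the "can be re-used" assumption means in practice. The `total` field and the `Add` function are hypothetical, chosen only to show that every caller receives the same shared object:

```blitzmax
SuperStrict

Type Atype
    Global Temp:Atype = New Atype       ' allocated once, then shared
    Field total:Int                     ' hypothetical field

    ' Hypothetical function: writes its result into the shared Temp and returns it.
    Function Add:Atype(a:Int, b:Int)
        Temp.total = a + b
        Return Temp
    End Function
End Type

Local r1:Atype = Atype.Add(1, 2)
Local r2:Atype = Atype.Add(10, 20)

' Both variables refer to the one shared instance,
' so the second call has overwritten what r1 "held".
If r1 = r2 Then Print "r1 and r2 are the same object"
Print "r1.total = " + r1.total
```

So this form avoids repeated allocation, but it is only safe when no caller needs an earlier result to survive the next call.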


Czar Flavius(Posted 2010) [#8]
which I assume would benefit from better / not needed garbage collection
That's not how the garbage collector works. Variables are just references to an object; they are not collected. Each time you come across New, there is a new object. You can jiggle the variables around, but if you need 5 objects you need 5 objects.

As Jesse points out, local variables can be placed in registers, which are much faster. Especially in a creation function, which will only have one local variable and a few parameters, a local variable is very efficient. Plus having globals scattered around the place is very confusing, and even if they were faster I would avoid them for that reason alone. "Finished is better than optimised" I agree (although according to some people I am a micro-optimization freak??) but in this case, not only are locals optimized, they will get your game finished faster, so to speak!

ima, I don't understand what you are suggesting. I'm not a garbage collector expert, but I think the hardest work is finding and keeping track of the dead objects. Once you've found an object, actually deleting it is easy, so I don't think the complexity of the object is significant in how long it takes to collect.
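
One rough way to watch this happen is with `GCMemAlloced()` and `GCCollect()`. This is only a sketch: the `x` field and the loop count are arbitrary, and the reported figures will vary with the GC mode and with whether a collection happens to run during the loop:

```blitzmax
SuperStrict

Type Atype
    Field x:Int                         ' hypothetical field
End Type

Local startMem:Int = GCMemAlloced()

Local temp:Atype
For Local i:Int = 1 To 100000
    temp = New Atype                    ' a new object each pass; the previous one becomes unreachable
    temp.x = i
Next

Print "change after loop:      " + (GCMemAlloced() - startMem) + " bytes"
GCCollect()
Print "change after GCCollect: " + (GCMemAlloced() - startMem) + " bytes"
```

Swapping the `Local temp` for a `Global` should not change how many objects get created, since the variable is only the reference, not the object.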

Edit: That said, an object which contains other objects that depend upon it for their existence (single reference to them) would cause more work for the garbage collector as there'd be more things for it to manage. One thing I miss from C++ is you can choose whether contained objects use dynamic or regular memory. The latter is faster but the object is destroyed when its parent object dies no matter what. But sometimes that's exactly what you want.

Last edited 2010