Hmm, which is faster?

BlitzMax Forums/BlitzMax Programming/Hmm, which is faster?

SculptureOfSoul(Posted 2006) [#1]
Let's say you have a method that is called extremely often. In general (I'm sure there are exceptions) is it faster to declare a local, let's say, integer, and use that, or is it faster to add an extra integer field to the type and reuse that with each call to the method?

At first I thought it'd be faster to use the pre-instantiated field value since each call to the function wouldn't allocate a new local variable. But a local variable is going to be allocated on the stack and so therefore should prove faster to access. So in general, is one method faster than another?


bregors(Posted 2006) [#2]
.


Brendane(Posted 2006) [#3]
It's probably quicker to use a local if it's just an int. Since it's likely that the memory around the stack will be in the processor cache more often than some data section area where a global resides.

(if it's an object and must be created I expect it will be quicker to have that as a global so it only needs to be instantiated once).

You can only be sure if you perform some timing tests however. The rule should be, write your code, if it needs a speed improvement, then profile it and target the problem areas - otherwise don't waste your effort.


SculptureOfSoul(Posted 2006) [#4]
Hmm, interesting test result BrEgOrS, although your test doesn't exactly replicate the scenario I was envisioning, because in your 'local version' you still only declare the local variable once and then enter the for loop (and only call the method once). The scenario I was curious about would be where the method would be called 1000's of times and each time it would allocate/de-allocate a local.

Anyhow, I tried running your test code as is and got results similar to yours although reversed (the field method was slightly faster, although not significant).

However, when I switched the code to more closely match my scenario, the local version was almost 4x faster!

Try running this code and let me know what you get for results.
Type TestType1

	Method TestMethod1()
		
		
		Local now = MilliSecs()
		For Local n=1 To 100000000
			Local TestInt
			TestInt = (2*TestInt+10)/13.0*1000-1
		Next
		now = MilliSecs()-now
		Print "TestMethod1 time: "+now
		
	End Method

End Type

Type TestType2

	Field TestInt

	Method TestMethod2()
		
		Local now = MilliSecs()
		For Local n=1 To 100000000
			TestInt = (2*TestInt+10)/13.0*1000-1
		Next
		now = MilliSecs()-now
		Print "TestMethod2 time: "+now
		
	End Method

End Type


t1:TestType1 = New TestType1
t2:TestType2 = New TestType2

t1.TestMethod1()
t2.TestMethod2()


My results were as follows:
TestMethod1 time: 5317
TestMethod2 time: 19058 (yeah, my comp sucks)


bregors(Posted 2006) [#5]
.


Dreamora(Posted 2006) [#6]
I'm missing version 3: Create a global in the method ie a static that will remain there and just needs to be set to 0 again (so don't use global someInt:int = 0 but global someInt:int : someInt = 0)


Brendane(Posted 2006) [#7]
"I was curious about would be where the method would be called 1000's of times and each time it would allocate/de-allocate a local"

Bear in mind that there is no 'allocation' to speak of if you are talking about a local integer variable - the stack pointer is simply advanced, creating space for the local variables to exist.


Brendane(Posted 2006) [#8]
"Hey, could someone with a Core 2 Duo run this code and show us your results, I'm just curious."

This code runs in a single thread - having 2 processors won't help you (except you'll get better performance because the other processor will be running other threads in the system).


Brendane(Posted 2006) [#9]
To satisfy your curiosity :-

Core 2 Quad core - 2.8Ghz results from the code posted by SculptureOfSoul :-

TestMethod1 time: 1398
TestMethod2 time: 5575


plash(Posted 2006) [#10]
I get:
TestMethod1 time: 9344
TestMethod2 time: 9917

Why are my times a lot closer than everyone else?


ImaginaryHuman(Posted 2006) [#11]
I would think also the local version is always going to be faster because it will keep the data in the cpu register where possible and not even access main memory or cache. To get access to the the field in the type, Blitz at a low level has to get the pointer to the type and then add an offset to it to get to the field and then read the field, versus reading it straight from a register, so it's likely to be faster just to use a local, as test results imply.

Putting something into object orientation and into a type and making fields is all nice and pretty. It's good for *spacial organization*, but is it not necessarily the best solution for *temporarl organization* ie making best use of time. Oop is not usually as efficient, for most things, as procedural code or good use of locals.


Pantheon(Posted 2006) [#12]
How will the variables hold up as they are returned from the function because the Field wouldnt need to be returned yet the local would. Any speed difference or would that be negligable?


Brendane(Posted 2006) [#13]
A return in this case is merely leaving the result in the EAX register (presume we are talking about x86 here)... like AngelDaniel says, it all depends on what the code does as to how BM generates it. In some circumstances the value can be kept in a register and never written into memory at all, or maybe it's not already in a register and must be read back from it's memory location - it all depends on the rest of the function's code. Memory accesses are slow and unpredictable - pages might be swapped out to the pagefile, memory might not be in the processor cache, other processes might be accessing memory causing your process to stall.. the results can vary every single time you run your code, let along on a different pc with different hardware and different software/drivers/other software running.

If you want to understand that better it's a good idea to learn a bit of assembler and delve into the code that BM generates.


SculptureOfSoul(Posted 2006) [#14]
Any particular books or tutorials you'd recommend for guickly getting up to speed on assembler. I don't have the time to really look into it at the moment, but definitely plan on learning enough to help myself optimize (not only in the normal sense of the word, but in regards to code layout/variable usage/etc) in the future.

Thanks.


plash(Posted 2006) [#15]
... assembler ...
Not to be rude or anything, but I believe its 'assembly'.


Brendane(Posted 2006) [#16]
"Not to be rude or anything, but i believe its 'assembly'."

Ah, the pedantic programmer.... is there any other kind?


"Any particular books or tutorials you'd recommend "
I learned back in the old days, z80 8086, 68000, 80386 etc. then MIPS for a PS1 / PS2 - so I can't really point you to anything specific. But once you get the idea pretty much all processors do the same thing. I recommend looking for a tutorial which explains the x86 instruction set and 32bit registers. Then look for tutorials which show how compilers generate assembler constructs for functions/loops/switch statements etc. you'll learn alot.


plash(Posted 2006) [#17]
Ah, the pedantic programmer.... is there any other kind?
I suppose not :)


SculptureOfSoul(Posted 2006) [#18]
Thanks Brendane. It's now on my "to learn list", along with Lua, Lisp, advanced socket programming, etc etc ;)


Blueapples(Posted 2006) [#19]
Not to be rude or anything, but i believe its 'assembly'.


The terms are used interchangeably. Basically, it doesn't matter:

Note that, in normal professional usage, the term assembler is often used ambiguously: It is frequently used to refer to an assembly language itself, rather than to the assembler utility. Thus: "CP/CMS was written in S/360 assembler" as opposed to "ASM-H was a widely-used S/370 assembler."

http://en.wikipedia.org/wiki/Assembly_language#Assembler


Brendane(Posted 2006) [#20]
@SculptureOfSoul :-

Here's another tip which can teach you alot - get a debugger. OllyDbg is excellent and free and will be invaluable as you start learning x86 assembly language -

http://www.ollydbg.de/