Global Variable Speed up/down


H&K(Posted 2006) [#1]
I was bored, so I went through the archive and found a C++ vs. Blitz comparison by Nikko:

;--------------------------------------Blitz
t=MilliSecs()
For i=1 To 1000000
If i=1 Then j=2
If j=2 Then j=4 Else j=5
j=j+256
y#=y#+Cos(i)/100 'Changed to y:Float for max
Next
Print MilliSecs()-t

So I copied it over to BlitzMax and ran it without debug (BlitzMax 110 ms, Blitz3D 105 ms).
Odd, I thought, so I put some Globals at the start:

Global t:Int,j:Int,y:Float,i:Int (BlitzMax)
Global t,j,y#,i (Blitz3D)
(BlitzMax 113, Blitz3D 102)

Then I deleted y:Float and y# from the Global line:
(BlitzMax 93, Blitz3D 102)

Deleting i:Int as well:
(BlitzMax 114, Blitz3D 103)

Putting y:Float back (Global t:Int,j:Int,y:Float and Global t,j,y#):
(BlitzMax 89, Blitz3D 103)

etc

Conclusion:

Blitz3D gives about the same result whether I declare Globals or not.

BlitzMax, on the other hand, slows down or speeds up depending on which variables are declared Global.

Questions:
1) In Strict mode, is the program going to run slower because I have declared my variables?
2) What's going on?


Dreamora(Posted 2006) [#2]
1) Not really. It even has some very important pros: locals only exist in the scope you declare them in. That's something non-Strict won't do (at least it did not in earlier versions).

2) BM is more heavily optimized. It tries to hold locals directly in the CPU, which makes them much faster. Globals, on the other hand, are held in RAM and need to be transferred to the CPU first.

B3D did not make this distinction and always ran at the same speed.


ImaginaryHuman(Posted 2006) [#3]
The thing here is that you should declare that your variables are Local, otherwise they will be assumed to be Global from what I understand. At the moment your code is not using Locals at all.

Local t:int,j:int,Y:float
Local t,j,y#

That should be even faster.
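
For reference, here is a minimal sketch of the original benchmark with every variable declared Local under SuperStrict. The use of SuperStrict, the explicit Float() conversion and the output text are my own assumptions rather than anything from Nikko's original; whether this version is actually faster is something to measure, not assume.

```blitzmax
SuperStrict

' Every variable is declared explicitly; SuperStrict rejects undeclared names.
Local j:Int, y:Float
Local t:Int = MilliSecs()

For Local i:Int = 1 To 1000000
    If i = 1 Then j = 2
    If j = 2 Then j = 4 Else j = 5
    j :+ 256
    ' Cos() takes degrees and returns a Double, so convert before adding to the Float.
    y :+ Float(Cos(i) / 100)
Next

Print "Elapsed ms: " + (MilliSecs() - t)
```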


H&K(Posted 2006) [#4]
No.
If I declare them all as Local, it's the same speed as if I had not declared them at all.

But as Local outside the loop and Global are the same thing, I didn't expect it to make a difference.

BUT... if I do declare t, j and y as Global, it's faster.

The reason for my post is that I tend to declare loop variables as Global (i.e. Loop1, Loop2, Loop3). Obviously (after reading these posts) it's quicker with
For Local Loop1 =1 to ......
But does Loop1 exist after the end of the loop?


Dreamora(Posted 2006) [#5]
No, it does not exist afterwards. It only exists within the scope of the loop.
But you could declare it Local outside the loop.
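
A small sketch of the two cases, assuming SuperStrict; the variable names and the summing loop body are only there for illustration.

```blitzmax
SuperStrict

Local total:Int

' Declared outside the loop: loop1 is still in scope after Next.
Local loop1:Int
For loop1 = 1 To 10
    total :+ loop1
Next
Print "loop1 after the loop: " + loop1

' Declared in the For statement: loop2 only exists inside the loop.
For Local loop2:Int = 1 To 10
    total :+ loop2
Next
' Print loop2   ' would not compile here, loop2 is out of scope

Print "total: " + total
```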


H&K(Posted 2006) [#6]
Why can't it be optimised to hold the loop variable in the CPU whether it's a Global or a Local?
But ha, it's not really a problem.


Dreamora(Posted 2006) [#7]
Because globals are on a different stack than locals... If it were not done this way, BM would end up as slow as B3D.


H&K(Posted 2006) [#8]
But if you look at the numbers in the test above BMAX is SLOWER than Blitz3D.
I agree that there is some form of optimisation going on in the background. But surely it shouldn't optimise to slower than Blitz3D.
One way to solve this would be to COPY the loop variable from the Global stack to the Local stack, so that a loop variable is always considered a Local variable, even though it has Global scope.


ImaginaryHuman(Posted 2006) [#9]
Note that Cos() returns a Double. I'm not sure how Blitz3D did variables or whether it had the Double type, but a Double is twice the size of a Float which might have something to do with it.

Try precalculating Cos() into a lookup table on each version and see how they compare.
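
A rough sketch of what that could look like in BlitzMax. It assumes whole-degree arguments (Cos takes degrees, so the values repeat every 360) and uses a made-up table name; the Blitz3D version would need its own array syntax.

```blitzmax
SuperStrict

' Precompute Cos for whole degrees 0..359.
Local cosTable:Float[360]
For Local d:Int = 0 To 359
    cosTable[d] = Float(Cos(d))
Next

Local y:Float
Local t:Int = MilliSecs()

For Local i:Int = 1 To 1000000
    ' i Mod 360 maps the loop counter back into the table range.
    y :+ cosTable[i Mod 360] / 100
Next

Print "Lookup-table loop ms: " + (MilliSecs() - t)
```

This takes the Cos() call out of the timed loop, so what remains is mostly the array access and the float addition.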


Dreamora(Posted 2006) [#10]
Define i as Float, or at least pass it to Cos as a float, Float(i). In that case it should operate on float arithmetic, which is highly optimized (this is not possible for double to the same level, as it is twice as large as what the CPU accepts on 32-bit systems) and thus several times faster.


H&K(Posted 2006) [#11]
I had been forcing float arithmetic with /100.0
And if you define I as Float it slows down.


FlameDuck(Posted 2006) [#12]
"But if you look at the numbers in the test above BMAX is SLOWER than Blitz3D."
The test (and the variances) are statistically insignificant.
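
One way to see how much of a 90 vs. 110 ms difference is just noise is to repeat the measurement and look at the spread. The following is only a sketch: the function name, the run count of 20 and the output format are all made up for illustration.

```blitzmax
SuperStrict

Const RUNS:Int = 20

' Runs the benchmark loop once and returns the elapsed milliseconds.
Function TimeLoop:Int()
    Local j:Int, y:Float
    Local t:Int = MilliSecs()
    For Local i:Int = 1 To 1000000
        If i = 1 Then j = 2
        If j = 2 Then j = 4 Else j = 5
        j :+ 256
        y :+ Float(Cos(i) / 100)
    Next
    Return MilliSecs() - t
End Function

Local best:Int = TimeLoop()
Local worst:Int = best
Local total:Int = best

For Local r:Int = 2 To RUNS
    Local ms:Int = TimeLoop()
    best = Min(best, ms)
    worst = Max(worst, ms)
    total :+ ms
Next

Print "min " + best + " ms, max " + worst + " ms, mean " + (Float(total) / RUNS) + " ms"
```

If the gap between min and max for one variant is as large as the gap between two variants, a single run of each says very little.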

First of all, it contains 2 'empty' if statements (If i=1 and If j=2) so is a better indication of how often a branch prediction misses (and thus the CPU pipeline has to be flushed), rather than of which is better at crunching numbers.

"And if you define I as Float it slows down."
From what I understand, the instruction cache does not cache FP instructions.

If you want to test speed with a more reliable benchmark, try Jim Brown's collection of "Sieve of Eratosthenes" programs. In short, BlitzMAX is about twice as good at crunching numbers as Blitz3D and PureBasic, and about 80% as fast as C++.
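
For anyone who has not seen it, a generic Sieve of Eratosthenes in BlitzMax might look roughly like this. It is not Jim Brown's code, just a plain sketch with an assumed limit of 1,000,000 and my own variable names.

```blitzmax
SuperStrict

Const LIMIT:Int = 1000000

' composite[n] = 1 means n has been crossed out; array elements start at 0.
Local composite:Byte[LIMIT + 1]
Local count:Int
Local t:Int = MilliSecs()

For Local n:Int = 2 To LIMIT
    If composite[n] = 0
        count :+ 1
        ' Cross out every multiple of n. A While loop is used because
        ' For...Step needs a constant step value.
        Local m:Int = n + n
        While m <= LIMIT
            composite[m] = 1
            m :+ n
        Wend
    EndIf
Next

Print "Primes up to " + LIMIT + ": " + count
Print "Elapsed ms: " + (MilliSecs() - t)
```

There are 78498 primes up to 1,000,000, which gives a quick sanity check that a port to another language is doing the same work.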