Faster comparisons

ImaginaryHuman(Posted 2005) [#1]
If I have an int variable that contains the value 4, would it be faster to do:

If variable>0

or

If variable<>0

??

I know that in assembler you can check for `nonzero` without having to compare against another value or register, whereas >0 probably needs extra time. Or are both probably the same on a modern CPU?


Shambler(Posted 2005) [#2]
Just tested it and if variable<>0 comes out faster by a whisker of a whisker after many millions of operations.

It doesn't look like Blitz is optimising anything under the hood here, or that any such optimisation would have any great effect.
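A rough sketch of the kind of test described above, written in C rather than BlitzMax. This is not Shambler's actual benchmark: the function names (`count_positive`, `count_nonzero`, `time_counter`) are made up for illustration, and the timings it reports will vary with compiler, flags and CPU.

```c
#include <stddef.h>
#include <time.h>

/* Count the elements that pass the "> 0" test. */
size_t count_positive(const int *values, size_t n)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        if (values[i] > 0) {
            hits++;
        }
    }
    return hits;
}

/* Count the elements that pass the "<> 0" test. */
size_t count_nonzero(const int *values, size_t n)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        if (values[i] != 0) {
            hits++;
        }
    }
    return hits;
}

typedef size_t (*counter_fn)(const int *, size_t);

/* Run one counter `repeats` times and return the CPU time used, in seconds.
 * The volatile sink keeps the result from being discarded outright; an
 * optimizing compiler may still simplify the loop, so treat the numbers
 * as rough at best. */
double time_counter(counter_fn counter, const int *values, size_t n,
                    int repeats, size_t *last_count)
{
    volatile size_t sink = 0;
    clock_t start = clock();
    for (int r = 0; r < repeats; r++) {
        sink = counter(values, n);
    }
    clock_t end = clock();
    *last_count = sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling `time_counter` once with `count_positive` and once with `count_nonzero` on the same large array gives two times to compare. Expect the gap to be within the noise on most machines.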


rdodson41(Posted 2005) [#3]
If that is what is really slowing down your code, something is seriously wrong.


DH(Posted 2005) [#4]
X>Y
and
X<>Y
Should take the same time regardless. The processor has built-in comparisons, and since this is simply a register-to-register comparison it should take the same time for both (the processor does it simultaneously).

When you compare two registers in a processor, it runs each register through a series of logic gates. One input of each gate goes to the first bit of the first register, the other goes to the first bit of the second register, and so on.

When the enable flag goes true on the chip it opens the gates for input. The output is true or false (for the comparison: is it equal or not equal). Behind the scenes (elsewhere on the chip) another series of AND and XOR gates works out which is greater than the other (i.e. is reg 1 greater or less than reg 2). This is simultaneous as well (OK, well, as fast as an AND gate can switch states, which is next to nothing).

So when you ask whether A>B, the CPU is returning the secondary portion of the compare. When you're asking whether A<>B or A=B, the CPU is returning the primary portion of the compare. When you ask whether A>=B, it returns an OR of both the primary and secondary portions of the compare.

So, there should be no difference between the two.
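To make the "one compare, several answers" idea concrete, here is a small C model of x86-style condition flags. It is a simplified illustration, not a description of any particular chip: the `Flags` struct and the helper names are invented for this sketch. A single subtraction sets all the flags, and each condition (not equal, greater, greater or equal) just reads a different combination of them.

```c
#include <stdbool.h>
#include <stdint.h>

/* The three flags a signed compare needs (modelled on x86 ZF, SF, OF). */
typedef struct {
    bool zero;      /* result of a - b was zero         */
    bool sign;      /* top bit of the result            */
    bool overflow;  /* signed overflow occurred in a - b */
} Flags;

/* One subtraction produces every flag at once. */
Flags compare(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a;
    uint32_t ub = (uint32_t)b;
    uint32_t diff = ua - ub;  /* unsigned arithmetic wraps, like the hardware */

    Flags f;
    f.zero = (diff == 0);
    f.sign = ((diff >> 31) & 1u) != 0;
    /* Overflow: operands had different signs and the result's sign
     * differs from the sign of a. */
    f.overflow = ((((ua ^ ub) & (ua ^ diff)) >> 31) & 1u) != 0;
    return f;
}

/* Each condition only reads flags; no further arithmetic is needed. */
bool is_not_equal(Flags f)        { return !f.zero; }
bool is_greater(Flags f)          { return !f.zero && (f.sign == f.overflow); }
bool is_greater_or_equal(Flags f) { return f.sign == f.overflow; }
```

In this model, `variable>0` and `variable<>0` both cost one `compare` followed by a flag check, which is the point being made above.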

When making optimizations you must really know what you're doing on the CPU. A+B is cheap, A*B is more expensive, Sin(A) is extremely expensive. Why?

A+B is a bitwise XOR operation with a carry. If REG1 bit 1 = 1 and REG2 bit 1 = 1 then carry the 1 and output a 0.
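The XOR-plus-carry idea can be sketched in C. This is a software illustration of the principle only (real adders do this in parallel in hardware), and the name `add_with_gates` is made up for the example.

```c
#include <stdint.h>

/* Add two 32-bit values using only XOR, AND and shift.
 * XOR gives the sum of each bit pair without carries;
 * AND finds the positions where both bits are 1, which
 * must carry into the next bit up. */
uint32_t add_with_gates(uint32_t a, uint32_t b)
{
    while (b != 0) {
        uint32_t carry = (a & b) << 1;  /* bits that overflow into the next position */
        a = a ^ b;                      /* per-bit sum, ignoring carries */
        b = carry;                      /* feed the carries back in */
    }
    return a;
}
```

The loop ends once no carries are left, which takes at most 32 rounds for 32-bit values.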

A*B is built on the same bitwise addition, but using a multiplier. Done naively it is A+A (B times). To get around this the CPU uses a base number, does a bit shift, and voilà (it's a bit more complex than that, but you get the point).
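A hedged C sketch of the two approaches to multiplication mentioned above: repeated addition versus shift-and-add. Both function names are invented for illustration, and real hardware multipliers are considerably more elaborate than either.

```c
#include <stdint.h>

/* Naive version: add a to itself b times. Cost grows with the value of b. */
uint32_t multiply_repeated_add(uint32_t a, uint32_t b)
{
    uint32_t product = 0;
    for (uint32_t i = 0; i < b; i++) {
        product += a;
    }
    return product;
}

/* Shift-and-add: look at b one bit at a time. For every 1 bit, add a
 * shifted to that bit's position. Cost grows with the number of bits
 * in b (at most 32 rounds), not with its value. */
uint32_t multiply_shift_add(uint32_t a, uint32_t b)
{
    uint32_t product = 0;
    while (b != 0) {
        if (b & 1u) {
            product += a;
        }
        a <<= 1;  /* a now stands for a * 2, a * 4, ... */
        b >>= 1;  /* move on to the next bit of b */
    }
    return product;
}
```

Both return the same result (modulo 2^32), but the second needs far fewer additions for large values of `b`.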

Sin(a) we're not going to go into. It requires a bit of math.

Compares work the same way. A comparison between two registers costs the same regardless of the way you're comparing them. Doing "is A=B" is easy. Doing "is A=B and B=C" is merely doing three operations as opposed to one (first checking A against B, then B against C, then combining the two results).

So if you want to make small optimizations then look elsewhere :-) For example, look at the loops you're doing and how many moves on the program stack you're making (i.e. calling functions), complex math that can be simplified, or messy comparisons that can also be simplified.

As a good measure, stay away from comparing strings if you can help it.

Hope this helps!


Snarkbait(Posted 2005) [#5]
If you just want to check if it is nonzero, you can just go:

if variable
...
endif

don't think it's a discernible speed difference tho.
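For comparison, the same three tests written as C functions (C also treats any nonzero integer as true in an `if`). The function names are hypothetical. Note that the implicit test matches `<>0`, not `>0`: the two only agree when the variable is never negative.

```c
#include <stdbool.h>

/* "if variable": true for any nonzero value. */
bool is_set_implicit(int variable)
{
    if (variable) {
        return true;
    }
    return false;
}

/* "if variable<>0": the explicit form of the same test. */
bool is_set_explicit(int variable)
{
    return variable != 0;
}

/* "if variable>0": differs from the two above for negative values. */
bool is_positive(int variable)
{
    return variable > 0;
}
```

With the value 4 from the original question all three return true; with -4 the first two return true and `is_positive` returns false.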


ImaginaryHuman(Posted 2005) [#6]
Good idea about treating the variable as boolean, for tidiness' sake, Snarkbait.

I understand how processors work. On the 68000 series, however, operations such as branching based on conditions would use automatically set results produced from certain previous operations. You didn't have to explicitly compare two variables in order to set the condition flags, and the condition flags, once set, could be used in the decision of whether to branch - based on a given flag. So you could branch if zero, branch if nonzero, branch if greater than zero, etc... so it was faster to do that than to compare two variables.

I also don't think that anyone can ever claim that a certain piece of code is pointless to be optimized under all circumstances, because you have no idea how many times it is being called or how important it is in the overall picture. This is for a very very time-critical highly-used lots-of-processing routine so little bits of difference really do add up.

It's not so important that I'll lose sleep over it, especially as Blitz seems to not optimize it too well. I just wondered.


skidracer(Posted 2005) [#7]
"I also don't think that anyone can ever claim that a certain piece of code is pointless to be optimized under all circumstances, because you have no idea how many times it is being called or how important it is in the overall picture. This is for a very very time-critical highly-used lots-of-processing routine so little bits of difference really do add up."


I would question that mindset completely. After reading your worklog I thought you may have finally abstracted yourself from the machines you were programming.

Once a project is finished, maybe, but in development your mantra should always be: simplify to reduce complexity, not obfuscate to increase efficiency.


Robert(Posted 2005) [#8]
"I also don't think that anyone can ever claim that a certain piece of code is pointless to be optimized under all circumstances, because you have no idea how many times it is being called or how important it is in the overall picture. This is for a very very time-critical highly-used lots-of-processing routine so little bits of difference really do add up."



I would be very surprised if I saw your code and could not find a more significant optimisation.

This kind of thing is so trivial it doesn't even matter on a Z80, let alone a 3GHz Pentium 4.


Warren(Posted 2005) [#9]
It's worrying about stuff like this that kills projects.


skidracer(Posted 2005) [#10]
Reading the .s files that BlitzMax produces in release mode in the project's .bmx directory makes for some interesting reading. I was thinking that on Intel

test eax,eax

was possibly a slight improvement over

cmp eax,0

but the five generations of Pentiums you have to consider, and of course the plethora of AMD designs, make the consideration seem a bit fruitless.
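As a small illustration of why the two instructions are interchangeable for a zero check: `test eax,eax` sets the zero flag from `eax AND eax`, while `cmp eax,0` sets it from `eax - 0`. The C model below (function names invented for this sketch) shows both leave the zero flag in the same state for any input. The usual argument for `test` is its shorter encoding, not a different result.

```c
#include <stdbool.h>
#include <stdint.h>

/* Zero flag as "test eax,eax" would set it: result of eax AND eax. */
bool zero_flag_after_test(uint32_t eax)
{
    return (eax & eax) == 0;
}

/* Zero flag as "cmp eax,0" would set it: result of eax - 0. */
bool zero_flag_after_cmp_zero(uint32_t eax)
{
    return (uint32_t)(eax - 0u) == 0;
}
```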


ImaginaryHuman(Posted 2005) [#11]
Mr skidracer, your first comment there is offensive.

I appreciate your opinion but I don't agree with you.

Robert, there does come a point where something can't be optimized much further.

Warren, you're taking it too seriously. It's just a question.