Which is faster: using bytes or ints?

Boulderdash(Posted 2007) [#1]
Here is my interesting question, since I am working on my emulator again and want to optimize it to the "MAX" (no pun intended).

Would the resulting assembler code from compiling be faster if 32-bit values are used, or would smaller types be faster?

If my PC is 32-bit, wouldn't 32-bit values "fit" the CPU registers better and work faster?


Perturbatio(Posted 2007) [#2]
Ints are faster (so sayeth the Sibly)


ImaginaryHuman(Posted 2007) [#3]
Depends on how you use the data that is stored in the Int, I think. If you store your data as bytes but read a whole Int at a time and then use masking etc. to isolate each part, that might well be faster than using all Ints, since using all Ints means 4 times the amount of memory access.


Scaremonger(Posted 2007) [#4]
How do you perform Int masking into Bytes?


BladeRunner(Posted 2007) [#5]
out = ($44332211 shr 16) & $ff will separate $33 from the int.

shr 16 kicks the $2211 out, because those are 2 bytes (16 bits) which are shifted away.
The remaining $00004433 gets masked by & $ff, which blanks out the $44.
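
The two steps above can be traced with their intermediate values. This is only a minimal sketch; the variable names are made up for illustration and Hex comes from BRL.Retro.

```blitzmax
SuperStrict

' Illustrative trace of the shift-then-mask steps described above.
' Variable names are hypothetical, not from the original post.
Local value:Int = $44332211

' Step 1: Shr 16 discards the low two bytes ($2211), leaving $00004433.
Local shifted:Int = value Shr 16

' Step 2: & $FF keeps only the lowest byte, clearing the $44.
Local result:Int = shifted & $FF

Print "shifted = $" + Hex(shifted)
Print "result  = $" + Hex(result)
```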


QuietBloke(Posted 2007) [#6]
I know memory access is slow, but I would have thought that with modern CPUs, with their caching and look-ahead and whatever other 'magic' the Processor Pixies do, using plain ints would be faster than using ints and then shifting and masking them to get the data.

Of course I could be wrong.. It's something I do a lot.


ImaginaryHuman(Posted 2007) [#7]
If you are storing the result of that in a byte you don't need the `& $FF`. The byte automatically trims it. But you probably should use it anyway and store it in a Local Int variable, just because they're faster internally.

To get the first byte: out=$44332211 shr 24
To get the second byte: out=($44332211 shr 16) & $FF
To get the third byte: out=($44332211 shr 8) & $FF
To get the final byte: out=$44332211 & $FF
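
The four cases above follow one pattern, so they could be folded into a single helper. A rough sketch, assuming a hypothetical function name GetByte and counting bytes from the low end (index 0 is what the post calls the final byte):

```blitzmax
SuperStrict

' Hypothetical helper: returns one byte of value as an Int.
' index 0 is the lowest byte, index 3 the highest.
Function GetByte:Int(value:Int, index:Int)
	' Shr is a logical shift, so the extra mask is harmless for index 3.
	Return (value Shr (index * 8)) & $FF
End Function

Local value:Int = $44332211
For Local i:Int = 3 To 0 Step -1
	Print "byte " + i + " = $" + Hex(GetByte(value, i))
Next
```

In a tight emulator loop the function call itself has a cost, so writing the shift and mask inline may be preferable.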


Pantheon(Posted 2007) [#8]
You can change the order as well:

out = (input & $0000FF00 ) shr 08
out = (input & $00FF0000 ) shr 16

etc...

This is the same method that I'm using for storing opcodes in my emulator, and it's really fast. But then I can compare it to using entire ints.

Also, as another speedup, avoid function calls. I have only two function calls and 3 masking operations for most instructions.
Also try to use the Select statement, not successive Ifs, because the compiler can then optimise more.
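
As an illustration of the mask-then-shift decoding and the Select advice, here is a small sketch. The instruction layout, the opcode constants and the Execute function are all invented for this example and are not from any real emulator; whether Select really compiles to faster code than a chain of Ifs is the post's claim and is not measured here.

```blitzmax
SuperStrict

' Hypothetical 32-bit instruction layout, invented for this sketch:
'   bits 24-31 opcode, bits 16-23 destination register,
'   bits 8-15 source register, bits 0-7 immediate value.
Const OP_LOAD:Int = $01
Const OP_ADD:Int = $02

Local regs:Int[4]

Function Execute(instr:Int, regs:Int[])
	' Mask first, then shift, as in the post above.
	Local opcode:Int = (instr & $FF000000) Shr 24
	Local dst:Int = (instr & $00FF0000) Shr 16
	Local src:Int = (instr & $0000FF00) Shr 8
	Local imm:Int = instr & $000000FF

	' One Select on the opcode instead of a chain of Ifs.
	Select opcode
		Case OP_LOAD
			regs[dst] = imm
		Case OP_ADD
			regs[dst] = regs[dst] + regs[src]
		Default
			Print "unknown opcode $" + Hex(opcode)
	End Select
End Function

Execute($01000007, regs)   ' LOAD r0, 7
Execute($01010005, regs)   ' LOAD r1, 5
Execute($02000100, regs)   ' ADD r0, r1
Print "r0 = " + regs[0]
```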


PGF(Posted 2007) [#9]
Here is a tip - Don't optimise it or worry about fiddly stuff like this until and unless you know that there is a problem.

Write it in the simplest and most generic way first. Test it, then - if necessary - try out different optimisations.

It is a waste of time to overcomplicate something before you know that you need to.
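
One way to test before optimising is a crude timing loop. The sketch below keeps the low 8 bits of a running sum, once in an Int local (with an explicit mask) and once in a Byte local (which truncates on assignment). The iteration count is arbitrary, and the timings will vary with machine and with debug or release builds, so treat the numbers as a rough hint only.

```blitzmax
SuperStrict

' Rough timing sketch; ITERATIONS is an arbitrary choice.
Const ITERATIONS:Int = 50000000

' Int local: the mask has to be written out by hand.
Local start:Int = MilliSecs()
Local intSum:Int = 0
For Local i:Int = 0 Until ITERATIONS
	intSum = (intSum + i) & $FF
Next
Local intTime:Int = MilliSecs() - start

' Byte local: the result is truncated to 8 bits when it is stored.
start = MilliSecs()
Local byteSum:Byte = 0
For Local j:Int = 0 Until ITERATIONS
	byteSum = byteSum + j
Next
Local byteTime:Int = MilliSecs() - start

' Printing the sums shows that both loops computed the same value.
Print "Int local:  " + intTime + " ms (sum " + intSum + ")"
Print "Byte local: " + byteTime + " ms (sum " + byteSum + ")"
```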


ImaginaryHuman(Posted 2007) [#10]
I think that's a matter of personal choice. Some people like to make a general thing first and then optimize it, while others like to get it right and be finished with it the first time. I tend to be the latter, unless I find, after being satisfied with it, that it actually needs more improvement than I expected.


marksibly(Posted 2007) [#11]
Ints, Ints, Ints!


"If you store your data as bytes but read a whole Int at a time and then use masking etc. to isolate each part"


This can be true - and it can be more convenient, e.g. in the case of 'red shl 16 | green shl 8 | blue', but don't then go and do this:

Local red:Byte=color shr 16
Local green:Byte=color shr 8 & 255
Local blue:Byte=color & 255

...the vars won't be 'packed' in any way, and as soon as you need to do anything useful with them they'll have to be converted back to 32 bits (because we're using 32 bit CPUs!).

Instead, stick with Ints...

Local red:Int=color shr 16
Local green:Int=color shr 8 & 255
Local blue:Int=color & 255

So don't think of red, green and blue as 'bytes' packed into an Int, just think of them as compressed ints.
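
A round trip of the colour case, as a sketch with Int locals throughout. PackRGB is a hypothetical helper name. The "& 255" on red is an addition that is not in the snippet above: a Byte local would have dropped the upper bits on its own, but an Int local keeps them, which matters if the top byte of color holds something such as alpha.

```blitzmax
SuperStrict

' Hypothetical helper: packs three 0-255 components into one Int.
Function PackRGB:Int(red:Int, green:Int, blue:Int)
	Return red Shl 16 | green Shl 8 | blue
End Function

Local color:Int = PackRGB(200, 100, 50)

' Unpack into Int locals. The mask on red is an assumption added here
' to guard against a non-zero top byte (for example alpha).
Local red:Int = color Shr 16 & 255
Local green:Int = color Shr 8 & 255
Local blue:Int = color & 255

Print "rgb = " + red + ", " + green + ", " + blue
```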


Perturbatio(Posted 2007) [#12]
Hallowed is the Sibly.


ImaginaryHuman(Posted 2007) [#13]
Yes indeed. The local variables that you read memory into should be Ints, even if you are only putting bytes in there.

Mark, how do Longs work? Are they just 2 Ints? I think I found that copying Longs from memory to memory was a bit faster than copying Ints, maybe because of less loop overhead? Do Longs take any advantage of 64-bit buses or registers etc.?

I agree you should use Ints for local variables, then minimize memory usage by using the smallest type possible, and move stuff to/from memory at least an Int at a time.
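
A sketch of that combination: data stored compactly as bytes, fetched with a single Int-sized read, then unpacked into Int locals. It uses a bank (BRL.Bank) only because that is easy to check, and it assumes a little-endian CPU such as x86. The bank functions carry call overhead of their own, so this shows the idea rather than the fastest route.

```blitzmax
SuperStrict

' Store four values compactly, one byte each.
Local bank:TBank = CreateBank(4)
PokeByte bank, 0, $11
PokeByte bank, 1, $22
PokeByte bank, 2, $33
PokeByte bank, 3, $44

' One 32-bit read (little-endian assumed), then unpack into Int locals.
Local word:Int = PeekInt(bank, 0)
Local b0:Int = word & $FF
Local b1:Int = (word Shr 8) & $FF
Local b2:Int = (word Shr 16) & $FF
Local b3:Int = word Shr 24

Print "word = $" + Hex(word)
Print "bytes: " + b0 + " " + b1 + " " + b2 + " " + b3
```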

Mark, is there any benefit to `pipelining` code, so you aren't waiting around for memory access to be finished from the previous instruction?