How many locals?

ImaginaryHuman(Posted 2006) [#1]
On today's modern CPUs, how many `Local` variables can I define and be fairly sure that they will all be allocated to CPU registers, rather than begin to spill over as memory variables? I.e., how many can be in use simultaneously and still be efficient?


rdodson41(Posted 2006) [#2]
Does BMX even put variables in registers if it can? If so, I think it just depends on how many registers you have.


Dreamora(Posted 2006) [#3]
Yes, locals are given to the CPU.

Angel: Don't worry about that. If you have too many of them, they end up in the second-level cache and then in regular RAM; it's not like in ASM, where they must reside in the CPU. Besides, I don't think you will fill up 512 KB that easily ;-) (That's the minimum current CPUs such as the AMD64 have. AMD doesn't need the cache that much, since its connection to RAM beats Intel's to death. Second-level cache goes up to 2 MB.)


ImaginaryHuman(Posted 2006) [#4]
Obviously I want to keep as many variables in CPU registers as possible. I know I don't have to worry about how Max handles the situation when there aren't enough to go around. What I want to know is a ballpark figure for how many Locals I can assume will be in registers. Then I can design my code to be more efficient. I know memory these days is pretty fast, but it's still not as fast as registers. Speed is important.

I thought PowerPC had like 32 registers or something? Same for AMD/Intel?


Defoc8(Posted 2006) [#5]
If you're worrying about register usage, I think you must have
your head screwed on backwards... Yes, speed is important,
but you cannot equate the number of locals stored in
registers with the speed of the resulting machine code. Anyway,
good luck.

- If you're really concerned about it, I'm pretty sure BMax
- outputs assembly files as part of its compiler operation.
- Personally I think this is a waste of time.


Chris C(Posted 2006) [#6]
The whole idea of compilers is to release you from this kind of madness. 10 times out of 10, if something seems to be taking too long, it's your algorithm, not how many local registers you're using...


Leiden(Posted 2006) [#7]
The original AMDs only had 16 registers: 8 general purpose and 8 XMM registers. The AMD64 series increased that to 32 registers: 16 general purpose and 16 XMM. RISC processors still have more, though.


FlameDuck(Posted 2006) [#8]
On today's modern CPUs, how many `Local` variables can I define and be fairly sure that they will all be allocated to CPU registers, rather than begin to spill over as memory variables?
None.


xlsior(Posted 2006) [#9]
None.


Since you're running on a multi-tasking operating system, there is no telling what the OS chooses to swap where and when.


rdodson41(Posted 2006) [#10]
PPC has 32 regular registers and 32 floating point registers, besides the other ones such as the link register and condition register. Obviously some of those are going to be used, but I'm sure you don't really have to worry about this; it's not going to make much of a difference. Like Chris C said, the reason higher-level languages were developed is so you don't have to worry about all the stuff you deal with when programming in assembly.


CS_TBL(Posted 2006) [#11]
What's going into the cache (say, the 512K Dreamora mentioned)? The actual vars, or only a reference to those vars?

e.g. does a local array of 2*512*512 bytes cost me 512k or 4 bytes?


ImaginaryHuman(Posted 2006) [#12]
Whether there is a point to it, whether it is useful in the modern era, or whether it is worth `worrying` about is entirely a matter of subjective opinion. I wasn't asking for everyone's judgements on the validity of the question; I just wanted to know the number. FlameDuck's response is interesting, but a worst-case scenario for sure. It's good to know that 16 registers is about a reasonable minimum. Thanks for all your observations.


FlameDuck(Posted 2006) [#13]
What's going into the cache (say, the 512K Dreamora mentioned)? The actual vars, or only a reference to those vars?
It depends. The data cache stores variables (actual data from memory); the instruction cache stores decoded opcodes for expensive operations. You'll rarely actually have 512K of data cache available at any given time. Also note that a value stored in the cache will take up more space than a value stored in memory.

e.g. does a local array of 2*512*512 bytes cost me 512k or 4 bytes?
It costs you a lot more. Depending on the cache used it would probably cost you 3 bytes of cache to store a 1 byte value (1 for the value, 2 for the address/association).

It's good to know that 16 registers is about a reasonable minimum.
All local variables are allocated on the stack, thus there is no way to allocate them directly to registers, particularly since this would limit the compiler's ability to optimize and rearrange opcodes. Remember, some processors still only have 4 general purpose registers, and the runtime is going to need at least one at any time, which in a best-case scenario means that you can have three variables in play at any given time before you need to switch values in and out of registers.