[REQUEST] Jump table for Select statements

BlitzMax Forums/BlitzMax Programming/[REQUEST] Jump table for Select statements

Koriolis(Posted 2007) [#1]
It would be a very nice improvment if any Select statement where case expressions are small integer constants could be generated as a jump table. Much like most C++ compiler do.
In some (common) cases this would be a very worthwile optimization.


Who was John Galt?(Posted 2007) [#2]
In some (common) cases this would be a very worthwhile optimization.
Like implementing a virtual machine :)

Yeah I second this one. It would be a nice addition.


Gabriel(Posted 2007) [#3]
Agreed.


Koriolis(Posted 2007) [#4]
Like implementing a virtual machine
By example :)
But believe it or not, it's not for the VM per se, as I code it in C++.


Azathoth(Posted 2007) [#5]
It would be nice if Blitzmax did half the optimizations C++ compilers do.


ImaginaryHuman(Posted 2007) [#6]
Use an array of function pointers with an integer index?


Raz(Posted 2007) [#7]
can someone explain this to a simpleton like me in simple terms please?


Perturbatio(Posted 2007) [#8]
http://en.wikipedia.org/wiki/Jump_table


Koriolis(Posted 2007) [#9]
Use an array of function pointers with an integer index
Problem 1: it's a lot of grunt work and makes the code much less maintainable.
Problem 2: you now have the overhead of function calling, which kinda defeats (to a certain point) the purpose

This is actually what I plan to do, but given these two problems, having jump table generated under the hood easily beats function pointers.


H&K(Posted 2007) [#10]
I agree with Angel that using functoion pointers is the solution. But I agree with you that underthehood lookup would be better.

Shame line number dont work the way they used to
'select
Gosub 9000+20*Var
Used to work wonders before I had select.
Can you have an array of Labels?
'select
Gosub LabelArray(Var)



ImaginaryHuman(Posted 2007) [#11]
In assembler a jump table was easily implemented because you'd have a label which acted as the base of the table and then you'd have some value, either immediately following an instruction or stored in a register, which acts as the offset, and then you'd do `indirect addressing` to go to the location Base+Offset and program flow would immediately continue from that location. That's the optimum jump table.

I very much doubt you can have arrays of labels, labels are not a `type` of data storage, they are compiler directives like they would be in the assembler program. What you ideally would want is an instruction that goes/gosubs to a label based on an offset (which normally is the byte location of the label relative to the instruction jumping to it). You'd need like a GoSelect instruction which is written at a low level, probably in C or something since BRL would want it to be portable.

Calling a function is actually not very much overhead when you're not passing any parameters, not much different than a gosub. So if you plan to gosub you might as well use an array of function pointers.


Koriolis(Posted 2007) [#12]
That's (part of) the problem, what do you do when you have lots of data to pass as parameters? That would precisely be my case.


ImaginaryHuman(Posted 2007) [#13]
It is actually faster to NOT pass the data, but to instead make the data Global and read it directly from within the routine that references it. Passing parameters is a slowdown, caused by putting a contextual barrier between the two parts of your code. You sometimes really dont need that barrier as it means a) you have to transfer the data to the function and b) the function has to read in the parameters, rather than just directly reading the parameters.


H&K(Posted 2007) [#14]
or, simply pass a pointer to the data, so only one parameter.

Also, Ive found that any function has a faster return if it returns somthing, even if its not allocated to any thing


Koriolis(Posted 2007) [#15]
@AngelDaniel: now you're actually making my point. This is much messier than my feature requests.
I appreciate your efforts to give alternatives guies, but I do know them.
This is a feature request, let's just see if I (well, we) end up lucky by getting the attention from BRL.


ImaginaryHuman(Posted 2007) [#16]
Interesting, H&K, about that return thing. I use a Global data pointer which is accessed within the function to get the parameters. It seemed faster than passing even just one parameter as the pointer. Also passing Var was not any faster.

Koriolis, fair enough, I mean, you really do need this to be coded at a low level to be more efficient than the solutions suggested. Good luck.


Koriolis(Posted 2007) [#17]
I don't *need* it, as I said what you suggested is precisely my plan B, but what I ask for would be way better.


MGE(Posted 2007) [#18]
"It would be a very nice improvment if any Select statement where case expressions are small integer constants could be generated as a jump table. Much like most C++ compiler do. In some (common) cases this would be a very worthwile optimization."

hmm... now you have me curious! What caused you to request this feature? How fast is your game running now?

I use Select extensively, and my execution speeds are fine.


ImaginaryHuman(Posted 2007) [#19]
In general jump tables were useful back when cpu's were slower and you needed to save time on complicated calculations by pre-calculating them into a jump table. But after a while the need for a jump table is not very much of an optimization anymore because the raw calculations are so much faster. So I'm not really so sure that you're going to gain hardly anything with a jump table over just using some function pointers or even a regular select-case.


Koriolis(Posted 2007) [#20]
The real problem is that the select statement in BlitzMax is not so "regular". Unlike in C++ by example, each case expression is tested in turn even if the was expessions are small integers. So the more cases you have, the slower it is, it's that simple. It's a classical "linear complixity over constant complexity" issue.
THAT is the problem, and there will definitely be a speed improvment by using a jump table, when you have lots of cases. Most of the time it's not an issue because you need to both have a need for the best speed (you normally only need this in very limted critical areas) AND have lots of cases in your select statement. The thing is that it is precisely my case.

Then, compared to using function pointers, the gain will be much more limited, but will still exist and above all will not require to rewrite the code in an unnatural way.

Also, why were you talking about calucaltions? Aren't confusing what jump tables are? Like sinus table and such? The code I have in my cases statements are not raw computations at all, and I'm even less talking about precomputed data.


ImaginaryHuman(Posted 2007) [#21]
Typically in `days gone by` people would use a jump table to store the *results* of complicated calculation, which includes things like Sin and Cos values, because computing them realtime was too slow.


H&K(Posted 2007) [#22]
I have to say, even though I agree with the "LookUps" are for old/slow cpus argument, an "On X Goto/Gosub/CallFunction" command would be a nice addition.

OK fair enough, once you have started with Arrays of pointers, and have a General solution done, you effectivly have your "On X" command, However with the first letter of Basic being Begginer a nice predefined solution wouldnt go ammis

(Im still tepmted to use Sin/Cos lookups:- But it has been shown to me that somethimes the lookup is slower than just using SIn/Cos, but you are garenteed to have the same result with packaged-pre-caculated look ups, which I belive your not with Sin/Cos)


Russell(Posted 2007) [#23]
Yeah, even Commodore 64 BASIC had this:
10 IF A=2 THEN JUMP=1
20 IF A=500 THEN JUMP=2
30 IF A=1000 THEN JUMP=3
40 ON JUMP GOSUB 100,200,300

100 REM DO SOMETHING FOR NUMBER ONE
102 RETURN
200 REM DO SOMETHING FOR NUMBER TWO
202 RETURN
300 REM DO SOMETHING FOR NUMBER THREE
302 RETURN

Although, in this case, since it is interpreted (rather than compiled), it is most likely added for convenience rather than speed.

Russell


Koriolis(Posted 2007) [#24]
Typically in `days gone by` people would use a jump table to store the *results* of complicated calculation, which includes things like Sin and Cos values, because computing them realtime was too slow.
Except this is not called a jump table. A jump table, as the name suggest is used to transfer control. Sin/Cos tables have nothing to do with the feature I request.


ImaginaryHuman(Posted 2007) [#25]
That basic code does not look efficient - how would it know to jump to 100 200 and 300 based on the index number? It has to do TWO jump table lookups.


H&K(Posted 2007) [#26]
I dont think it does need two lookup tables, (Well if it was compiled maybe).

But in the original interprated the actual listing was the lookup table, and all that happened was that you Gosubed
to Label/line number "Gosubcommand:ptr + Jump"

Thinking about it, actual Line Numbers are lookup tables for label pointers. You are right I suppose that the program cannot manipulate the Labels

Thing is , in the first Basic Compiler I had, (Hisoft)
Gosub 9000+Value*100
Worked. Oh well.


ImaginaryHuman(Posted 2007) [#27]
It needs two lookup tables because you can't translate directly from the jump value (1,2,3) to the line number (100,200,300) ... ie you could have had On Jump Gosub 50,500,1200 - the jump value has no relation to the line number so you'd have to do a lookup in a table or some kind of select-case type affair to convert from the index to the line number, and then jump to the line number.


H&K(Posted 2007) [#28]
I did realise that halfway though the post ;)

But I still posted it as a comment towards your "That basic code does not look efficient", because as I said, one of the lookup tables (Specificaly, the Line number Table), already exists, so you might as well use it.

I would still hold that the 100,200,300 (or 50,500,1200) are one, two and three "Token spaces" after the Gosub, respectivly.

Thats is, although I came to realise that you were right about there being two "Lookup Tables", I disagree that this is inefficient, because both of these tables already exist, the line numbers already have a lookup pointer system, and the Jumpto numbers exist in an interger progrestion right after the command. So you wouldnt need a select case system, you would just look "Index" number of "Number tokens" after the command.


dmaz(Posted 2007) [#29]
just one jump table is needed... H&K is correct with how the command works. if I remember right, the value just branches to the index number of the list following the gosub (or goto). if the value is greater or less than the number of items in the list the program continues with the next line. so the only jump table is line numbers.

[edit] unless you're counting the actual jump using the index?... actually there might not even be one jump table... I'm pretty sure that language was completely interpreted... no pre-compile.

[edit] scratch that, what was I thinking! sure, there had to be a jump table for the lines. don't mind me, it's late ;)

[edit] well, it seems there was no jump table for the line numbers either...
Much like a modern singly linked list, each program line was stored in memory as a line number, a pointer, and then the tokenized code for the line. The pointer contained address in memory of the next program line.
http://en.wikipedia.org/wiki/Commodore_BASIC


Koriolis(Posted 2007) [#30]
Now that Mark seems around again, I figured I might attempt a last *bump*.


Michael Reitzenstein(Posted 2007) [#31]
I'd find this really, really useful. The bottleneck in running BVM is in the interface by far - even when the select statement is replaced by functions.

It's actually the bottleneck to my game as a whole, as well.


Koriolis(Posted 2007) [#32]
Ahem, *bump*.
A "yes, sure" or "no because" would be great...


Dreamora(Posted 2007) [#33]
yes, sure there will be no answer because we as good as never know upfront what future features are implemented :)


Russell(Posted 2007) [#34]
Yeah, Commodore Basic was completely interpreted, but there were a few Basic compilers available for it, including one called 'Blitz'! (No relation to OUR Blitz, AFAIK).

'That basic code does not look efficient - how would it know to jump to 100 200 and 300 based on the index number? It has to do TWO jump table lookups.'

I'm pretty sure the jump value had to be 1,2,3,4,etc and could not be, for example, 1,3,17,23 although I could be wrong. I could see that it would be a bit slower if more 'thought' had to be put into the value of the jump variable (checking to make sure the next value is larger than the previous and smaller than the next and by how much, etc).

This sort of jumping has a direct correlation with jumping in assembler, so it should certainly be possible and even desirable in some cases.

Russell