BMax feature request - ability to inline a macro

Grey Alien(Posted 2006) [#1]
Hiya, have a look at this general function and code:

Function ccRound(flot#)
	Return Floor(flot+0.5)
End Function

a# = ccRound(1.2)
b# = 2.3
c# = ccRound(b#)


It would be really great if a keyword such as Macro could be used when defining the function, so that instead of a full function call being made each time ccRound is used, the code is automatically inlined (expanded in place) by the compiler. I could use this to great effect to speed up my particle routines.

What do you think?


Dreamora(Posted 2006) [#2]
Don't know if the speed difference is really that large.
BM's internal stuff (especially the math) is implemented in plain C, and I don't know if you can write anything within BM (with GC) where inlining makes a "real difference". (I mean more than a few milliseconds per million calls, as opposed to noise that simply depends on your CPU's dynamic frequency stepping.)

But generally it would be nice if something like this happened at some point,
as well as a general macro / alias keyword.


Grey Alien(Posted 2006) [#3]
The above function is just a small example. I have a function with 10 params and several lines, and I need to call it several times in a row in a wrapper function. Surely inlining those function calls so that the compiler just uses local variables instead of passing them to a function would be lots faster, esp. when I'm calling the function up to, and possibly beyond, 1000 times per frame?

So yeah, a macro keyword would be WAY cool.


Robert(Posted 2006) [#4]
"Surely inlining those function calls so that the compiler just uses local variables instead of passing them to a function would be lots faster, esp. when I'm calling the function up to, and possibly beyond, 1000 times per frame?"


No. Inlining function calls is a micro-optimisation, only to be used when the higher level stuff is as efficient as it can be, and certainly the overhead of 1000 function calls should be very small - assuming that BlitzMax is not doing anything daft behind the scenes.

It has been known for a long time that calls to subroutines are among the most common types of instruction executed in software programs, so they have been extensively optimised.

Perhaps if you were to explain how your particle routines work, we could suggest higher-level ways of improving performance?


DStastny(Posted 2006) [#5]
Actually Grey is correct: the context switch caused by calling a function is a huge hit. CPU optimizations aside, inlining is still a hugely effective way to optimize and improve performance. Look at any Vector Template in C++; it's all inlined for performance reasons.

Doug Stastny


gman(Posted 2006) [#6]
am i missing something here, or shouldn't this be easy to test?
Framework BRL.Blitz
Import BRL.Math
Import BRL.StandardIO

Function ccRound(flot#)
	Return Floor(flot+0.5)
EndFunction

Local i:Int=0
Local start_run:Int
Local end_run:Int
Local loop:Int=10000

GCSuspend()

For Local loopcnt:Int=0 Until 10

	Print "-------"
	Print "loop: "+loopcnt
	Print "-------"

	start_run=MilliSecs()
	DebugLog "Start (func): "+start_run
	For i=0 Until loop
		a# = ccRound(1.2)
		b# = 2.3
		c# = ccRound(b#)	
	Next
	end_run=MilliSecs()
	DebugLog "End (func): "+end_run
	DebugLog "-------"
	Print "Time (func): "+(end_run-start_run)
	Print ""
	
	start_run=MilliSecs()
	DebugLog "Start (no func): "+start_run
	For i=0 Until loop
		a# = Floor(1.2+0.5)
		b# = 2.3
		c# = Floor(b#+0.5)
	Next
	end_run=MilliSecs()
	DebugLog "End (no func): "+end_run
	Print "-------"
	Print "Time (no func): "+(end_run-start_run)
	Print ""
Next

GCResume()

my code's probably flawed somewhere, but i consistently get a significant difference between the runs with the function and the runs without it. every so often i do get one where the no-func time is greater. obviously at lower loop counts the difference is smaller. good design can't be beat, and if there is a way to reduce the amount of looping it's obviously a good thing. as would almost any language though, i think BMAX would definitely benefit from inlines.
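
A variation on the test above, offered only as a rough sketch: with 10000 iterations each pass may finish within the resolution of MilliSecs(), so this version raises the count and adds the results into a running total so that both loops do comparable work. The names ITERATIONS and sum and the count of ten million are assumptions made for illustration, not something from the thread, and the timings will depend on the machine and on debug vs release builds.

```blitzmax
SuperStrict

Framework BRL.Blitz
Import BRL.Math
Import BRL.StandardIO

' Same rounding helper as in the thread, with explicit types.
Function ccRound:Float(flot:Float)
	Return Float(Floor(flot + 0.5))
End Function

' Hypothetical iteration count; raise or lower it to suit your machine.
Const ITERATIONS:Int = 10000000

Local sum:Double = 0
Local b:Float = 2.3

' Pass 1: go through the function call.
Local t0:Int = MilliSecs()
For Local i:Int = 0 Until ITERATIONS
	sum :+ ccRound(b)
Next
Local t1:Int = MilliSecs()

' Pass 2: the same expression written out by hand.
For Local i:Int = 0 Until ITERATIONS
	sum :+ Floor(b + 0.5)
Next
Local t2:Int = MilliSecs()

Print "Time (func):    " + (t1 - t0) + " ms"
Print "Time (no func): " + (t2 - t1) + " ms"
' Printing the total keeps the results of both loops in use.
Print "Checksum: " + sum
```

Running it several times and comparing release builds is probably more telling than any single run.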


Defoc8(Posted 2006) [#7]
I've made this post before. I also talked briefly with
skid/simon via e-mail about the ability to inline assembly
code - he basically said he would like bmax to head in this
direction..but that it wasn't his call. So it may be that mark
wants to avoid low level stuff creeping directly into bmax..
which may confuse new developers, and result in lots of
community code that is both hard to read and less portable..
I don't know, I'm just trying to think of some reason why these
features may not be attractive to mark..personally I don't see
a problem with it..

I'd also like to be able to edit assembly and C++ files
directly inside the bmax editor, so I don't have to jump
between IDEs - syntax highlighting and parsing for these
would be great..would save a lot of time..perhaps this
could be developed as another module, I wouldn't mind
paying for it.

Then again - these are all wishlist items, and I'd prefer to
see the 3d module before any of this is considered.

:]


ozak(Posted 2006) [#8]
It's usually the compiler that's inlining functions as part of the optimization step, so it can be done without adding macros :)


Defoc8(Posted 2006) [#9]
That's assuming that the compiler works like that. Being
able to mark something as inline in your code, making it
explicit..is handy, even if this is only a suggestion to the
compiler. Macros have other uses, not simply expanding
a function...unrolling loops, for example. You need only look
at what's possible with macros in C++.
Inline assembly, on the other hand, gives you as much control
over code as you're going to get - don't assume that just because
a compiler can do this, that and the other, that it will...and
despite modern compilers being very good...they aren't
perfect.
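
To illustrate what "unrolling loops" means here, a small hand-written sketch in BlitzMax: it does manually what a macro or an optimising compiler could generate. The array contents and the unroll factor of 4 are arbitrary assumptions for illustration, and the unrolled loop assumes the array length is a multiple of 4.

```blitzmax
SuperStrict

' Hypothetical test data: 8 values, so the length is a multiple of 4.
Local values:Float[] = New Float[8]
For Local i:Int = 0 Until values.length
	values[i] = i + 0.5
Next

' Rolled loop: one addition and one loop test per element.
Local total:Float = 0
For Local i:Int = 0 Until values.length
	total :+ values[i]
Next

' Unrolled by hand, four elements per pass: fewer loop tests and jumps.
Local totalUnrolled:Float = 0
For Local i:Int = 0 Until values.length Step 4
	totalUnrolled :+ values[i] + values[i + 1] + values[i + 2] + values[i + 3]
Next

Print "Rolled:   " + total
Print "Unrolled: " + totalUnrolled
```

Whether the unrolled form is actually faster depends on the compiler and CPU, so it is worth timing before relying on it.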


Grey Alien(Posted 2006) [#10]
Yeah, I want an explicit Macro call just to be sure (I'm not worried about inline assembly as I "gave that up" on my Amiga when I got Blitz Basic 2!). Also, I made an erroneous statement: the function may be called 1000 times *per* logic loop, and my logic runs at 200FPS! So that's 200,000 times per second. Well worth optimising.

Robert: thanks for the offer of help but I'm fine with the high level optimisations (object pooling, use arrays etc).

gman: good test, I get 3-4 times faster with no functions. Thanks.

Really it's a shame we still have to think like this as developers, because in the old days it was all about code optimisation, but these days it's more about readability/expandability/maintainability and so on, and thus I'd rather use a function...but not if it's gonna slow down my particle functions hugely.


Damien Sturdy(Posted 2006) [#11]
I could really use this :)


Beaker(Posted 2006) [#12]
Is there no way to support something like this?:
http://www.nothingisreal.com/gpp/


Dreamora(Posted 2006) [#13]
Sure there is. We have the BMK sources, and that's the only thing needed to add new preprocessor-based commands.