Optimizations\Speed Tests

BlitzMax Forums/BlitzMax Programming/Optimizations\Speed Tests

BLaBZ(Posted 2013) [#1]
Is there a detailed post displaying the fastest ways to perform operations in BlitzMAX?

For example, list vs map vs array?

Or the quickest way to assign a boolean value?

Thanks!


Yasha(Posted 2013) [#2]
list vs map vs array


These have different performance characteristics. Lists are fastest for iterating over, arrays are fastest for choosing an element by random index, maps are fastest for choosing an element by noninteger key (among other things). They can all substitute for each other, but they aren't the same thing, and depending on how you want to use the collection, one might be better than another. So this is very context-specific.


the quickest way to assign a boolean value


I probably misunderstand, but if there was a faster way than = surely = would use it..?


Anyway, microoptimisations are usually totally unimportant and won't speed up your program. The only real performance wins are in algorithmic optimisations, i.e. restructuring code that's cache-hostile or retreads the same problem space into something tightly organised. There aren't really any magic machine instructions that can take already well-structured code and make it dozens of times faster - BlitzMax already uses the ones it knows about (there are a small number of exceptions, but to get them you'd need to write in C or assembly anyway).

Some ideas for structural optimisations:

-- rewrite widely-branching code to a memoizing or accumulating form (e.g. the classic naive Fibonacci function with two recursions retreads the same problem search for the second subvalue: using an accumulating loop can take trillions of operations off the computation for large values)

-- design operations on memory to favour cache lines (e.g. the old For x ... For y... over an image should probably be reversed to For y ... For x, so that accessed pixels fall in the same locality and don't force a cache miss - BlitzMax might fix this automatically, haven't tried)

-- loop over sorted rather than unsorted collections where possible to be friendly to the branch predictor, or where possible replace If constructs with a pure arithmetic version that doesn't jump (just don't take this too far or you can slow it down again)


But local microoptimisations - e.g. rewriting code that takes five machine instructions and two indirections to use three instructions and one indirection ... will have pretty much no effect at all on modern systems. Especially in general app code. It's usually best to focus on writing clear code (that's more or less the specific thing BlitzMax was designed for), and if you really need fast number-crunching later on having established the clear version is too slow, farm the hard parts out to C or LuaJIT.


xlsior(Posted 2013) [#3]
the quickest way to assign a boolean value


Per BRL, the fastest datatype in Blitzmax is INT, so the fastest way to deal with a boolean is to simply stick it in an int and call it good... Rather than trying to address individual bits in a byte or some other 'optimization'.


zzz(Posted 2013) [#4]
Pretty much what Yasha said.

Blitzmax's compiler is pretty "flat", it wont do much optimizing for you. On the other hand things usually carry through well to assembly.

The one microoptimisation that are worth mentioning is getting rid of division and power operators in time-critical blitzmax code since it wont try and get rid of them by itself. Inverse values and bitwise operators will be much faster if you can make use of them. The tradeback is obviously less readable code, and i completely agree with the suggestion of doing this code in some other language instead.

edit: just wanted to point out that yashas example of x/y loop order is correct, and a pretty good example of how slow non-linear memory access can be compared to linear access.

edit2:


loop over sorted rather than unsorted collections where possible to be friendly to the branch predictor, or where possible replace If constructs with a pure arithmetic version that doesn't jump (just don't take this too far or you can slow it down again)



Just a note here, branching is only bad if its unpredictable. Im gonna go out on a limb and say that the default behaviour is considering a branch to miss since this is how the compiler assembly is usually laid out. Putting in some pseudocode for clarification. Execution could of course continue after the if statement even if the branch were taken, but the important thing is that because of how modern cpus work, they need to know ahead of time which code that will be executed next to run efficiently.