Code archives/Algorithms/Compute fast ceil/floor of log base 2
This code has been declared by its author to be Public Domain code.
Download source code
| |||||
This is the output I get on for the example code running BlitzMax 1.50 on OSX Mavericks: Ceil(Log2(x)) using doubles: 409 ms Ceil(Log2(x)) using clog2(): 141 ms Note that these functions will only work for positive integers. | |||||
' --+-----------------------------------------------------------------------------------------+-- ' | This code was originally written by Sophie Kirschner (sophiek@pineapplemachine.com) | ' | It is released as public domain. Please don't interpret that as liberty to claim credit | ' | that isn't yours, or to sell this code when it could otherwise be obtained for free | ' | because that would be a really shitty thing of you to do. | ' --+-----------------------------------------------------------------------------------------+-- SuperStrict ' Example code Rem Local ms%,nms% Local i%,x!,y% Const log2!=0.69314718055994529 local cycles%=20000000 ' Use a long sequence of random numbers of various powers to avoid apparent speedups due to consistent branching local ral% = 1024 local ra%[ral] for local i% = 0 until ra.length local p% = rand(0,31) ra[i] = abs( (1 shl p) + rand(0,$ffff) ) next ' eat some cycles before going on with the important part For i%=0 Until cycles x=Ceil(Log(i)/log2) Next ' test using doubles ms=MilliSecs() For i%=0 Until cycles x=Ceil(Log( ra[i mod ral] )/log2) Next nms=MilliSecs()-ms Print "Ceil(Log2(x)) using doubles: "+nms+" ms" ' test using clog2() ms=MilliSecs() For i%=0 Until cycles y=clog2( ra[i mod ral] ) Next nms=MilliSecs()-ms Print "Ceil(Log2(x)) using clog2(): "+nms+" ms" EndRem Private Global log2_array%[]=[ $80000000,$40000000,$20000000,$10000000,$08000000,$04000000,$02000000,$01000000, .. $00800000,$00400000,$00200000,$00100000,$00080000,$00040000,$00020000,$00010000, .. $00008000,$00004000,$00002000,$00001000,$00000800,$00000400,$00000200,$00000100, .. $00000080,$00000040,$00000020,$00000010,$00000008,$00000004,$00000002,$00000001 ] Public Rem bbdoc: Returns Ceil( Log2( x ) ) when x is a positive integer about: Much faster than using floating point or double functions. Good for when you don't need to know the exact value. EndRem Function clog2%(x%) If (x & log2_array[$00]) Then Return (x<>log2_array[$00]) If (x & log2_array[$01]) Then Return $01+(x<>log2_array[$01]) If (x & log2_array[$02]) Then Return $02+(x<>log2_array[$02]) If (x & log2_array[$03]) Then Return $03+(x<>log2_array[$03]) If (x & log2_array[$04]) Then Return $04+(x<>log2_array[$04]) If (x & log2_array[$05]) Then Return $05+(x<>log2_array[$05]) If (x & log2_array[$06]) Then Return $06+(x<>log2_array[$06]) If (x & log2_array[$07]) Then Return $07+(x<>log2_array[$07]) If (x & log2_array[$08]) Then Return $08+(x<>log2_array[$08]) If (x & log2_array[$09]) Then Return $09+(x<>log2_array[$09]) If (x & log2_array[$0A]) Then Return $0A+(x<>log2_array[$0A]) If (x & log2_array[$0B]) Then Return $0B+(x<>log2_array[$0B]) If (x & log2_array[$0C]) Then Return $0C+(x<>log2_array[$0C]) If (x & log2_array[$0D]) Then Return $0D+(x<>log2_array[$0D]) If (x & log2_array[$0E]) Then Return $0E+(x<>log2_array[$0E]) If (x & log2_array[$0F]) Then Return $0F+(x<>log2_array[$0F]) If (x & log2_array[$10]) Then Return $10+(x<>log2_array[$10]) If (x & log2_array[$11]) Then Return $11+(x<>log2_array[$11]) If (x & log2_array[$12]) Then Return $12+(x<>log2_array[$12]) If (x & log2_array[$13]) Then Return $13+(x<>log2_array[$13]) If (x & log2_array[$14]) Then Return $14+(x<>log2_array[$14]) If (x & log2_array[$15]) Then Return $15+(x<>log2_array[$15]) If (x & log2_array[$16]) Then Return $16+(x<>log2_array[$16]) If (x & log2_array[$17]) Then Return $17+(x<>log2_array[$17]) If (x & log2_array[$18]) Then Return $18+(x<>log2_array[$18]) If (x & log2_array[$19]) Then Return $19+(x<>log2_array[$19]) If (x & log2_array[$1A]) Then Return $1A+(x<>log2_array[$1A]) If (x & log2_array[$1B]) Then Return $1B+(x<>log2_array[$1B]) If (x & log2_array[$1C]) Then Return $1C+(x<>log2_array[$1C]) If (x & log2_array[$1D]) Then Return $1D+(x<>log2_array[$1D]) If (x & log2_array[$1E]) Then Return $1E+(x<>log2_array[$1E]) If (x & log2_array[$1F]) Then Return $1F+(x<>log2_array[$1F]) End Function Rem bbdoc: Returns Floor( Log2( x ) ) when x is a positive integer about: Much faster than using floating point or double functions. Good for when you don't need to know the exact value. EndRem Function flog2%(x%) If (x & log2_array[$00]) Then Return $00 If (x & log2_array[$01]) Then Return $01 If (x & log2_array[$02]) Then Return $02 If (x & log2_array[$03]) Then Return $03 If (x & log2_array[$04]) Then Return $04 If (x & log2_array[$05]) Then Return $05 If (x & log2_array[$06]) Then Return $06 If (x & log2_array[$07]) Then Return $07 If (x & log2_array[$08]) Then Return $08 If (x & log2_array[$09]) Then Return $09 If (x & log2_array[$0A]) Then Return $0A If (x & log2_array[$0B]) Then Return $0B If (x & log2_array[$0C]) Then Return $0C If (x & log2_array[$0D]) Then Return $0D If (x & log2_array[$0E]) Then Return $0E If (x & log2_array[$0F]) Then Return $0F If (x & log2_array[$10]) Then Return $10 If (x & log2_array[$11]) Then Return $11 If (x & log2_array[$12]) Then Return $12 If (x & log2_array[$13]) Then Return $13 If (x & log2_array[$14]) Then Return $14 If (x & log2_array[$15]) Then Return $15 If (x & log2_array[$16]) Then Return $16 If (x & log2_array[$17]) Then Return $17 If (x & log2_array[$18]) Then Return $18 If (x & log2_array[$19]) Then Return $19 If (x & log2_array[$1A]) Then Return $1A If (x & log2_array[$1B]) Then Return $1B If (x & log2_array[$1C]) Then Return $1C If (x & log2_array[$1D]) Then Return $1D If (x & log2_array[$1E]) Then Return $1E If (x & log2_array[$1F]) Then Return $1F End Function |
Comments
| ||
23 Feb 2015: I removed the for loop and simply wrote out each iteration, which resulted in a significant increase in speed. I also tried a few other algorithms but nothing could rival the listed if/then/return statements, probably due to compiler peculiarities. |
| ||
Cool :-) But the answer from clog2() is always 32 :-p |
| ||
In your enthusiasm to unroll your loops you introduced a bug. The ($1F-$00) parts should probably match the array value... eg. ($1F-$11), ($1F-$12), etc. Some interesting stats. On legacy BlitzMax OS X, 32-bit : Ceil(Log2(x)) using doubles: 509 ms Ceil(Log2(x)) using clog2(): 345 ms On bmx-ng, OSX, 32-bit : Ceil(Log2(x)) using doubles: 685 ms Ceil(Log2(x)) using clog2(): 0 ms And bmx-ng, OS X, 64-bit : Ceil(Log2(x)) using doubles: 557 ms Ceil(Log2(x)) using clog2(): 0 ms ...zoooom ;-) |
| ||
Oops! I fixed it. |
| ||
On bmx-ng, OSX, 32-bit : Ceil(Log2(x)) using doubles: 685 ms Ceil(Log2(x)) using clog2(): 0 ms That's interesting. I assume it's optimization. The compiler noticed the returned value was always the same and omitted all the If tests. |
| ||
The compiler noticed the returned value was always the same and omitted all the If tests. No, those were the results using the fixed/working code. But yes, it's probably clang C optimisation - I'm finding some sections of my generated C code perform orders of magnitude faster than Mark's hand-crafted assembler, which balances with the GC being slightly slower. |
| ||
I don't see how optimization could make twenty millions tests run in 0 ms ( i.e. less than 1 ms ). For example, I get these times: original version Ceil(Log2(x)) using doubles: 1325 ms Ceil(Log2(x)) using clog2(): 130 ms artificially simplified version Ceil(Log2(x)) using doubles: 1328 ms Ceil(Log2(x)) using clog2(): 50 ms My artificially simplified version replaces clog2() with Function clog2%(x%) Return x & 1 End Function That's about as simple as it can get while still actually doing something. The call to clog2 was also simplified to no calculations For i%=0 Until cycles y=clog2( cycles ) Next The original times show your machine about twice as fast as mine. But then your clog2 test must run at least 25 times as fast as my simplified do almost nothing test. |
| ||
Ah yes, I see what it's done now :-) The compiler has optimised out the whole call to clog2() because the result is never used. Which all seems rather clever. If I add +y to the end of the line, the times are more realistic. |
| ||
With bruceys NG on linuxLinking:untitled1 Executing:untitled1 Ceil(Log2(x)) using doubles: 1548 ms Ceil(Log2(x)) using clog2(): 0 ms BUT ... I added some lines after the last print in the demo: Print ra[10] Print ra[11] Print clog2( ra[11] ) it segfaults at the 3rd line (seems to be null...). It also segfaults with NG when doing Print (0<>log2_array[0]) So this "null" might be the reason of the 0ms loop time. Edit: optimizing it out ... might be a more valid reason. bye Ron |
| ||
seems to be null... It probably depends when the Global is initialised with data... |
| ||
Yeah, I see... did just test my code with NG ... and it behaved differently to vanilla in that case (opening issue now). Sorry for confusion. bye Ron |
Code Archives Forum