Best way to calc next ^2 value...
Blitz3D Forums/Blitz3D Programming/Best way to calc next ^2 value...
| ||
Blimey, I should really know how to do this - I seem to be suffering from brainfade today. :/ Anyway, given a random integer value, how would you calculate the next highest ^2 value, but only if the value isn't already a ^2 number? I did come up with the following - which I cobbled together from stuff in the docs. It seems to work, but I'm not too sure how (maths not being my favourite subject) :P Function pow2(x) Return 1 Shl (Ceil(Log(x) / Log(2))) End Function So, is there a simpler way? |
| ||
That formula is correct. I haven't been able to think of anything simpler. I prefer verbose function names and called mine round_up_to_a_power_of_two(). :) "log(x)/log(2)" tells you to what power you'd need to raise 2 to get x "ceil()" rounds it up to a whole number "1 shl y" is a sneaky bitwise way to raise 2 to the power of y |
| ||
lol, thanks octothorpe. Can't believe I got the right formula by stabbing in the dark. Well, there was logic to how I came up with it - I just can't remember it. :) And yes, that function name is crap - I actually use verbose names myself, usually. I just quickly typed that function in from memory because the actual code is on another machine. I'll probably call the function clamp_to_pow2(), or something - I'm not quite as verbose as you. :P Thanks again! |
| ||
If you like puzzles try to figure out why this works.Graphics 300, 460, 0, 2 For n = 1 To 35 n2 = NextPowerOfTwo( n ) Print RSet( n, 5 ) + " " + n2 Next WaitKey Function NextPowerOfTwo( n ) n = n - 1 n = n Or ( n Shr 1 ) n = n Or ( n Shr 2 ) n = n Or ( n Shr 4 ) n = n Or ( n Shr 8 ) n = n Or ( n Shr 16 ) Return n + 1 End Function |
| ||
Simpler? No. Cuter? Yes! And probably much faster too. :) I won't spoil it for everyone else. I wish this forum supported [spoiler] tags! |
| ||
Floyd, did you come up with that yourself? My head hurts. :P |
| ||
Floyd - Your version doesn't work as advertised here. I reckon you need '|' (bitwise OR) instead of 'Or' (conditional OR). Neat trick though. I would have done it as in NextPower2(). We'd both have problems with the maximum integer value. And signed numbers are a bit iffy as well. And NextPowerV3() is another variation. Pretty ugly but it just might be faster (haven't benchmarked it). There are many ways to arrange the search tree. The optimal depends upon the expected range of values. If you have a small possible range of values a lookup table would be another way and probably the fastest so long as the number of calls is large enough to make up for the table initialisation. Note that I've changed the numbers tested from 1-35 to a randon selection of 50 in the range 1-$3FFFFFFF. Strict Print "Starting" Local iTest:Int For iTest:Int = 1 To 50 Local n:Int = Rand(1, $3FFFFFFF) Local v1:Int = NextPowerV1(n) Local v2:Int = NextPowerV2(n) Local v3:Int = NextPowerV3(n) Local msg:String If (v1 = v2) And (v1 = v3) msg = "" Else msg = "Error" EndIf Print RSet(iTest, 4) + RSet(n, 12) + RSet (v1, 12) + RSet(v2, 12) + RSet(v3, 12) + " " + msg Next Print "Finished" Function NextPowerV1:Int(n:Int) n = n - 1 n = n | ( n Shr 1 ) n = n | ( n Shr 2 ) n = n | ( n Shr 4 ) n = n | ( n Shr 8 ) n = n | ( n Shr 16 ) Return n + 1 End Function Function NextPowerV2:Int(n:Int) If n > $40000000 Throw "Value too big" EndIf Local v:Int = 1 While (v < n) v = v Shl 1 Wend Return v End Function Function NextPowerV3:Int(n:Int) If n < 1 Throw "Value too low" EndIf If n <= $10 If n <= $8 If n <= $4 If n <= $2 If n <= $1 Return $1 Else Return $2 EndIf Else Return $4 EndIf Else Return $8 EndIf Else Return $10 EndIf Else If n <= $100 If n <= $80 If n <= $40 If n <= $20 Return $20 Else Return $40 EndIf Else Return $80 EndIf Else Return $100 EndIf Else If n <= $1000 If n <= $800 If n <= $400 If n <= $200 Return $200 Else Return $400 EndIf Else Return $800 EndIf Else Return $1000 EndIf Else If n <= $10000 If n <= $8000 If n <= $4000 If n <= $2000 Return $2000 Else Return $4000 EndIf Else Return $8000 EndIf Else Return $10000 EndIf Else If n <= $100000 If n <= $80000 If n <= $40000 If n <= $20000 Return $20000 Else Return $40000 EndIf Else Return $80000 EndIf Else Return $100000 EndIf Else If n <= $1000000 If n <= $800000 If n <= $400000 If n <= $200000 Return $200000 Else Return $400000 EndIf Else Return $800000 EndIf Else Return $1000000 EndIf Else If n <= $10000000 If n <= $8000000 If n <= $4000000 If n <= $2000000 Return $2000000 Else Return $4000000 EndIf Else Return $8000000 EndIf Else Return $10000000 EndIf Else If n <= $40000000 If n <= $20000000 Return $20000000 Else Return $40000000 EndIf Else Throw "Value too big" EndIf EndIf End Function |
| ||
Your version doesn't work as advertised here. I reckon you need '|' (bitwise OR) instead of 'Or' (conditional OR). Last I checked, "here" is a Blitz3D forum: And, Or, and Xor are bitwise, and Pipe is not an operator. Seeing that there's a Strict pragma in (what I assume is) BlitzMax is making me want to switch though! I really like NextPowerV2(). It's both simple and cute. :) If you're concerned about speed, I would suspect a binary search in NextPowerV3() would prove faster - even if you want to weight it for a particular range, it'd still be faster to cut your potential answers in half each guess instead of guessing arithmetically. Personally, I would find that function much more readable if the hex notation of literals (e.g. $2000000) was replaced with "1 shl 27". I would assume either compiler would optimize away simple, constant mathematical expressions before runtime. |
| ||
Yes - I confused dialects so retract my comment about the bitwise OR. Sorry about that. Worth noting for anybody switching between the versions as the Blitz3D code ran in BlitzMax but didn't produce correct results. NextPowerV3() is doing a binary search within each hexadecimal power. So it favours the smaller numbers slightly. As I said there are many ways of ordering the tree depending on what you expect to be the most frequent range of n. I don't know whether "1 shl 27" would be replaced by a constant. Good if it was. In practice I would tend to generate code like this so human readability isn't that important for the output. Really I was just providing a couple of alternative ideas inspired by Floyd's neat but obscure trick. |
| ||
I'm under the impression that any calculation that can be made constant, is made constant. But it's been a while so I could be wrong. |
| ||
Looks like mathematical operators (including Shl) are optimized away, but builtin functions such as Log() are not. Const iterations = 10000000 Return 7 ; 203 ms Return 1 Shl 13 ; 184 ms Return Log(30) ; 885 ms Return 7.6 ; 178 ms a = 7 : b = 8 : Return a * b ; 317 ms Return iterations / 678 ; 179 ms These benchmark results were obtained on a 1.6ghz p4 with Blitz3D 1.90, debug mode disabled. Const iterations = 10000000 s = MilliSecs() : For i = 1 To iterations test_int() Next : e = MilliSecs() - s Print "test_int: " + e + "ms" s = MilliSecs() : For i = 1 To iterations test_shl() Next : e = MilliSecs() - s Print "test_shl: " + e + "ms" s = MilliSecs() : For i = 1 To iterations test_log() Next : e = MilliSecs() - s Print "test_log: " + e + "ms" s = MilliSecs() : For i = 1 To iterations test_float() Next : e = MilliSecs() - s Print "test_float: " + e + "ms" s = MilliSecs() : For i = 1 To iterations test_var() Next : e = MilliSecs() - s Print "test_var: " + e + "ms" s = MilliSecs() : For i = 1 To iterations test_constmath() Next : e = MilliSecs() - s Print "test_constmath: " + e + "ms" WaitKey() : End Function test_int#() Return 7 End Function Function test_shl#() Return 1 Shl 13 End Function Function test_log#() Return Log(30) End Function Function test_float#() Return 7.6 End Function Function test_var#() a = 7 : b = 8 Return a * b End Function Function test_constmath#() Return iterations / 678 End Function |
| ||
Added test_empty() and test_ceil() for more information. Results (Athlon XP 1800+, ~1.533GHz): test_int: 131ms test_shl: 176ms test_log: 1106ms test_float: 150ms test_var: 163ms test_constmath: 160ms test_empty: 123ms test_ceil: 1478ms I agree with you that the internal math functions are not replaced by const values. However after subtracting the loop and call overhead shown by test_empty(), I believe that there is significant variation in the other functions compared to test_int. test_int: 8ms test_shl: 53ms test_log: 983ms test_float: 27ms test_var: 40ms test_constmath: 37ms test_empty: 0ms (this is the baseline) test_ceil: 1355ms From this I'd say that expressions like '1 Shl 13' are not reduced to a constant. I'll think about a more conclusive test. Const iterations = 10000000 s = MilliSecs() : For i = 1 To iterations test_int() Next : e = MilliSecs() - s Print "test_int: " + e + "ms" s = MilliSecs() : For i = 1 To iterations test_shl() Next : e = MilliSecs() - s Print "test_shl: " + e + "ms" s = MilliSecs() : For i = 1 To iterations test_log() Next : e = MilliSecs() - s Print "test_log: " + e + "ms" s = MilliSecs() : For i = 1 To iterations test_float() Next : e = MilliSecs() - s Print "test_float: " + e + "ms" s = MilliSecs() : For i = 1 To iterations test_var() Next : e = MilliSecs() - s Print "test_var: " + e + "ms" s = MilliSecs() : For i = 1 To iterations test_constmath() Next : e = MilliSecs() - s Print "test_constmath: " + e + "ms" s = MilliSecs() : For i = 1 To iterations test_empty() Next : e = MilliSecs() - s Print "test_empty: " + e + "ms" s = MilliSecs() : For i = 1 To iterations test_ceil() Next : e = MilliSecs() - s Print "test_ceil: " + e + "ms" WaitKey() : End Function test_int#() Return 7 End Function Function test_shl#() Return 1 Shl 13 End Function Function test_log#() Return Log(30) End Function Function test_float#() Return 7.6 End Function Function test_var#() a = 7 : b = 8 Return a * b End Function Function test_constmath#() Return iterations / 678 End Function Function test_empty#() End Function Function test_ceil#() Return Ceil(3) End Function |
| ||
NextPowerV3() is doing a binary search within each hexadecimal power. No, it's not. A binary search (see the Wikipedia page) divides and conquers half (hence the word binary) of the possible answers each test. Your function is narrowing things down sequentially into eight ranges (what you call hexadecimal powers), then testing sequentially within. |
| ||
However after subtracting the loop and call overhead shown by test_empty(), I believe that there is significant variation in the other functions compared to test_int. Keep in mind that there's going to be a certain percentage of error involved testing in a multitasking operating system. Since your machine is much faster than mine, you might want to increase your iterations to help compensate for this. I found that whatever test I was running first was registering as slower than the rest. Adding Delay(1000) to the start of the program solved this for me. This didn't seem to be a problem for your first test's result, but I recommend you try this to see if it effects your other results. When I run your code here, I see consistantly similar times for int, shl, float, constmath, and empty. The range was 1762ms to 1790ms. What version of Blitz are you using? You aren't testing this with BlitzMax, are you? |
| ||
I'll maintain that it is a binary search within each power of 16. Consider that the higher half of the possible numbers in the range will be found on the first test, half the remainder on the second test, half the remainder on the third test etc. Sounds like a binary search to me. You are right that the search tree could be more even ala: Function NextPowerV4:Int(n:Int) If n <= $8000 If n <= $80 If n <= $8 If n <= $2 If n <= $1 Return $1 Else Return $2 EndIf Else If n <= $4 Return $4 Else Return $8 EndIf EndIf Else If n <= $20 If n <= $10 Return $10 Else Return $20 EndIf Else If n <= $40 Return $40 Else Return $80 EndIf EndIf EndIf Else If n <= $800 If n <= $200 If n <= $100 Return $100 Else Return $200 EndIf Else If n <= $400 Return $400 Else Return $800 EndIf EndIf Else If n <= $2000 If n <= $1000 Return $1000 Else Return $2000 EndIf Else If n <= $4000 Return $4000 Else Return $8000 EndIf EndIf EndIf EndIf Else If n <= $8000000 If n <= $80000 If n <= $20000 If n <= $10000 Return $10000 Else Return $20000 EndIf Else If n <= $40000 Return $40000 Else Return $80000 EndIf EndIf Else If n <= $200000 If n <= $100000 Return $100000 Else Return $200000 EndIf Else If n <= $400000 Return $400000 Else Return $800000 EndIf EndIf EndIf Else If n <= $8000000 If n <= $2000000 If n <= $1000000 Return $1000000 Else Return $2000000 EndIf Else If n <= $4000000 Return $4000000 Else Return $8000000 EndIf EndIf Else If n <= $20000000 If n <= $10000000 Return $10000000 Else Return $20000000 EndIf Else If n <= $40000000 Return $40000000 Else Throw "Value too big" EndIf EndIf EndIf EndIf EndIf End Function This is the evenly balanced search tree providing an answer in 5 comparisons for all cases. Like I said there are many ways to arrange the search tree depending on how well you know the input numbers and how finicky you want to be. If the input numbers are uniformly distributed over 0 - 2^30 then the best tree would be: Function NextPowerV5:Int(n:Int) If n > $40000000 Throw "Value too big" ElseIf n > $20000000 Return $40000000 ElseIf n > $10000000 Return $20000000 ElseIf n > $8000000 Return $10000000 ElseIf n > $4000000 Return $8000000 ElseIf n > $2000000 Return $4000000 ElseIf n > $1000000 Return $2000000 ElseIf n > $800000 Return $1000000 ElseIf n > $400000 Return $800000 ElseIf n > $200000 Return $400000 ElseIf n > $100000 Return $200000 ElseIf n > $80000 Return $100000 ElseIf n > $40000 Return $80000 ElseIf n > $20000 Return $40000 ElseIf n > $10000 Return $20000 ElseIf n > $8000 Return $10000 ElseIf n > $4000 Return $8000 ElseIf n > $2000 Return $4000 ElseIf n > $1000 Return $2000 ElseIf n > $800 Return $1000 ElseIf n > $400 Return $800 ElseIf n > $200 Return $400 ElseIf n > $100 Return $200 ElseIf n > $80 Return $100 ElseIf n > $40 Return $80 ElseIf n > $20 Return $40 ElseIf n > $10 Return $20 ElseIf n > $8 Return $10 ElseIf n > $4 Return $8 ElseIf n > $2 Return $4 ElseIf n > $1 Return $2 ElseIf n > $0 Return $1 Else Throw "Value too low" EndIf End Function That gives a lower average number of comparisons but the worst case is of course much higher. On the timings - My testing is with Blitz2D since I don't have Blitz3D and I thought that they would be closer for basic calculations like this than BlitzMax. Given the difference between your latest figures and mine perhaps they are more different than I thought. Performance testing always seems to degrade into comparing the machines and language versions rather than finding out anything general and reliable about how to code for efficiency. In almost all cases I've found that Blitz performance is fantastic for the projects that I have. Now with fast Alpha, 2D image rotation and other drawing speed improvements in BlitzMax everything is really good. |
| ||
Yes, in the case of a perfectly random distribution of values from 0 to 2^32-1 your second example would be faster on average than a binary search. However, it is not a binary search: Wikipedia says: Binary search is a logarithmic algorithm and executes in O(log n) time. Specifically, 1 + log2N iterations are needed to return an answer. I believe the point you're missing is that a binary search divides the answer set, not the problem set. If we don't know the distribution, it's better to go for a better worst case scenario. Personally, all the values I'll be passing to this function (texture widths and heights) will be much lower than 67108864 - at which point a linear search becomes slower than a binary search. I can't think of any practical uses for this function which would use a number that large. |
| ||
I believe the point you're missing is that a binary search divides the answer set, not the problem set. Who says ? Sounds like a statement that you just made up. I disprove your statement with an example. To perform a binary search of say the telephone directory, I open it to the middle and see if the name I am searching for is <= or > that position. Then I search that half etc. This is an example of a binary search. It is dividing the problem set, not the answer set. Here 'n' is the number of entries in the directory and the search is O(log n). Compare this to the rather ridiculous prospect of searching the telephone directory entry by entry from the beginning to the point where you find a match. That would be the 'linear' search approach and performance would be O(n). Now change the telephone directory to the set of possible input numbers to our function. In the correct interpretation, 'n' is the number of possible numbers - lets say it is 2^32 although this isn't really correct as the valid range is really 2^30 once you take out negative numbers and those > 2^30 for which the answer 2^31 wraps around in a signed integer. All of my examples are O(log n) or better. NextPowerV4() appears to do considerably better. As does Floyd's NextPowerOfTwo(). It is possible to do better than O(log n) because of course this isn't a general search but something open to calculation. Both of these routines achieve something like O(log(log n)) for best, worst and average cases which is quite impressive. I stand by my statement about NextPowerV5() that "If the input numbers are uniformly distributed over 0 - 2^30 then the best tree would be:". To be perfectly correct I should have said "If the input numbers are uniformly distributed over 0 - 2^30 then the best tree for the best and average case would be:". In fact for the average case, NextPowerV5() performance is constant time - O(1) and that beats both O(log n) and O(log(log n)). If you don't get that I'd be happy to prove it. You might argue that 'n' should refer to the number of bits of the function argument. By doing so you've already applied the log function. If that is your interpretation then I'd encourage you to use 'n' in the conventional way. For a function F with input p we write F(p) and discuss its performance in terms of being O(1), O(n), O(log n), O(n^2) etc with 'n' being the number of possible values of 'p'. For simplicity you can do away with p and just talk about F(n) performance being O(n) etc. As an aside - It would be possible but of course ridiculous to try to implement the function as a linear search with O(n) performance. If we don't know the distribution, it's better to go for a better worst case scenario. Another bold statement. It depends on what you want to optimise, the average case or the worst case. What you are stating is that if you don't know the input distribution you should optimise for the worst case. Well maybe not. Maybe you should optimise the average performance. It depends upon the application. Going back to the original question by big10p we do have an idea about the input distribution - "... given a random integer value ...". |
| ||
In the correct interpretation, 'n' is the number of possible [inputs] Ahh. That makes sense. For some reason I always thought of n as the number of possible outputs. Thanks for setting me straight. I guess V5 is a binary search! :) Another bold statement. That was an aside: as I already admitted, if we know that the distribution will be perfectly random, we'd be better served by V5 than V4. If we don't know the distribution of inputs or the performance expectations, I maintain that it's best to offer reasonable and similar performance for all inputs. This cuts down on difficult to find performance degradation in production code. Optimization can come later (after profiling), if it's necessary at all. In fact for the average case, NextPowerV5() performance is constant time - O(1) Now you're just being silly. |
| ||
Finally - Our thoughts are as one. I even agree with your words on optimising later if at all and certainly not without profiling. Now you're just being silly. No. It really is O(1). Are you a betting person ? Warning - check what you understand O(1) means before accepting a bet. |