Odd results in performance test...
BlitzMax Forums/BlitzMax Programming/Odd results in performance test...
| ||
While testing bitwise operations (not in THIS test), I happened upon an odd result. Here, I am simply assigning two values to two other values over and over and timing the results. The second batch, on my system, is taking almost twice as long as the first batch, even though it's doing the same thing. Does anyone know why this happens (or are your results different)?Framework brl.standardio Type ta Field fb:Byte Field fs:Short End Type Local a:ta = New ta a.fb = 48 a.fs = 25001 Local b1:Byte, s1:Short Local b2:Byte, s2:Short Local ms1, ms2, ms3 Local iters = 1000000000 'GCSuspend() ms1 = MilliSecs() For i = 1 To iters b1 = a.fb s1 = a.fs Next ms2 = MilliSecs() For i = 1 To iters b2 = a.fb s2 = a.fs Next ms3 = MilliSecs() 'GCResume() Print ms2 - ms1 Print ms3 - ms2 |
| ||
Interesting results here too, on OS X 32-bit, release build, averages over 10 runs each, and code tweaked to SuperStrict (NG requires at least Strict level). Legacy BlitzMax : 693.8 677.6 BMX NG : 333.5 341.4 Of course, it's mostly a pretty meaningless test ;-p |
| ||
Well, those are the results I would expect. And if I got them, I'd think it was meaningless, too. But my results (on my old machine): 1217 2199 That seems strange, to me. Could something during initialization be messing with my results? |
| ||
Added some code to run the test over and over and averaged the results - and they are still strange (1204.3 and 2266.7). So, I've ruled out the program's initialization. I agree that this test is, by itself, not meaningful, but it's the testing procedure that I am questioning. |
| ||
No idea why your results appear so skewed. Do you get different results with this? |
| ||
No significant difference for me (referring to original test), i7 CPU and 64-bit Windows 7. I'll hazard a guess that the issue is code alignment, which for some reason is affecting the old machine. Try putting dummy code in various places, to move other code in memory, something like the following. Note I added a Print at the end to use the nothing value. That's in case BlitzMax takes the liberty of pruning code which can never affect results. Framework brl.standardio Type ta Field fb:Byte Field fs:Short End Type Local a:ta = New ta a.fb = 48 a.fs = 25001 Local b1:Byte, s1:Short Local b2:Byte, s2:Short Local ms1, ms2, ms3 Local iters = 1000000000 'GCSuspend() nothing:Int = 0 ' This will move following code slightly. ms1 = MilliSecs() For i = 1 To iters b1 = a.fb s1 = a.fs Next ms2 = MilliSecs() For i = 1 To iters b2 = a.fb s2 = a.fs Next ms3 = MilliSecs() 'GCResume() Print ms2 - ms1 Print ms3 - ms2 Print "The nothing value, just so it gets used, = " + nothing |
| ||
Hey, thanks for your replies. I tried both versions and the results are the same. I am ... confuzzled. :/ Brucey, on a semi-related question, is there a way to get access to the GC's reference count? |
| ||
why not loop the results totalling as you go and see then:superstrict Type ta Field fb:Byte Field fs:Short End Type Local a:ta = New ta a.fb = 48 a.fs = 25001 Local b1:Byte, s1:Short Local b2:Byte, s2:Short Local ms1:Int Local ms2:Int Local ms3:int Local iters:Int = 1000000000 Local i:Int Local c1:Int Local c2:int 'GCSuspend() Local k:Int For k = 1 To 50 ms1 = MilliSecs() For i = 1 To iters b1 = a.fb s1 = a.fs Next ms2 = MilliSecs() For i = 1 To iters b2 = a.fb s2 = a.fs Next ms3 = MilliSecs() 'GCResume() c1 :+ (ms2 - ms1) c2 :+ (ms3 - ms2) Print k+" a="+(ms2 - ms1)+" b="+(ms3 - ms2) Next Print Print "total a="+c1+" b="+c2 Print "average a="+(c1/50)+" b="+(c2/50) This gives you a muck better idea about speed |
| ||
on a semi-related question, is there a way to get access to the GC's reference count? On legacy Blitzmax, it should be at (Object address - 4 bytes) |
| ||
On my linux box the release build took 1038 1036 or in later runs 1036 1036 So I assume it is "like it should". Debug builds for Vanilla and NG took too long, I "stopped execution" before results came in (>30sec). With NG I get 693 690 which is more than the shapeoff Brucey got (maybe his "vanilla" result isnt that efficient on Macs ?). BTW: Superstrict @AdamStrange Your code in Vanilla: 1 a=1038 b=1058 2 a=1035 b=1379 3 a=1035 b=1061 4 a=1035 b=1127 5 a=1036 b=1113 6 a=1035 b=1176 7 a=1035 b=1092 8 a=1035 b=1059 9 a=1035 b=1053 10 a=1036 b=1099 Your code in NG (odd results): 1 a=1064 b=1295 2 a=741 b=854 3 a=737 b=917 4 a=733 b=1052 5 a=1102 b=1199 6 a=1008 b=820 7 a=1202 b=691 8 a=1045 b=696 9 a=1052 b=693 10 a=1175 b=708 @Brucey, why these hickups - is this some kind of GC kicking in differently to vanilla? bye Ron |
| ||
On my laptop (HP ProBook 6550b) running Windows 7, 64bit: Normal: a=1279, b=1271 Debug: a=18241, b=18138 MT: a=1275, b=1275 MT+debug: a=31518, b=32583 Same laptop running XUbuntu 12.04, 64bit: Normal: a=1264, b=1259 Debug: a=18695, b=19152 MT: a=1341, b=1332 MT+Debug: a=29724, b=29454 I was a little surprised by MT+Debug being so much slower! Si... |
| ||
On legacy Blitzmax, it should be at (Object address - 4 bytes) Tried it, but noticed a thread where you recommended against doing just that. Also, I found that strings (and ONLY strings) reported 1 less than I would expect. I had a reference to a string and the result was 0.Thanks, everyone, for testing! |
| ||
Maybe something to do with caching? |
| ||
Don't think so. If that was the case, I'd expect the outcome to be the same if I swapped the order. |
| ||
Bringing this up again because something occurred to me and I had to test it. Perhaps the results of this test would be different based on my IDE and sure enough! /shaking head Blide result: 1208 / 2009 MaxIDE result: 1208 / 1210 Whaaat!? |
| ||
Having fun? :-) On my new Skylake i5 machine (OS X), I get : Legacy, release, 32-bit: 588 598 Current NG, release, 64-bit / 32-bit (they come out about the same each run) : 287 287 I expect the compiler (eg. GCC) to automatically sort out the field alignments for the generated C struct in NG. No idea if this is also the case with the legacy compiler. SizeOf(a) reports 4 for both legacy and 32-bit NG. 8 bytes for 64-bit, as you'd expect. <edit> Based on the offsets, it looks like legacy bcc is also aligning the short : movb $48,8(%ebx) movw $25001,10(%ebx) |
| ||
Oh, I wasn't testing compilers. I was testing IDEs this time. It seems that Blide is doing something a bit differently. |
| ||
IDEs? Your code is compiled into machine code. The only interaction your IDE has with your running program (in release mode) is sucking in the piped output from it. And as per AdamStrange's example, your figures will be more normalised if you run them several times in a loop. |
| ||
Did that ages ago, Brucey, and the results were the same. Yes, I know that code is compiled into machine code, but that doesn't explain why I'd be getting different results depending on the IDE...that's the mystery. EDIT: I know you think it's just user error, but if you have Blide - try it... I replaced the last Print with Notify (adding Import brl.system at the top, of course) and created .exes using both IDEs. The exe that came from Blide gives different results than the one that came from MaxIDE. It's as if Blide was doing something extra... YET the exe sizes are the same. It's just weird. At first, I thought maybe it was interference from my virus checker or something, but now I don't know. |
| ||
I wonder if Blide is sending the same flags to the bcc compiler as MaxIDE does. It might be why the difference. |
| ||
if Blide is sending the same flags to the bcc compiler as MaxIDE Good call...that could be it! |
| ||
You could always diff the generated .s files (in the .bmx folder) created by each build and see if there's anything different there. With the same version of bcc you'd expect to see the same generated code. |
| ||
Great suggestion, and...they are slightly different. The Blide .s output has two extra lines... extrn _bbEnd and call _bbEnd Interesting... :-) |
| ||
And otherwise they are identical? Are you compiling the *same* source file with the *same* BlitzMax instance? |
| ||
Yes, and yes. Your question leads me to believe you don't think _bbEnd should do anything..? |
| ||
Remember that the options aren't just for the benefit of bcc. Some options will be passed off to the assembler and linker as well. You may have identical .s files, but completely different .a files if you are using different optimization settings in FASM. |
| ||
There are no settings passed to FASM on Windows. |