little Benchmark
| ||
Hi, maybe someone can help me out here. I did a little research testing the speed of different programming languages. It's only a very small math test, but its results surprised me a little. This is the BlitzMax version, coded quick and dirty:

    Global Anzahl, Iterationenl, EndTime, StartTime,i : Long
    Global Sekunden, Minuten, Stunden, DauerMsec,x,y : Double
    Anzahl = Long ( Input("Anzahl Clients: "))
    Iterationen = ((Anzahl-1)*Anzahl/2)
    EndTime = MilliSecs()
    For i = 0 To Iterationen
        y = Sin(i)
        x = Cos(y)
        x = x*y
    Next
    StartTime = MilliSecs()
    Print "Anzahl: " + String(Anzahl)
    Print "Iterationen: " + String(Iterationen)
    DauerMsec = StartTime - EndTime
    Print "Millisekunden: " + String.FromDouble(DauerMsec)
    Print "Sekunden: " + String.FromDouble(DauerMsec/1000)

When I enter 10000, it takes 16 seconds on my older PC. Now here are the comparisons:

Delphi: 8.6 seconds (first run 10 secs)
Java 1.6: 1 min 40 seconds!! HotSpot was on, I think
C#: 9 seconds
PureBasic: 8 seconds
BlitzMax: 16 seconds

And that's what I don't understand: my guess would have been that PB and BMax should be equal. On a side note, irrelevant for the results, I created a nice form for Delphi, Java and C#. And yes, the debugger was deactivated.

And please, this little benchmark is only a small test for me to check out the math speed and its implementation in different languages. It has nothing to do with the practical speed of a whole application, so no flames about better languages or the like. I only want to know why BMax needs 16 seconds where PB only needs 8. Where can I optimize? I used the same formula in every other language.

Best regards
B.
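
For a sense of scale, the loop bound above is the number of unordered pairs among the clients. A minimal C sketch of that arithmetic (the function names are made up for illustration and are not part of the original program):

```c
#include <assert.h>

/* Number of unordered pairs among n clients: n * (n - 1) / 2. */
long long pair_count(long long n)
{
    assert(n >= 0);              /* a negative client count makes no sense */
    return (n - 1) * n / 2;      /* same expression as in the BlitzMax source */
}

/* "For i = 0 To Iterationen" includes both end points,
   so the loop body executes pair_count(n) + 1 times. */
long long loop_executions(long long n)
{
    return pair_count(n) + 1;
}
```

With 10000 clients this gives 49,995,000, so the loop body runs 49,995,001 times. Both that value and the intermediate product 99,990,000 still fit in a signed 32-bit integer.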
| ||
Might not have anything to do with the speed, but you realise you're only making the last variable on each Global list Long and Double? Are you also using PureBasic's quad type, which would be equal to BlitzMax's Long?
| ||
Is there any reason for the X variable to be an integer? It works faster when it is declared as Double, like the Y variable; this way the test makes much more sense to me.

I've converted the code to Visual Basic .NET and it took 9 seconds with debugging on, and 93 milliseconds running the built program with no debugger present. So I suppose you tested C# from the Visual Studio IDE with a debug build. Try a release build run directly from Windows, and I suppose you'll get much better performance.

Sample code (the program has a form with a button and a textbox):

    Public Class Form1
        Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
            Dim Anzahl As Integer, Iterationen As Integer, EndTime As Long
            Dim StartTime As Long
            Dim i As Long
            Dim DauerMsec As DateTime
            Dim x As Double, y As Double
            Anzahl = 10000
            Iterationen = ((Anzahl - 1) * Anzahl / 2)
            EndTime = Now.Ticks
            For i = 0 To Iterationen
                y = Math.Sin(i)
                x = Math.Cos(y)
                x = x * y
            Next
            StartTime = Now.Ticks
            DauerMsec = New DateTime(StartTime - EndTime)
            SPrint("Anzahl: " & Anzahl)
            SPrint("Iterationen: " & Iterationen)
            SPrint("Time: " & DauerMsec.ToString & " " & DauerMsec.Millisecond)
        End Sub
        Sub SPrint(ByVal Str As String)
            Me.TextBox1.AppendText(Str & Chr(13) & Chr(10))
        End Sub
    End Class

(I thought .NET would be slower, but oh surprise!)
| ||
Setting SuperStrict and declaring the variables reduced my initial 10s to 7s. I also saved a couple of secs when 'i' was declared as Int.
| ||
8 secs here with the example code you gave, without any optimisation. If one codes dirty, one can't expect awesome results.
| ||
Why should declaring them matter? If it assumes that they are integers anyway, isn't it the same as far as the exe is concerned?
| ||
SuperStrict seems to activate some optimizations in the generated exe, but it forces you to explicitly declare every variable. That makes your code easier to debug and maintain. I can't see any reason not to use it.
| ||
Sorry guys, but you can't compare the times because you don't have the comparison code. If I ran my code on my new PC it would probably take only 2 secs or less. These are my testing specs: WinXP Pro SP2, Pentium 4 / 2.67 GHz and 1 GB RAM.

@ziggy: for C# or VB you have to declare and define AND actually use these variables, or else the loop gets optimized away, meaning it won't calculate anything at all. I got the same result as you the first time; then I added an edit field that showed me the value of x after the loop. C# was the only language optimizing in such a way.

Edit: Why are x,y integers? They are doubles, or at least should be -> Azathoth

@Azathoth: true, changed it, now it takes 7.54 secs with PB... and yes, hopping between so many languages made me trip over it: in BMax I have to declare each type separately ;) Thanks for pointing this out. It's only that way in BMax. Now it takes only 11.67 secs. Here is the new code:

    SuperStrict
    Global Anzahl : Long,Iterationen : Long,EndTime : Long,StartTime : Long,i : Long
    Global DauerMsec: Double,x : Double,y : Double
    Anzahl = Long ( Input("Anzahl Clients: "))
    Iterationen = ((Anzahl-1)*Anzahl/2)
    EndTime = MilliSecs()
    For i = 0 To Iterationen
        y = Sin(i)
        x = Cos(y)
        x = x*y
    Next
    StartTime = MilliSecs()
    Print "Anzahl: " + String(Anzahl)
    Print "Iterationen: " + String(Iterationen)
    DauerMsec = StartTime - EndTime
    Print "Millisekunden: " + String.FromDouble(DauerMsec)
    Print "Sekunden: " + String.FromDouble(DauerMsec/1000)

@tonyg: i must be as big as Iterationen
@Grisu: helpful

So BlitzMax is getting nearly as fast as C#. Now I would like to know why, just to understand what happens under the hood. Always eager to learn... (I still would have bet that PB and BMax would be equally fast in math.)
| ||
I probably don't have to ask this, but since you didn't explicitly say it... you did compile in release mode, correct?
| ||
The reason BM is slower is the outdated MinGW it currently builds on, which is 3+ years old (P2 optimizations only!). If you replace it with a newer MinGW, enable the correct compiler flags (P3 at least) and rebuild the modules, the calculation speed will rise by 50-150% depending on the operation.
| ||
"@tonyg: i must be as big as iterations"

I changed Iterationen as well. You're right, our times are irrelevant, which is why I focused on the time saved. Declaring variables is a *good* thing. It'd be interesting to see where each program spends its time, in case it's a specific command in BMax which takes the extra 1-2s. Having said that, I personally wouldn't worry about a 3s difference, or would weigh it against other reasons to use a language.
| ||
"...enable the correct compiler flags (P3 at least)"

How exactly do you do that? I have the latest MinGW running (works with everything other than MaxGUI) but have no idea how to enable different compiler options.
| ||
"How exactly do you do that?"

According to "man gcc" (haw)... -march=pentium3
| ||
I meant: how/where do you pass *any* flags to MinGW? (I've just used 'Build Modules' through the IDE so far.) Also, using the Pentium3 flag, does that optimize the execution speed for Pentium 3 and above, or does it render the resulting program completely unusable on a Pentium 1 or 2?
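
On the second question, GCC itself distinguishes two options: -march=pentium3 allows the compiler to emit Pentium III instructions such as SSE, so the result may not run on older CPUs, while -mtune=pentium3 only tunes scheduling and leaves the instruction set alone. The C sketch below shows one way to check what a given build was allowed to target. The file name bench.c and the function name are hypothetical, and it says nothing about how BlitzMax's module build passes flags to MinGW.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical build lines for a file called bench.c (GCC / MinGW):
 *   gcc -O2 -march=pentium3 -c bench.c   may use SSE; can fail on pre-Pentium III CPUs
 *   gcc -O2 -mtune=pentium3 -c bench.c   tunes scheduling only; instruction set unchanged
 */

/* Writes the SIMD level this file was compiled for into buf,
   based on GCC's predefined macros. */
void describe_target(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
#if defined(__SSE2__)
    snprintf(buf, len, "SSE2");
#elif defined(__SSE__)
    snprintf(buf, len, "SSE");
#else
    snprintf(buf, len, "no SSE");
#endif
}
```

Note that on 32-bit x86 GCC still uses the x87 unit for floating-point math by default, so enabling SSE alone does not necessarily change how sin and cos perform.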
| ||
@Dreamora - have you tried this? As I understand it, Max does its own compilation, spitting out assembly that is assembled by FASM. I guess Sin and Cos in this example may be C functions, compiled by MinGW? Anyone know?
| ||
All C or C++ code in the modules is compiled using MinGW. This applies not only to Sin, Cos, etc. but also to the GC and some other areas. Using a version of MinGW that is not officially supported may have side effects (MaxGUI does not compile with the current latest release), but it seems to improve program execution speed and it drastically reduces calculation rounding errors.
| ||
Great news, I read somewhere that Mark wanted to replace the old MinGW with the new one. Don't know how long it'll take. But besides that, it's still remarkable that C# is so fast. Btw, as I read, Delphi hasn't had a compiler update for a long time either.

Question: when C# does its JIT, does it compile for the specific CPU it's running on, or does it 'only' have a standard compiler for all CPUs?

On a side note, PB seems to have a little bug with quad types. In my example (not a bug, but a strange missing feature) I can't use quads for For loops, only longs (4 bytes).

Anybody using Java here? I have a strange feeling about the result, I mean 1 min 40??? Crazy. There's got to be a way to optimize it to something comparable to the other results!
| ||
Java wastes a LOT of time if you give it a lot to print out. It works much faster if you only print at the end.
| ||
That's what I did, debugger off too. Will have to investigate further.
| ||
Yeah, Mark said that the next BlitzMax update will work with the latest MinGW... that doesn't say anything about which additional optimizations, such as the 'pentium3' flag, may or may not be enabled by default. :-?
| ||
Change all your globals to locals and see it fly. When you use Global, you are explicitly forcing the variables to be stored in memory rather than optimized into registers.
| ||
11.58 secs - no fly - hard landing ;)
| ||
Is it faster with this code?

    SuperStrict
    Import "sinandcos.c"
    Extern
        Function Sin:Double(v:Double)
        Function Cos:Double(v:Double)
    End Extern
    Global Anzahl : Long,Iterationen : Long,EndTime : Long,StartTime : Long,i : Long
    Global DauerMsec: Double,x : Double,y : Double
    Anzahl = Long ( Input("Anzahl Clients: "))
    Iterationen = ((Anzahl-1)*Anzahl/2)
    EndTime = MilliSecs()
    For i = 0 To Iterationen
        y = Sin(i)
        x = Cos(y)
        x = x*y
    Next
    StartTime = MilliSecs()
    Print "Anzahl: " + String(Anzahl)
    Print "Iterationen: " + String(Iterationen)
    DauerMsec = StartTime - EndTime
    Print "Millisekunden: " + String.FromDouble(DauerMsec)
    Print "Sekunden: " + String.FromDouble(DauerMsec/1000)

sinandcos.c source:

    #include <math.h>
    double Sin(double v) { return sin(v); }
    double Cos(double v) { return cos(v); }

It seems to be much faster on my computer this way.
| ||
Great idea ziggy! I gave it a test and it took 10.069 seconds, so it actually is a little faster. Don't know what's different with Mark's code; shouldn't it use the math lib too? But since it's the same MinGW, that can't really be the big speed boost. I'm very eager to try this with the newest version of MinGW. Hmm, shouldn't it be possible to try this little source out with the new MinGW? Chances are it could even work...

Edit: I installed the newest MinGW but didn't do a complete rebuild. I did use ziggy's version, though, so the .c code should have been compiled with the new version. Alas, there wasn't any speed change. I'm afraid of rebuilding all modules; would that change anything?
| ||
Got a new optimization :) The variable 'Iterationen' is causing the huge time increase. It seems that in BMax, or better said MinGW, the values are not clamped. I introduced a new variable j:Byte and added j=i in the loop. Now I get a speed increase to 9 seconds. Having a Sin(xx millions) is not very realistic ;) Oh, and Sin() is slower than Cos().
| ||
Yes, it will give better rounding when dealing with decimal-precision numbers, and a little (and sometimes not so little) speed increase. My guess is that it is faster because the C function is not managed code, and the internal variables of the function don't have to deal with the GC. I'm not really sure about this, but it seems logical to me.
| ||
Just a little side note: I managed to drop the Java run time from 90 seconds to 9!!! seconds by clamping the values between -PI/4 and +PI/4. The same test in BlitzMax didn't result in any speed increase; it stays at 10.6 seconds. Funny, eh? Somebody from a Java forum tried other values too and came up with a funny result: to keep the speed high, you have to keep away from x^10 for sin and cos. Didn't try this myself, but very interesting.

    SuperStrict
    Local Anzahl : Long,Iterationen : Long,EndTime : Long,StartTime : Long,i : Long
    Local DauerMsec: Double,x : Double,y : Double
    Local jo1:Double = -1*Pi/4
    Local jo2:Double = Pi/4
    Anzahl = Long ( Input("Anzahl Clients: "))
    Iterationen = ((Anzahl-1)*Anzahl/2)
    Local jo3:Double = 2*jo2/Iterationen;
    Local j:Double=jo1;
    EndTime = MilliSecs()
    For i = 0 To Iterationen
        y = Sin(j)
        x = Cos(j)
        x = x*y
        j=j+jo3;
    Next
    StartTime = MilliSecs()
    Print "Anzahl: " + String(Anzahl)
    Print "Iterationen: " + String(Iterationen)
    DauerMsec = StartTime - EndTime
    Print "Millisekunden: " + String.FromDouble(DauerMsec)
    Print "Sekunden: " + String.FromDouble(DauerMsec/1000)
| ||
So BlitzMax is slower than Java in this test? That is shocking.
| ||
Not so shocking, only in this particular case. If you take my first code with huge values for i, Java will crawl again, but then no one will ever need such high values for sin and cos ;) It's all about optimizing, but I think nowadays nearly every language is fast enough; it's more a question of 'which language is the most useful for this project?'.

That's why I used Blitz3D for an architectural walkthrough but will probably use Processing for my next project, which will need good networking stuff, no fullscreen, easy 3D, but lots of math, easy connections between many clients and a server, and easy platform portability. I may change later, but for rapid prototyping Processing is fantastic. It's like a nice and easy framework with all the Java power underneath if you need it. And working in pure OOP is always beneficial. But would I do a full game with Processing or Java? No, then I would personally use BlitzMax, probably.
| ||
What's preventing you from using Java... JRenderMonkey and the like are good alternatives for doing 3D in Java if you are happy enough with it :)
| ||
Complexity is preventing me from using Java; that's why Processing is so great. And BlitzMax is great too, not for every case, but for indie games it's perfect. I like C# too, it's cool for GUI-intensive apps. Delphi is very nice too if you're aiming for an all-in-one exe without .NET. Blitz3D is cool too, for doing nice 3D quick and easy. And not to forget GameMaker ;) Yes, I really like it for some fun stuff. I like having choices, don't you? And as speed is not such a deciding factor anymore, one is free to jump around.
| ||
Why don't you disassemble the executable code from each test? It's the resulting machine code that matters.
| ||
Java has a virtual machine, so no exe, like C#, only bytecode. The speed difference isn't so big anymore; it's just that Java has very optimized native code for some sin/cos ranges (-PI/4 to PI/4) and has big problems with x^10 values. PB seems to have its own libs, thus being the fastest, and BlitzMax relies on the MinGW libs (I think). But speed is OK for all languages in most cases. Disassembling would be cool and informative for the exe versions though, but I'm not an assembler geek, so I'll leave this to Mark and co ;)
| ||
You realize, of course, that the problem with this kind of "benchmark" is that the iteration loop uses up a non-trivial amount of the overall time, and it's prone to compiler cheats. Try a Whetstone test, or if you want a more direct comparison, I believe Jim Brown has a collection of Sieve of Eratosthenes implementations written in various languages that offers a better comparison.
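
For readers who have not seen it, a Sieve of Eratosthenes benchmark kernel can look roughly like this in C. It is a generic sketch with an invented function name, not code taken from the collection mentioned above. Returning the prime count gives each language port a result that can be checked against the others.

```c
#include <assert.h>
#include <stdlib.h>

/* Counts the primes <= limit with a Sieve of Eratosthenes.
   Returns -1 if the working array cannot be allocated. */
int count_primes(int limit)
{
    assert(limit >= 0);
    if (limit < 2)
        return 0;

    unsigned char *composite = calloc((size_t)limit + 1, 1);   /* 0 = still a prime candidate */
    if (composite == NULL)
        return -1;

    int count = 0;
    for (int p = 2; p <= limit; p++) {
        if (composite[p])
            continue;                            /* already crossed out by a smaller prime */
        count++;
        /* Cross out multiples of p, starting at p * p; 64-bit index avoids overflow. */
        for (long long m = (long long)p * p; m <= limit; m += p)
            composite[m] = 1;
    }

    free(composite);
    return count;
}
```

Calling count_primes(1000000) and timing it exercises integer arithmetic and memory access rather than the math library, so it measures something different from the sin/cos loop.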
| ||
Well, I know that my little benchmark is more of a private and very unofficial thing. It was pure curiosity that made me translate it from Delphi to some other languages, and I wanted to know if I would be right with my guesses. I was wrong in many cases: shocked by the first implementation in Java, surprised by the speed of C#. BlitzMax made me wonder, being the 'slowest' one, but then it's still fast enough. I think that's my conclusion, too: in most cases all (tested) languages are fast enough. An implementation in C++ would be fun though, but I'm no friend of C++ ;) and I don't want to install another MS Express edition. My guess is the result would be between 7.5 and 11 seconds ;)