BlitzMax vs C/C++ speed comparison

Mahan(Posted 2011) [#1]
IMPORTANT NOTICE:
This thread is not intended to start a flame war (read below)!

I made a statement yesterday in another thread that BlitzMax was running at about half the speed of pure C. (http://blitzmax.com/Community/posts.php?topic=93583). NB: This was at the time an unsupported claim.

Sadly this seemed to offend some people. I however think that we as mature programmers and/or computer scientists should be able to discuss this matter in a normal manner, without name calling but purely based on assumptions and facts/observations.

So the purpose of this thread is to compare C/C++ with BlitzMax runtime speed.

I will post a little test I've done and I'd be happy if you can help me optimize either version of the programs posted, or if you'd like to submit more/other tests in C/C++ source side-by-side with BMX source.

I'm also very open if you want to point out flaws in my test. If you do you're more than welcome to submit an updated/bugfixed version.

The compilers:

1. BlitzMax 1.41
2. Visual C++ 2010 (from Visual Studio Express 2010 which is free for anybody, so you can download and test this on your own if you like)


The sources:

1. BMX:



2. C/C++:




Details:

The programs deliberately operate on a 12 MB heap chunk so that the data does not fit in the cache of most CPUs.
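The two listings are not reproduced above. As a rough guide only, the loop in Jesse's BlitzMax port (post #5) can be sketched in C++ as follows. The name `run_kernel`, its parameters, and the use of `std::vector<std::uint32_t>` are assumptions made for illustration (the thread's C++ used `malloc` and `int`); this is not Mahan's actual source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reconstruction of the benchmark loop, modelled on the
// BlitzMax port in post #5. Unsigned 32-bit values are an assumption:
// they let the +1 / -1 steps wrap instead of overflowing a signed int.
void run_kernel(std::vector<std::uint32_t>& data, std::size_t step) {
    assert(step > 0);
    const std::size_t n = data.size();
    for (std::size_t y = 0; y < n; y += step) {   // about n / step outer passes
        for (std::size_t z = 0; z < n; ++z) {     // full sweep of the buffer
            if (y != z) {
                data[z] += 1;
                data[y] &= data[z];
                data[z] -= 1;
                data[z] ^= data[y];
                data[z] = ~data[z];               // same effect as XOR with 0xFFFFFFFF
            }
        }
    }
}
```

With 3,000,000 elements (12 MB of 32-bit values) and a step of 1000, the outer loop runs 3000 times, so the inner body executes roughly 9 billion times.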


The results:

Both programs were run twice, with the posted result taken from the second run (to minimize factors like CPU SpeedStep etc.).

The BlitzMax program was built in release mode.
The C++ program was built in release mode with full optimizations enabled.
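The "run twice, keep the second result" idea could be sketched in C++ like this. Note the assumption: Mahan launched each program twice, whereas this hypothetical helper `time_last_run` repeats the work inside one process, which is similar in spirit but not identical.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper: runs `work` `runs` times and returns the
// wall-clock duration of the final run, in seconds.
double time_last_run(const std::function<void()>& work, int runs = 2) {
    assert(runs >= 1);
    double seconds = 0.0;
    for (int i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        // Overwritten each pass, so only the last run's time survives.
        seconds = std::chrono::duration<double>(stop - start).count();
    }
    return seconds;
}
```

`std::chrono::steady_clock` is used because it never jumps backwards, which makes it a reasonable choice for interval timing.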


BlitzMax output:

Iterations took 94.000000000000000 seconds

C++ output:

Iterations took 34.405 seconds
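The BlitzMax figure is printed as a whole number of seconds while the C++ one carries millisecond digits. One possible cause, and this is only a guess because the BMX listing is not shown, is that the millisecond difference was divided by 1000 as an integer before being converted to a floating-point value. The C++ sketch below shows that effect; the function names are made up for illustration.

```cpp
#include <cassert>

// Integer division drops the fractional seconds before the conversion.
double seconds_truncated(int elapsed_ms) {
    assert(elapsed_ms >= 0);
    return static_cast<double>(elapsed_ms / 1000);
}

// Converting to floating point first keeps the milliseconds.
double seconds_exact(int elapsed_ms) {
    assert(elapsed_ms >= 0);
    return static_cast<double>(elapsed_ms) / 1000.0;
}
```

If that guess is right, the true BlitzMax time could be anywhere from 94.0 up to just under 95.0 seconds.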


Comments:

This is pretty much the difference I expected.

BTW: I have not tailored the programs for either language to look good or bad. I just wrote what came into my mind. If this is biased in some way, you are very welcome (as mentioned before) to post other tests or even point out for me where you think I'm unfair.


And please refrain from talking about productivity and speed of graphics or libraries in the comments, unless it's fully related to the raw speed of the CPU code generated in either compiler.

EDIT: Test computer was my G73-JH. CPU: Core i7 740 (1.6-2.93GHz), 6 MB cache and 10 GB system RAM.

Last edited 2011


slenkar(Posted 2011) [#2]
Interesting discussion..

I'm curious about the performance of MinGW too, could you run a test for that?

It would be interesting to see a comparison of the frames per second of 2 simple games, but that is outside the scope of the discussion, I suppose.

Last edited 2011


Mahan(Posted 2011) [#3]
Sure here it is:

MinGW G++ compiler version: g++ (GCC) 3.4.5 (mingw-vista special r3)

Optimizations (command line): g++ -O3 speedTest.cpp

speedTest.cpp is the same C++ source as above.


Result:

G++ (Gnu C++ compiler in MinGW):

Iterations took 27.386 seconds


Same computer and result from second run.


slenkar(Posted 2011) [#4]
Faster than MS?? I didn't expect that, heh heh.


Jesse(Posted 2011) [#5]
For a fairer comparison, can you try this on your computer:
Const DIM_SIZE:Int = 3000000  ' Array dimension: 3 million ints, i.e. 12 MB at 4 bytes each (the idea is to get cache misses)
'int* dim = (int*)malloc(DIM_SIZE * sizeof(int));
Local bank:TBank = CreateBank(DIM_SIZE*4)
Local bankPtr:Int Ptr = Int Ptr(BankBuf(bank))
' init array with some random junk.
For Local i:Int = 0 Until DIM_SIZE
	bankPtr[i] = Rand(0, 2147483647)
Next


Local t1:Double = MilliSecs()

For Local y:Int = 0 Until DIM_SIZE Step 1000

	For Local z:Int = 0 To (DIM_SIZE - 1)
		If (y <> z)
			bankPtr[z]:+ 1
			bankPtr[y]:& bankPtr[z]
			bankPtr[z]:- 1
			bankPtr[z]:~ bankPtr[y]
			bankPtr[z]:~ $ffffffff ' XOR with all ones flips every bit. Is there a binary not operator in BMX?
		End If
	Next

Next

Local t2:Double = MilliSecs()

Print "Iterations took " + Double((t2 - t1) / 1000) + " seconds"

Input()
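
The last line of the inner loop XORs the element with `$ffffffff`. XOR with an all-ones mask flips every bit, so it acts as a bitwise NOT. A small C++ check of that equivalence (the function names here are made up for illustration):

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// XOR with an all-ones mask flips every bit.
std::uint32_t not_via_xor(std::uint32_t x) { return x ^ 0xFFFFFFFFu; }

// The dedicated complement operator does the same thing.
std::uint32_t not_direct(std::uint32_t x) { return ~x; }

// Quick self-check over a few sample values.
void self_check() {
    for (std::uint32_t x : {0u, 1u, 7u, 0x80000000u, 0xFFFFFFFFu}) {
        assert(not_via_xor(x) == not_direct(x));
    }
}
```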


Last edited 2011


Mahan(Posted 2011) [#6]
I'm most impressed!

New BMX run with Jesse's update:

Iterations took 36.524999999999999 seconds

Results from second run as usual.

Also fixed 2 forgotten "dim"-references in the middle of the loop to bankPtr[]


xlsior(Posted 2011) [#7]
Something else I'm curious about: How does blitzmax perform when used with MinGW 4.5?

Do you have a .exe of the C code, so we can compare it on our own computers?


Jesse(Posted 2011) [#8]
Oops! Got careless. Original code updated!
Also, I believe that For loops are faster than While loops (per Mark in an old post that I can't find).


Mahan(Posted 2011) [#9]
New readout for your latest version (run #2):

36.348999999999997 seconds

The outer While/For loop only runs ~3000 times, so that's probably why it's not doing much.


Jesse(Posted 2011) [#10]

"The outer While/For loop only runs ~3000 times, so that's probably why it's not doing much."



yea, I figured that much. I just thought I would mention it.


Mahan(Posted 2011) [#11]

"Do you have a .exe of the C code, so we can compare it on our own computers?"



GCC Version:

http://download.ecma.webfactional.com/a.exe

VC++ 2010 version

http://download.ecma.webfactional.com/speedTest1.exe


slenkar(Posted 2011) [#12]
Try using normal code, like integer comparisons and divisions instead of all that bitwise stuff, and New instead of malloc.

Last edited 2011


Jesse(Posted 2011) [#13]
Logic operators as well as arithmetic operators are about as well optimized in BMAX as in C. What slows BMAX down is the function calls and all of the indirect calls to objects' methods and functions, since calculations constantly have to be done to obtain the object's address. That would be the same with C++, and it all depends on the programmer's implementation. Some BMAX code is slow, such as the DrawImage function, but that is because it is not optimized to its full potential. I agree with Skidracer that BMAX is about 90% as fast as C. It's just how BMAX code is implemented that makes the difference.

Last edited 2011

The reason the array code above is slow is that arrays are objects, so calculations have to be made to obtain the location of the elements in the array. That is the only thing that slows it down.
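
As an illustration of that explanation only (the thread does not show what code BlitzMax actually generates), the difference between going through an array object and using a raw element pointer looks roughly like this in C++. `ArrayObject` is a hypothetical stand-in, not BlitzMax's real array layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a runtime array object: a small header
// plus a pointer to the element storage.
struct ArrayObject {
    std::size_t length;
    std::uint32_t* elements;
};

// Access through the object: every element access conceptually starts
// by reading `elements` from the object (an optimizer may hoist it).
std::uint32_t sum_via_object(const ArrayObject& arr) {
    assert(arr.elements != nullptr || arr.length == 0);
    std::uint32_t total = 0;
    for (std::size_t i = 0; i < arr.length; ++i) total += arr.elements[i];
    return total;
}

// Access through a raw pointer obtained once, as with BankBuf in post #5.
std::uint32_t sum_via_pointer(const std::uint32_t* p, std::size_t n) {
    assert(p != nullptr || n == 0);
    std::uint32_t total = 0;
    for (std::size_t i = 0; i < n; ++i) total += p[i];
    return total;
}
```

Both functions compute the same sum. An optimizing C++ compiler will usually make them equally fast, so any gap depends on how much of that hoisting the compiler in question performs.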

Last edited 2011


Mahan(Posted 2011) [#14]
I'm happy I wrote this line below in my edited post yesterday, so now I'm hopefully just an ignorant person instead of a complete jerk :)

From other thread: "I am however always ready to change these views if I'm presented with scientific data that proves otherwise."

It seems that it might very well be the way I wrote BMX code that mattered more for the speed of my applications than BMX's capabilities.


BlitzSupport(Posted 2011) [#15]
I think it's also worth pointing out that this is only testing one very tiny aspect of both languages, and in a very artificial context at that.


Mahan(Posted 2011) [#16]
Yes it might be a tiny aspect, but I'm still surprised in a positive way :)

I wouldn't have imagined an almost threefold speed increase on that small piece of code.
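
For reference, the ratios behind the timings posted so far can be worked out with a one-line helper. The function name `ratio` is made up; the numbers are the ones posted in this thread for Mahan's machine.

```cpp
#include <cassert>

// Ratio of two wall-clock times; a result above 1 means the first
// program took longer than the second.
double ratio(double slower_seconds, double faster_seconds) {
    assert(faster_seconds > 0.0);
    return slower_seconds / faster_seconds;
}
```

With the posted figures, `ratio(94.0, 36.525)` is about 2.57 (original BMX versus Jesse's version), `ratio(36.525, 34.405)` is about 1.06 (Jesse's BMX versus VC++ 2010), and `ratio(36.525, 27.386)` is about 1.33 (Jesse's BMX versus MinGW g++).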


ima747(Posted 2011) [#17]
I'm very impressed with the results. BMax *should* (in theory) be slower for any number of reasons (simple rule of thumb is the simpler the syntax the harder the stuff between the programmer and the compiled executable has to work to get things speedy...). And bear in mind as well that even if it was/is half the speed of C/C++, on even a remotely modern system that should still be fast enough for almost all real-world uses (outside of niche fields like engineering etc. obviously, just talking in general).

Personally I also factor development time in as a function of execution time. If it takes me a year to get my program to run 5% faster that's not a trade-off worth making (as an example). BMax, for many things, should be a much faster language to develop for, so if time to market is a factor you are always willing to make sacrifices elsewhere (like execution speed...).

To see it do that well only pushes it that much higher on my list of preferred languages.


Czar Flavius(Posted 2011) [#18]
A test which creates and deletes (collects) objects would be more interesting, as this is a hard task and more important to a game's performance than a single loop. From what I have read on the forums here and there, BlitzMax manages its dynamic memory better than the native C++ implementation, which if true would be a nice advantage.
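
A test of the kind suggested here was not posted in the thread. Purely as a sketch of what the C++ side might look like, the hypothetical `churn` function below repeatedly frees and allocates small objects. `Particle` and all parameter names are invented for illustration; a BlitzMax counterpart would create objects with New and leave reclamation to its memory manager.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical small game object.
struct Particle {
    float x, y;
    int id;
};

// Performs `rounds` allocations while keeping at most `live` objects
// alive; each new object replaces (and thereby frees) an older one.
// Returns a checksum so the work cannot be optimized away entirely.
long long churn(std::size_t rounds, std::size_t live) {
    assert(live > 0);
    std::vector<std::unique_ptr<Particle>> pool(live);
    long long checksum = 0;
    for (std::size_t i = 0; i < rounds; ++i) {
        std::unique_ptr<Particle>& slot = pool[i % live];
        if (slot) checksum += slot->id;   // read the old object before it is freed
        slot.reset(new Particle{0.0f, 0.0f, static_cast<int>(i)});
    }
    return checksum;
}
```

Timing a call such as `churn(10000000, 1000)` against an equivalent BlitzMax loop would give a first impression, with the usual caveat that allocator behaviour varies a lot between runtimes.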

The biggest challenge to BlitzMax performance is not the lack of speed, but the lack of optimizations. For example, C++ can inline certain function calls removing the overhead of calling them. It would be great if BlitzMax did this kind of optimization too, but alas it does not!
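
To make the inlining point concrete, here is a small C++ sketch. The helper `mix` and both loop functions are invented for illustration. An optimizing C++ compiler will typically turn the first loop into the second on its own, while a compiler without inlining pays the call overhead on every iteration unless the programmer writes the second form by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Small helper that is a typical inlining candidate.
inline std::uint32_t mix(std::uint32_t acc, std::uint32_t value) {
    return (value + 1u) & acc;
}

// Version that calls the helper once per element.
std::uint32_t fold_with_call(const std::uint32_t* p, std::size_t n, std::uint32_t seed) {
    assert(p != nullptr || n == 0);
    std::uint32_t acc = seed;
    for (std::size_t i = 0; i < n; ++i) acc = mix(acc, p[i]);
    return acc;
}

// The same loop with the helper's body pasted in by hand.
std::uint32_t fold_hand_inlined(const std::uint32_t* p, std::size_t n, std::uint32_t seed) {
    assert(p != nullptr || n == 0);
    std::uint32_t acc = seed;
    for (std::size_t i = 0; i < n; ++i) acc = (p[i] + 1u) & acc;
    return acc;
}
```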

Last edited 2011


Mahan(Posted 2011) [#19]
@ima747: Indeed. I'm going a bit OT now, but in the IT business world people are actually using languages like Python or Ruby just because of development speed, and those languages are a lot slower than both C++/C and BMX.

edit: And BMX is IMHO much nearer Python than it is to C++ in ease of use. This is my opinion again, and I have no data to support it :)

Last edited 2011


Perturbatio(Posted 2011) [#20]
This is about the best I can do, and it only shaved another few milliseconds off :(


Oh the shame...

Last edited 2011


Jesse(Posted 2011) [#21]
Except for your own rand function, nothing is different. The bank bytes are created with MemAlloc and are therefore the same thing at root. Since the rand function is not part of the speed test, it really doesn't make any difference at all. If you are getting different speeds, it's only due to the programs running in the background.


Perturbatio(Posted 2011) [#22]
You know, my brain didn't even try to comprehend the code, it just went looking down it thinking, that'll be slower.

The main difference then is SuperStrict, I guess.


kiami(Posted 2011) [#23]
I am not sure I am correct, but this is the way I look at the issue:
No matter what, a good C++ code is faster than a good non-C++ code. But I think performance is not relevant anymore for application programming. Companies that use C++ for applications (even games) use it for reasons other than performance; some are actually stuck with it. How about game programming? It has evolved. Some parts of it are not application programming anymore. Some higher levels of game production don't involve complex coding at all, and some can be done in any scripting language. I think we are in this situation now: hardware -> low-level API -> game engine -> game design GUI -> scripting. Where is C++? In the low-level API and the game engine. Where are C#, BlitzMax and the other non-C languages? In the game design GUI. Scripting? Anything: Python, JavaScript, etc.


AdamRedwoods(Posted 2011) [#24]
Using "global" is notoriously slow in Blitzmax.

My results,
run from the command line for both without touching the mouse,
on a stock Dell Win7 64bit, Intel Core2 Quad CPU Q8200 @ 2.33GHz
blitzmax:
H:\_work\software_dev\_blitzmax\fun\speedtests>z_speed_test.exe
Iterations took 46.000000000000000 seconds

mingw g++:
H:\_work\software_dev\_blitzmax\fun\speedtests>a.exe
Iterations took 33.737 seconds



I compiled "a.exe" using the above code from Mahan and mingw installed
g++ simple_speedtest.cpp -O2


I compiled Blitzmax using the following code (minor changes) with DEBUG turned off and GUI turned off and THREADING turned off and I ran it from the same command line as "a.exe" (not in the IDE):


Note, if I didn't optimize anything on the g++ side, then BlitzMax wins (I think by almost 80 secs).
Is this cheating since "bcc.exe" does not have its own optimization options?

Some parts of Blitzmax (some C libs) can have optimizations as shown here:
http://www.blitzmax.com/Community/posts.php?topic=84944


Kryzon(Posted 2011) [#25]
Relevant thread: http://blitzbasic.com/Community/posts.php?topic=83720


Mahan(Posted 2011) [#26]
@AdamRedwoods

"g++ simple_speedtest.cpp -O2"



Why would you not use the best optimization (-O3) in a speed test?

When you use BMX in release mode, that is certainly its "-O3", with all possible optimizations enabled.


AdamRedwoods(Posted 2011) [#27]

"Why would you not use the best optimization (-O3) in a speed test?"



I think the "unaltered" bmk_make.bmx source file code says:
If opt_release cc_opts:+" -O2 -DNDEBUG"
so I was trying to compare equally.

That's why I posted the link, since there were past discussions about using -O3, but you'd have to recompile the BRL mods as well.
This is just for mods, though, since the actual BMX code is compiled directly to FASM assembly, so any optimizations are done by the author(s) within BCC.exe.

"simple_speedtest.cpp" with -O3 and -ffast-math:
Iterations took 33.756 seconds
huh...so no significant increase on my machine.


zzz(Posted 2011) [#28]
Here's my attempt:


Always fun with an optimisation challenge, but I personally think topics like these are kind of pointless. It all ends up as the same language in the end :)


Brendane(Posted 2011) [#29]
Doesn't vc++ express only compile .net binaries?


ziggy(Posted 2011) [#30]
No. VC++ compiles native (non-CLR) applications. It is the Microsoft C++ compiler implementation, as always. It's the most efficient C++ compiler for Windows and highly recommended. Much better than the MinGW compiler, in my opinion.

If you want to create .NET applications, then you may use Microsoft C# Express or Visual Basic .NET.


Brendane(Posted 2011) [#31]
I haven't used vc since visual studio 2005... I did download and install the current express edition though and it definitely does compile for CLR...but I remember what confused me now..it just doesn't come with MFC...but of course the standard native sdk is present...ignore me :)

Microsoft really know how to keep bloating software to the point of irritation, I'll give them that.


Czar Flavius(Posted 2011) [#32]
After using the top-notch IntelliSense in Blide, Visual Studio 2008's pitiful IntelliSense really drives me insane.


Blitzplotter(Posted 2011) [#33]
"a good C++ code is faster than a good non-C++ code."


This thread has got me considering dusting off my C portion of my noodle, which is after all - non C++. Thing is, I'm a BMax'err as well - and having dipped my toe into the murky pond of C for work purposes a while ago am intrigued by this thread, time for play.

"Microsoft really know how to keep bloating software to the point of irritation, I'll give them that."
Yeah, the bloatware is kind of crap (I've got a couple of versions of VS bouncing about due to the different flavours of college I've entertained on my broken path to a BSc). I wonder if there are threads justifying the lack of reduction of bloatware on their behalf - if anyone has the capacity to reduce bloatware surely they have, sorry, I digress.

I struggle to 'play' as much as I want these days with code, life stuff is coming into play more & more. I look forward to compiling the examples above on my 'main' PC, peace ;)


Me.32(Posted 2011) [#34]
C++: Iterations took 18.9313 seconds (with pointers instead of array index, llvm 1.5 compiler)
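
The code behind this figure was not posted. One reading of "pointers instead of array index" is sketched below next to an indexed version of the same loop. Both function names are hypothetical, and the loop body is modelled on the BlitzMax port in post #5 rather than on Me.32's actual source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Indexed form: elements are addressed as data[y] and data[z].
void kernel_indexed(std::uint32_t* data, std::size_t n, std::size_t step) {
    assert(step > 0);
    for (std::size_t y = 0; y < n; y += step) {
        for (std::size_t z = 0; z < n; ++z) {
            if (y != z) {
                data[z] += 1;
                data[y] &= data[z];
                data[z] -= 1;
                data[z] ^= data[y];
                data[z] = ~data[z];
            }
        }
    }
}

// Pointer form: the inner loop walks a pointer from start to end.
void kernel_pointers(std::uint32_t* data, std::size_t n, std::size_t step) {
    assert(step > 0);
    std::uint32_t* const end = data + n;
    for (std::size_t y = 0; y < n; y += step) {
        std::uint32_t* const py = data + y;
        for (std::uint32_t* pz = data; pz != end; ++pz) {
            if (pz != py) {
                *pz += 1;
                *py &= *pz;
                *pz -= 1;
                *pz ^= *py;
                *pz = ~*pz;
            }
        }
    }
}
```

Both forms produce identical results. Whether the pointer form is actually faster depends on the compiler, since many optimizers generate the same machine code for both.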


Mahan(Posted 2011) [#35]
@blubberwasser:

Unless you own exactly the same computer as I do you should post the timing of the best BMX code also, otherwise 18.9313 is just an arbitrary number. :)