BMK NG - Turbo Boost

Brucey(Posted 2009) [#1]
I've been playing around with BMK some more... this time, looking at ways to speed up the build process.

Looking at the way BMK works, the obvious place to speed things up is to compile multiple C/C++ files at the same time.
Why not .bmx files? Well, they can have dependencies on each other, so you have to get the build order spot-on. But with C/C++, so long as you don't try to link/archive before they've finished compiling, it doesn't really matter what order they are compiled in.
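
To illustrate the idea of compiling independent files concurrently and only linking once they have all finished, here is a minimal Python sketch. It is not BMK's code: `compile_all` and `build` are hypothetical names, and the command lists are whatever compiler and archiver invocations you would normally run.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def compile_all(compile_cmds, max_jobs):
    """Run independent compile commands, at most max_jobs at a time.

    Returns the exit codes in the same order as compile_cmds.
    """
    def run(cmd):
        # Each command is an argument list, e.g. ["gcc", "-c", "a.c"].
        return subprocess.run(cmd).returncode

    with ThreadPoolExecutor(max_workers=max_jobs) as pool:
        # The order of execution does not matter; map() still returns
        # results in input order once every job has finished.
        return list(pool.map(run, compile_cmds))


def build(compile_cmds, link_cmd, max_jobs):
    """Compile everything first, then link/archive only if all succeeded."""
    codes = compile_all(compile_cmds, max_jobs)
    if any(code != 0 for code in codes):
        return False
    return subprocess.run(link_cmd).returncode == 0
```

The only ordering constraint is the barrier between the two phases: the link/archive command is not started until every compile job has returned.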

It turns out that C++ files get the biggest benefit (over C) because they each take quite a bit longer to compile.

Here are some basic benchmarks. My BMK is set to multi-build with cores+1 jobs (which on my Mac Mini = 3). That means that, at any one time, up to 3 C/C++ files would be compiling...
(Note that these tests were using "makemods -a", i.e. a full rebuild of the module)
BaH.box2d

Normal
33.114
29.333
29.100

Threaded
20.856
20.523
20.602

Pub.Libpng

Normal
13.481
15.551
12.325

Threaded
8.500
8.648
8.649

As you can see, compile times do improve, albeit only by a few seconds.

For some C-only libs, the difference is not so great :
BaH.libcurl

Normal
32.914

Threaded
30.689


However, for large C++ libraries, there can be quite a significant improvement :
BaH.CEGUI

Normal
6m8.967s

Threaded
3m34.058s


There do appear to be some issues with the multi-threaded GC though, as occasionally I get some errors :
bmk.mt(16044,0xb0123000) malloc: *** error for object 0x489: Non-aligned pointer being freed
*** set a breakpoint in malloc_error_break to debug

...whatever that means!
(Either that, or it's my threading code... which is highly possible)

Anyhoo... just another little experiment, trying to eke a bit more out of BMK...

:-p


Armitage 1982(Posted 2009) [#2]
Updating CEGUI from SVN is probably the longest process in existence (except maybe for wx, which I don't currently use).
Only 6 minutes ? Sounds like 15 to me ^^

Brucey, you seem to be getting the best out of BMK these days :)


Brucey(Posted 2009) [#3]
Just trying to be more productive...
... sitting around waiting for things to compile (or having to fiddle with binaries, re Universal)... well, it's not really what programming should be about.

I tend to work like this :

* Code
* Compile
* Test
* Code
* Compile
* Test
... etc

which means I want the least effort in the "Compile" and "Test" parts, so I can instead concentrate on coding :-p


I just tried my tweaks on Linux... get a double-free every time - which is nice (no, not really) , cuz hopefully I can track it down to something more useful.


Mark Tiffany(Posted 2009) [#4]
get a double-free every time

Really? ;-)


Brucey(Posted 2009) [#5]
Really? ;-)

Heh... it's like deja vu all over again...


*sigh*
This low-level hacking stuff is trying...
It appears that calling system() from many threads isn't such a great idea after all. I guess there's stuff going on in there that acts in a global scope.

However, I've found I can fork() and then call system(), which appears to mostly work - well, at least all the "free" errors have gone away.
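
For anyone curious what the fork-then-system pattern looks like, here is a rough Python sketch of the same idea (Unix only, Python 3.9+). It is an illustration, not the BMK source: `run_forked` and `max_procs` are made-up names, and the commands are plain shell strings.

```python
import os
import sys


def run_forked(commands, max_procs):
    """Run shell commands in forked children, at most max_procs at once.

    Returns a dict mapping each command's index to its exit code.
    """
    pending = list(enumerate(commands))
    running = {}   # child pid -> command index
    results = {}   # command index -> exit code

    while pending or running:
        # Start new children while there is spare capacity.
        while pending and len(running) < max_procs:
            index, cmd = pending.pop(0)
            # Flush so buffered output is not duplicated in the child.
            sys.stdout.flush()
            sys.stderr.flush()
            pid = os.fork()
            if pid == 0:
                # Child: system() runs in its own process, so any global
                # state it touches is private to this child.
                code = 1
                try:
                    code = os.waitstatus_to_exitcode(os.system(cmd))
                finally:
                    # Never fall back into the parent's loop.
                    os._exit(code if 0 <= code <= 255 else 1)
            running[pid] = index

        # Parent: block until any one child exits, then record its result.
        pid, status = os.wait()
        if pid in running:
            results[running.pop(pid)] = os.waitstatus_to_exitcode(status)

    return results
```

Because each system() call happens in a separate process, nothing it does internally can interfere with the other jobs; the parent only forks and waits.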

... almost there :-p


Brucey(Posted 2009) [#6]
Results from my Quad-core Linux box (running 5 processes) :
BaH.CEGUI

Normal
3m48.192s

Threaded
1m15.374s

It took 1/3 of the time of the normal run. Not bad.
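
As a quick check of that ratio, the timings can be converted to seconds and divided. The helper names below (`parse_time`, `speedup`) are just for illustration; the input strings are the figures quoted in this thread.

```python
import re


def parse_time(text):
    """Convert '3m48.192s' (or a bare '33.114') into seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s?", text.strip())
    if match is None:
        raise ValueError(f"unrecognised time: {text!r}")
    minutes = int(match.group(1)) if match.group(1) else 0
    return minutes * 60 + float(match.group(2))


def speedup(normal, threaded):
    """How many times faster the threaded build was than the normal one."""
    return parse_time(normal) / parse_time(threaded)


# Quad-core CEGUI figures from the post above.
print(round(speedup("3m48.192s", "1m15.374s"), 2))
```

The result comes out at roughly 3, which matches the "1/3 of the time" figure.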

No crashes. No errors... now that I'm forking off the processes ;-)

Groovy.


Armitage 1982(Posted 2009) [#7]
Do you think this could be part of official release someday ?
Since RC5 things are a bit quiet in this area.
But the real question is : am I ready to save 2/3 of my time by compiling stuff Faster ?
What will happen ? 2/3 more time to spend money on dumb things with madame ? Haa Hem..

Anyways good tweak !
Next step : your very own compiler :-p


Brucey(Posted 2009) [#8]
Do you think this could be part of official release someday ?

Doubtful.

It's not very useful for the official modules since there are so few of them, and there's not very much C++ code to compile. On my Mac Mini :
Pub.*
Normal
1m48.654s

Threaded
1m26.759s


But for me, and my reasonably large libraries (BaH.GDAL, for example, on my Quad Core) :
Normal
7m51.076s
7m48.870s

Threaded
2m55.982s

It can make a huge difference.


slenkar(Posted 2009) [#9]
it could help with wxmax and those kinds of things


Brucey(Posted 2009) [#10]
it could help with wxmax

Not in this case, since there's usually only 1 .cpp file per module. So its use is probably quite limited.

While I'm beefing things up a bit, I've added the ability for BMK to load a "custom.bmk" file from the bin folder. Basically, if you want to override some default settings, you can create the file and put them in there.

For example :
addccopt optimization -O3

Replaces the default optimization (-Os) with level-3 optimization, for all C/C++ files.

Since "addccopt" is global for all platforms, you can also use platform-specific calls...
addwin32ccopt arch -march=pentium3

To generate pentium3 object code on Windows, for C/C++ files.

(Of course, if you were working only on Windows, you could use a basic "addccopt" call instead).

Since .bmk files are also processed as Lua, you can @define your own Lua functions and call them as part of the build...

Anyhoo... the two example CC options above are likely to produce much faster code than the default compiler options of -Os and -march=pentium.
(You could even go crazy and use -march=prescott, but you would then be limiting the user base that can run your binaries.)
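
One way to picture the named-option overrides is as a lookup keyed by option name, where a later setting replaces the default stored under the same name. The Python sketch below is only a model of that behaviour: `resolve_cc_opts` is a hypothetical function, and the assumption that a platform-specific setting wins over a global one is mine, not something stated above.

```python
def resolve_cc_opts(defaults, global_opts, platform_opts, platform):
    """Return the final compiler flags for one platform.

    defaults      -- built-in options, keyed by name
    global_opts   -- overrides that apply everywhere (like addccopt)
    platform_opts -- per-platform overrides (like addwin32ccopt)
    """
    opts = dict(defaults)
    # A global override replaces the default stored under the same name.
    opts.update(global_opts)
    # ASSUMPTION: platform-specific settings take precedence over global ones.
    opts.update(platform_opts.get(platform, {}))
    return list(opts.values())


defaults = {"optimization": "-Os", "arch": "-march=pentium"}
global_opts = {"optimization": "-O3"}
platform_opts = {"win32": {"arch": "-march=pentium3"}}

print(resolve_cc_opts(defaults, global_opts, platform_opts, "win32"))
print(resolve_cc_opts(defaults, global_opts, platform_opts, "linux"))
```

Keying by name is what lets a single line in custom.bmk swap out one default without having to restate the rest.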

:o)


Tachyon(Posted 2009) [#11]
BRL seriously needs to come to an agreement with Brucey to get some of these recent BMK improvements into the official release.

If not, at the rate Brucey is going, he'll eventually be developing his own competitive programming language. (Think about it, Brucey!! :D )


DavidDC(Posted 2009) [#12]
Why not .bmx files? Well, they can have dependencies on each other, so you have to get the build order spot-on

Brucey what if you used/created a utility to map the .bmx build order. Your bmk accelerator could then reference that.

Don't you already have a grapher that outputs the bmx dependency tree? Could that be converted in some way?
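
To make the suggestion concrete, here is a small Python sketch (3.9+) of turning a dependency map into "waves" of files that could be built side by side. The file names and the `build_waves` helper are invented for the example; it uses the standard library's `graphlib` rather than anything from BMK.

```python
from graphlib import TopologicalSorter


def build_waves(deps):
    """Group files into waves; files in the same wave do not depend on
    each other and could be compiled at the same time.

    deps maps each file to the set of files it depends on.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        # Everything whose dependencies are already built is ready now.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical .bmx dependency map.
deps = {
    "app.bmx": {"gui.bmx", "net.bmx"},
    "gui.bmx": {"core.bmx"},
    "net.bmx": {"core.bmx"},
    "core.bmx": set(),
}
print(build_waves(deps))
# [['core.bmx'], ['gui.bmx', 'net.bmx'], ['app.bmx']]
```

Only the middle wave offers any parallelism here, which hints at how much the gain would depend on the shape of the dependency tree.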


Brucey(Posted 2009) [#13]
Brucey what if you used/created a utility to map the .bmx build order.

"Boost" has some nice graphing tools for making all this very easy to work out, but I'd rather not introduce more third-party modules into BMK.
(I was happy enough just to get the Lua modules made part of the main distribution)

But I'm pleased with this current incarnation of the system, as it removes oodles of time when building my modules.
And if I want a really quick build of my modules for Win32, I can now build them on my quad-core Linux box ;-)