Slow string splitting in BMX-NG

Hezkore(Posted 2016) [#1]
I'm reading through over 1300 scripts in one of my projects.
Using normal BlitzMax that takes about 100 ms.
But when I'm using BlitzMax-NG it takes almost 5 seconds.
So I wrote a deliberately inefficient text-splitting test to compare the two versions, and it turns out that NG is very slow when it splits strings.
Here's the test:
SuperStrict

Local testString:String

For Local i:Int = 0 To 100
	testString:+"This is a very long string containg a few spaces as a test, I'll split this up and test how long it takes"
Next

Print "START!"
Local startMS:Int = MilliSecs()
Local tmpPart:String
For Local i:Int = 0 To testString.Split(" ").Length - 1
	tmpPart = testString.Split(" ")[i]
Next

Print "DONE - " + (MilliSecs() - startMS) + "ms"
End

'Normal BlitzMax = 367ms
'BlitzMax - NG = 8791ms
I ran this on Windows as an x86 executable.

UPDATE: I ran it as an x64 executable now and got 13707ms.
So it seems that x64 only makes it slower.


Derron(Posted 2016) [#2]
I got other times:

vanilla: 596ms
NG: 1956ms

BUT ... when replacing
For Local i:Int = 0 To testString.Split(" ").Length - 1

with

Local count:Int = testString.Split(" ").Length -1
For Local i:Int = 0 To count


NG time decreases to 1350ms. This suggests that the loop's upper bound (the expression after To) is evaluated more than once instead of just once.
Edit: the repeated evaluation of the loop bound seems to be a bug. I opened an issue for it on the GitHub project site.
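
A small sketch to check this directly, without any string splitting involved. The function name Bound and the counter calls are made up for this test. If the counter ends up above 1, the expression after To was evaluated more than once:

```blitzmax
SuperStrict

' Hypothetical helper: counts how often the loop bound gets evaluated.
Global calls:Int = 0

Function Bound:Int()
	calls :+ 1
	Return 5
End Function

' The bound expression is a function call, so every evaluation is counted.
For Local i:Int = 0 To Bound()
Next

Print "Bound() was called " + calls + " time(s)"
```

Running the same file through both compilers should make any difference in loop-bound handling visible immediately.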

bye
Ron


Hezkore(Posted 2016) [#3]
Yeah the times will change depending on what computer you've got.
And yes, you can do loads of stuff to make this faster, but I tried to make this as slow as possible to test with.
But my point is that NG is obviously much slower, and my script reader is already pretty well optimized (unlike this example), so it's a bit of a problem right now.


Derron(Posted 2016) [#4]
No, I am not talking about the total time but about the slowdown factor.

Mine: NG takes about 3.3x as long (1956ms vs. 596ms).
Yours: NG takes about 24x as long (8791ms vs. 367ms).

So on my Linux box the compiled code seems to be better optimized than on yours.

bye
Ron


Derron(Posted 2016) [#5]
Replacing the for-part with this:
Local count:Int = testString.Split(" ").Length -1
For Local i:Int = 0 To count
	Local splitted:String[] = testString.Split(" ")
	tmpPart = splitted[i]
Next


results in 600ms on my computer (so virtually no difference from vanilla).

The reason is something "special". When NG converts your code to C, Split ends up being called twice for the indexed expression:


//BMX: tmpPart = testString.Split(" ")[i]

bbt_tmpPart=((BBSTRING*)BBARRAYDATA((bbStringSplit(bbt_testString,&_s2)),(bbStringSplit(bbt_testString,&_s2))->dims))[bbt_i2];


//BMX: Local splitted:String[] = testString.Split(" ")
//     tmpPart = splitted[i]

BBARRAY bbt_splitted=bbStringSplit(bbt_testString,&_s2);
bbt_tmpPart=((BBSTRING*)BBARRAYDATA((bbt_splitted),(bbt_splitted)->dims))[bbt_i2];
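
Until this is fixed, a possible workaround is to never index the result of Split directly, but to split once, keep the array in a local and iterate over that. A minimal sketch, assuming the same test string as in the first post (the variable name parts is just my choice):

```blitzmax
SuperStrict

' Build the same kind of test string as in the first post.
Local testString:String
For Local i:Int = 0 To 100
	testString :+ "This is a very long string containing a few spaces as a test, I'll split this up and test how long it takes"
Next

Print "START!"
Local startMS:Int = MilliSecs()

' Split exactly once and keep the resulting array in a local.
Local parts:String[] = testString.Split(" ")

' Walk the stored array; no further Split calls happen inside the loop.
Local tmpPart:String
For Local part:String = EachIn parts
	tmpPart = part
Next

Print "DONE - " + (MilliSecs() - startMS) + "ms, " + parts.Length + " parts"
```

This does far less work than the original test, so its timing is not comparable to the numbers above. It only shows a pattern that avoids the doubled call in the generated C.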


Let's see how Brucey tackles this.


bye
Ron


Brucey(Posted 2016) [#6]
Interesting :-)

I get, on average :
Normal BlitzMax = 255ms
BlitzMax - NG = 750ms (x64)

With 'fixed' NG, I am now getting :
BlitzMax - NG = 253ms (x86)
BlitzMax - NG = 230ms (x64)

This is on OS X (El Capitan), with modern compiler settings, i.e. -march=nocona and -msse3.
These kinds of settings on OS X generally result in NG code running faster than code from the legacy BlitzMax compiler, and more so when built for x64.

Hezkore wrote: "so it's a bit of a problem right now."

Well, you found a bug :-)
Thanks for reporting it!