arrays as variables

BlitzMax Forums/BlitzMax Programming/arrays as variables

Warpy(Posted 2008) [#1]
I've written something that uses my JSON library to read and write save files. The JSON decoder uses a lot of functions which take arrays as arguments, and parsing a ~1mb file takes over two minutes. So, lots of optimisations to be made, and I thought I'd start with seeing if passing arrays around was a bottleneck.

Here's a test:
SuperStrict 

Function f1:Int(arr:Int[])
	Local t:Int,i:Int
	t:Int=0
	For i:Int=EachIn arr
		t:+i
	Next
	Return t
End Function

Function f1local()
	f1([1,2,3])
End Function

Global carr:Int[]=[1,2,3]
Function f1const()
	f1(carr)
End Function

Function f2const:Int()
	Local t:Int,i:Int
	t:Int=0
	For i:Int=EachIn carr
		t:+i
	Next
	Return t
End Function

Function time(func:Int(),name$)
	Local oms:Int,ms:Int,c:Int
	oms:Int=MilliSecs()
	For c:Int=1 To 10000000
		func
	Next
	ms:Int=MilliSecs()
	Print name+" took "+(ms-oms)/1000.0+" seconds"
End Function


time f2const,"don't pass const array"
time f1const,"pass array, const"
time f1local,"pass array, local"


In the first test, the function refers to the array directly, it is not passed in as an argument.
In the second test, an array is passed in, but it is always the same global array object.
In the third test, the array is created inside the function call, so it is created every time the function is called, as blitz can't do constant arrays.

Here are my results, with debug on:
don't pass const array took 5.81400013 seconds
pass array, const took 5.87099981 seconds
pass array, local took 8.67500019 seconds


and with debug off:
don't pass const array took 0.254000008 seconds
pass array, const took 0.310000002 seconds
pass array, local took 3.26399994 seconds


(As an aside, turning superstrict off when debug build is enabled makes it run much faster. That seems odd to me.)


So, clearly, initialising arrays takes a lot of time. Can't bmk be made to recognise arrays consisting solely of string/number literals and only initialise them once? I can see it would probably take a little bit more analysis when building, but I think it'd be worth it.


PS While I'm here, can anyone think of a way of writing a generic function to get the next character in a string belonging to a certain set which *doesn't* involve using arrays?


plash(Posted 2008) [#2]
can anyone think of a way of writing a generic function to get the next character in a string belonging to a certain set which *doesn't* involve using arrays?
What exactly do you mean by 'belonging to a certain set'?

You can get the ascii of a character in a string by using it like an array.. 'Print Chr(mystring[i])'


Warpy(Posted 2008) [#3]
yeah yeah, I haven't written that very well. What I mean is, I've got a string, for example:
" {a: [1,2,3] , b: "hello" } "

and at a certain state in the parser I'll want to look for, for example, a comma or an end of curly braces character.

So what I'm doing at the moment is calling a method 'getnext( tokens[] )' which scans along from the current index in the string until either a character in tokens[] or the end of the string is reached.

Here's the getnext function:
Type jsondecoder
	Field txt$
	Field i
	Field curchr
	Field things:TList
	
	Method getnext(tokens[],onlywhitespace=1)
		oldi=i
		While i<Len(txt)
			c=txt[i]
			i=i+1
			For token=EachIn tokens
				If c=token 
					curchr=c
					Return 1
				EndIf
			Next
			If onlywhitespace And (Not (c=32 Or c=9 Or c=10 Or c=13))
				i=i-1
				Return 0
			EndIf
		Wend
		i=oldi
		Return 0
	End Method

'...... the rest of the type

End Type



plash(Posted 2008) [#4]
So tokens is an ascii code array, and you want to use a string to hold the codes instead of an array (improving execution time)?


Warpy(Posted 2008) [#5]
Yep, tokens is an ascii code array, and I just wanted to know if there was a vastly quicker way of doing it.

It's a moot point now anyway - declaring the token arrays as Globals got me from 150 seconds down to 5 seconds, and thne using ReadString to read in my save file instead of ReadLine got me to 0.7 seconds, which I think is fast enough.


Kurator(Posted 2008) [#6]
hm, would it be faster to pass the token array in as a pointer (reference)?


plash(Posted 2008) [#7]
hm, would it be faster to pass the token array in as a pointer (reference)?
I thought that's what arrays did by default? Anyways, I think he was passing it like this aFunction(["token", "token", "tokens...etc"]) so for some reason the compiler decided to create a new array everytime that call was made instead of recognizing the constantiality (is that a word? :P) of the array being passed.


Kurator(Posted 2008) [#8]
I thought that's what arrays did by default?


You are right :) They are passed as reference / pointer by default


ImaginaryHuman(Posted 2008) [#9]
You could also try allocating memory and then using pointers instead of arrays, as array access has an extra step of indirect accessing - it has to go lookup the pointer to the array's memory from within the array object, which you don't have to do with a pointer to memory.


big10p(Posted 2008) [#10]
I'm probably missing something here but given an array Local myarray:Int[] = [1,2,3,4,5], then myarray is a pointer to the array (i.e. the first element of the array).

Local a:Int[] = [1,2,3,4,5]

foo(a)

End

Function foo(a:Int Ptr)

	For Local i:Int = 0 To 4
		Print a[i]
	Next
	
End Function



Jesse(Posted 2008) [#11]
I would have never tried that! Nice to know. Although, this way it doesn't do out bound check.


ImaginaryHuman(Posted 2008) [#12]
An array is an object, a custom type, containing fields, and one of those fields is a pointer to some allocated memory for storage of the array data, which itself may be an array if you have multiple dimensions. You therefore have an indirect jump in order to get to where the data is really stored. With a pointer to some allocated memory you can skip the (possibly minor) overhead introduced by the array object.

Also in big10p's example, myarray is not a pointer to the memory of the array, it's a pointer to the array *object* which contains a pointer to the memory. The syntax for myarray[0] and mybyteptr[0] looks the same but it's doing different things once compiled.


big10p(Posted 2008) [#13]
Also in big10p's example, myarray is not a pointer to the memory of the array, it's a pointer to the array *object* which contains a pointer to the memory. The syntax for myarray[0] and mybyteptr[0] looks the same but it's doing different things once compiled.
True, but I was just pointing out you can use the array name in bmax as if it were a pointer to the raw data, a la C. Even the docs shows this with the example given for Ptr:

Rem
Ptr is a composite type containing a pointer to a variable of the specified Type.
End Rem

' the following illustrates the use of traditional c style pointers

Local c[]=[1,2,3,4]
Local p:Int Ptr

p=c
Print "pointer 'p' points to:"+p[0]	'1

p:+1
Print "pointer 'p' points to:"+p[0]	'2

p:+1
Print "pointer 'p' points to:"+p[0]	'3

p:+1
Print "pointer 'p' points to:"+p[0]	'4