String split function

BlitzMax Forums/BlitzMax Programming/String split function

Bot Builder(Posted 2005) [#1]
Well I stopped trying to make a string split method in C, so i've made the ultimate splitter in bmax instead:
'#Region Variables
Global EFlags:Byte[]=New Byte[512]						'Token info - stores flags for 256 tokens :)

Const SPLIT_Case=1										'Make it case semsitive

Const SPLIT_Leave=2										'Add the text the splitter matches to elements 
														'between non-splitter elements		

Const SPLIT_Flags=4										'Flags appended to each element determine, defaults
														'are determined by adding the other flags to this one
'#end region

'#region Info
'For the splitterflags flag, append the following flags to the end of the string seperators, they must
'be the last things in the string (note only two allowed per element):

'~~caseon for casesensitivity
'~~caseoff for none
'~~leaveon to leave the splitters in
'~~leaveoff to abandon them

'These will override the defaults for the splitter.

'You dont really need these element flags most of the time, its mostly for stuff im doing specifically
'#End Region

Function Split:TList(txt$,Splitters:String[],Flags:Byte=0)
	Local i,j,k
	Local DefaultCase=Flags & SPLIT_Case				'See above
	Local DefaultLeave=Flags & SPLIT_Leave				'^
	Local ElementFlags=Flags & SPLIT_Flags				'^
	Local CaseOn										'For each splitter
	Local LeaveOn										'^
	Local Prev=0										'End of previously found token
	Local Results:TList=New Tlist						'Dynamic list of results
	Local Low$											'If necessary, a var holding lowered version
	If DefaultCase=0 Or ElementFlags Then Low$=Lower(txt)'Set it
	For i=0 Until Splitters.Length						'Loop through splitters
		CaseOn=DefaultCase
		LeaveOn=DefaultLeave
		If ElementFlags Then										'Find parameters
			n1=Splitters[i].findlast("~~")							'1st param
			n2=Splitters[i].findlast("~~",Splitters[i].Length-n1+1)	'2nd param
			Select Splitters[i][n1+1..Splitters[i].Length]			'Check 1st one
				Case "caseon"
					CaseOn=1
				Case "leaveon"
					LeaveOn=1
				Case "caseoff"
					CaseOn=0
				Case "leaveoff"
					LeaveOn=0
				Default
					If Not CaseOn Then Splitters[i]=Splitters[i].ToLower()		'Preprocess :)
					EFlags[i Shl 1]=CaseOn										'Set the databank
					EFlags[i Shl 1 + 1]=LeaveOn
					Continue
			End Select
			Splitters[i]=Splitters[i][0..n1]
			Select Splitters[i][n2+1..n1]							'Check 2nd one
				Case "caseon"
					CaseOn=1
				Case "leaveon"
					LeaveOn=1
				Case "caseoff"
					CaseOn=0
				Case "leaveoff"
					LeaveOn=0
			End Select
		EndIf
		If i=256 Then Throw "Less than 257 seperators please"		'Prolly not gonna happen
		EFlags[i Shl 1]=CaseOn										'Set the databank
		EFlags[i Shl 1 + 1]=LeaveOn
		If Not CaseOn Then Splitters[i]=Splitters[i].ToLower()		'Preprocess :)
	Next
	For i=0 Until txt.Length										'Loop through the chars
		For j=0 Until Splitters.Length								'Loop through the splitters
			CaseOn=EFlags[j Shl 1]									'Get flags
			LeaveOn=EFlags[j Shl 1 + 1]
			If CaseOn Then
				If Splitters[j]<>Txt[i..i+Splitters[j].Length] Then Continue
			Else
				If Splitters[j]<>Low[i..i+Splitters[j].Length] Then Continue
			EndIf																'Continue if not a match
			If i-prev>0 Then Results.AddLast(Txt[prev..i])
			If LeaveOn Then Results.AddLast(Txt[i..i+Splitters[j].Length])
			prev=i+Splitters[j].Length
			Exit
		Next
	Next
	If Txt.Length-prev>0 Then
		Results.AddLast(Txt[prev..Txt.Length])
	EndIf
	Return Results
End Function

'#Region Example
Local a:TList

Print "Start"
a=Split("Function Split:TList(txt$,Splitters:String[],Flags:Byte=0)~nLocal i,j,k~nLocal DefaultCase=Flags & SPLIT_Case",[" ~~leaveoff","~n",":", "(", ")", ",", "[", "]","=","+","-","/","*","@@","@","#","$","%","^","&","~~"],SPLIT_Leave|SPLIT_Flags)
Print "End"

For Local t$=EachIn a
	Print t
Next

'#end Region



Beaker(Posted 2005) [#2]
I like it.


Bot Builder(Posted 2005) [#3]
Ok, I've added the element flags stuff, so now its probably the most bloated split function ever :) But thats ok because most of the time using it is simple unless you want it to do alot for you like I have it doing in this example.

After trying a loop of the split function, i've noticed its pretty slow, which wont work well at all in my application. I'm pretty sure its because of all the arrays it creates, so I'm going to try to make the temp arrays global. The only side effect is that there is a maximum token limit.


Bot Builder(Posted 2005) [#4]
Ok, I've changed my whole method. This one is fast enough for my purposes. Final ver prolly


Beaker(Posted 2005) [#5]
I dont like it anymore. :D


Tom(Posted 2005) [#6]
LOL


Bot Builder(Posted 2005) [#7]
Lol. It still works nearly the same, beaks, that example is just over-complicated since its doing quite alot at once. Simpler example:
Local a:TList=Split("Oooh this is a sentance, cool eh?",[" "])

For Local t$=EachIn a
	Print t
Next
If you want it exactly as before you can rename this function, to SplitToList for instance, and write a small function:
Function Split:String[](txt$, Splitters:String[])
	Local lst:Object[]=SplitToList(txt$,Splitters).ToArray()
	Local res:String[]=New String[lst.Length]
	For i=0 Until lst.Length
		res[i]=lst[i].ToString()
	Next
	Return res
End Function

Local B:String[]=Split("Hi, how are you?",[" "])

For Local i=0 Until B.Length
	Print B[i]
Next
Cant satisfy everyone eh ;) well, right now priority is my uses.