
This code has been declared by its author to be Public Domain code.


Extract Text by TAS2012
Yet another text splitter. It's perhaps a little more flexible and robust than previous splitters, as it accepts an output array as a parameter and will redim it if needed.
'extract.bmx
'extracts data out of delimited string
'and puts it into an array

'Thomas A Stevenson
'war-game-programming.com
'11-16-2012

'example of use
Local s$[3]
Extract("Dog,Cat,Deer,Pig,horse",s)
For i=0 To s.length-1
	Print s[i]
Next
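
'second example of use (an illustrative sketch, not part of the original post):
'a custom separator, with storage starting at index 1 so that slot 0 is kept.
't and its contents are hypothetical names chosen for this example.
Local t$[2]
t[0]="Header"
Extract("a;b;c",t,";",1)
'the array is grown as needed to hold all three segments after "Header"
For i=0 To t.length-1
	Print t[i]
Next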

Function Extract(s$,stringArray$[] Var,seperator$=",",i=0)
	'a leading or trailing separator yields an empty string
	'i = array index at which to store the first segment
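	'make room first if the start index is already past the end of the array
	If i>=stringArray.length Then stringArray=stringArray[..i+1]
	k=1	'start position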
	Repeat
		'find location of separator
		j=Instr(s$,seperator,k)
		If j=0
			'terminal segment
			'Print String(j)+"  "+Mid(s$,k)
			stringArray[i]=Mid(s$,k)
			Exit
		EndIf
		
		'Print String(j)+"  "+Mid(s$,k,j-k)
		stringArray[i]=Mid(s$,k,j-k)
		'increment and check array index
		i=i+1
		If i=stringArray.length Then 
			'redim array
			stringArray=stringArray[..i+1]
		EndIf
		
		k=j+seperator.length	'move start position past the separator (also handles multi-character separators)
	Forever
End Function
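
The "redim" step in Extract relies on array slicing: assigning a slice that reaches past the end of an array yields a longer copy. The snippet below is a minimal standalone sketch of that idiom only; the array name and its contents are made up for illustration and do not come from the original post.

```blitzmax
SuperStrict

' Hypothetical sample data: an array with room for two strings
Local names:String[] = ["Dog", "Cat"]

' Slice beyond the current end: the result has 4 slots,
' and the slots that did not exist before hold empty strings
names = names[..4]
names[2] = "Deer"

Print "length = " + names.length
For Local n:String = EachIn names
	Print "[" + n + "]"
Next
```

Growing by one slot per segment, as Extract does, copies the array each time. That is simple and fine for short strings, but a larger growth step might be worth considering for very long inputs.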

Comments

misth2013
I've tried the BMax demo many times (with many different PCs :DD), and I found out that it already has a Split() function for strings.
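
For reference, a minimal sketch of the built-in method mentioned above, assuming a BlitzMax version whose String type provides Split(). The sample string is borrowed from the original post's example, and the variable names are illustrative only.

```blitzmax
SuperStrict

Local animals:String = "Dog,Cat,Deer,Pig,horse"

' Split() returns a new String array, one element per segment
Local parts:String[] = animals.Split(",")

For Local p:String = EachIn parts
	Print p
Next
```

Unlike Extract, this always returns a fresh array rather than filling one passed in by the caller.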

I have this old code here that is quite fast. It has two functions, GetWord and CountWords. Speed test on my slow laptop:
GetWord test start
500000 GetWord tests in 1245ms
CountWords test start
500000 CountWords tests in 699ms


And the code itself:
SuperStrict


' Return the word at the given 1-based index in a sentence
Function GetWord:String(_line:String, index:Int, separ:String = " ")
	Return _line.Split(separ)[index - 1]
End Function

' Return word count in sentence
Function CountWords:Int(_line:String, separ:String = " ")
	Return _line.Split(separ).Length
End Function

Local line:String = "I am a sentence. You can split me any way you want."
Const TESTS:Int = 500000
Local start:Int, _end:Int

For Local i:Int = 1 To CountWords(line)
	Print GetWord(line, i)
Next

Print "GetWord test start"
start = MilliSecs()
For Local i:Int = 1 To TESTS
	GetWord(line, 1)
Next
_end = (MilliSecs() - start)
Print TESTS + " GetWord tests in " + _end + "ms"

Print "CountWords test start"
start = MilliSecs()
For Local i:Int = 1 To TESTS
	CountWords(line)
Next
_end = (MilliSecs() - start)
Print TESTS + " CountWords tests in " + _end + "ms"


