Getting vars from a String....


neos300(Posted 2009) [#1]
Hey, I am trying to improve a scripting language I made earlier, and I am very, very, very stuck on the string parsing.
(e.g. print "Hey, my name is " + whipster + "!" would come out as Hey, my name is joe! if whipster equaled "joe")

Can anyone help me with this? I tried to write my own, but I failed epically:
Function parseStrVars(sstring$)
Local retstr$
If Left$(sstring$, 1) = dqt$
	;retstr$ = Mid$(sstring$, 1, Len(
	blah$ = sstring$
	.go
	firstspace=Instr(sstring," ",1)

	If firstspace <> 0
		word$=Left(blah$,firstspace-1)
		blah$=Mid$(blah$,firstspace+1)
	Else
		word$ = blah$
		blah$ = ""
	EndIf
	retstr$ = retstr$ + Mid$(word$, 1, Len(word$) - 2)
	Print retstr$
	If Right(word$, 1) = dqt$
		firstspace=Instr(blah," ",1)

		If firstspace <> 0
			operator$=Left(blah$,firstspace-1)
			blah$=Mid$(blah$,firstspace+1)
		Else
			operator$ = blah$
			blah$ = ""
		EndIf
		If operator = "+"
			firstspace=Instr(sstring," ",1)

			If firstspace <> 0
				var$=Left(sstirng$,firstspace-1)
				blah$=Mid$(sstring$,firstspace+1)
			Else
				Print "ZOMG! ERROR!"
				var$ = sstring$
				blah$ = ""
			EndIf
			retstr$ = retstr$ + GetEnv$(var$)
			firstspace=Instr(sstring," ",1)

			If firstspace <> 0
				op$=Left(blah$,firstspace-1)
				blah$=Mid$(blah$,firstspace+1)
			Else
				op$ = blah$
				blah$ = ""
			EndIf
			If op$ = "+"
				Goto go
			Else
				Print "ZOMG! ERROR!"
			EndIf

		Else
			Return
		EndIf
	Else
		Goto go
	EndIf
Else
	Print "ZOMG! ERROR!"
EndIf

See, my code fails!


Yasha(Posted 2009) [#2]
Erm... that code looks incomplete.

Assuming that all that's missing is the final "End Function", the first line in that function compares the leftmost character of the argument to the variable dqt. Except that, since it's the first line, dqt hasn't been assigned anything, so dqt = "". So if the argument is anything other than "", the comparison fails and the function has no option but to report an error.
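For illustration, assuming dqt is meant to hold a double-quote character, it would need to be assigned somewhere the function can see it, for example as a Global (a variable first used inside a Blitz3D function is local and starts out empty). A minimal sketch, where StartsWithQuote is a hypothetical helper name:

```blitzbasic
; Assumption: dqt$ is meant to hold the double-quote character.
Global dqt$ = Chr$(34)   ; Chr$(34) is "

; Hypothetical helper: returns 1 if s$ begins with a double quote, else 0.
Function StartsWithQuote(s$)
	Return Left$(s$, 1) = dqt$
End Function

Print StartsWithQuote(dqt$ + "Hi" + dqt$)
Print StartsWithQuote("Hi")
```

Without the Global line, dqt$ inside the function would be a fresh empty local and the comparison could only succeed for an empty argument.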

Could you outline in prose exactly what it is that this function is intended to do, step by step?

And I know you didn't like to hear this before (so I won't bother you with it again) but seriously, you're doing this the hard way. Trying to interpret text commands is about a hundred times harder than just writing a simple compiler and interpreting the result of that.


neos300(Posted 2009) [#3]
Ok...
But first 2 things:
1. The dqt variable simply holds a " character.
2. This is going to be a Blitz rewrite of LOLCODE, so it kinda has to be interpreted, to keep with tradition.

So, you plug this in:
var chip$ = "Chippy"
print "Hi " + chip$ + "!"


and the code outputs Hi Chippy!
So I am wondering how to handle the operators so that the variables get substituted.


Yasha(Posted 2009) [#4]
Well, the most important problem here is that you're simply detecting the next space in the line and using it as a separator, regardless of whether it's inside a string or not. The line
print "Hi " + chip$ + "!"

will never work as expected because you're detecting the first token as being ( "Hi ), then checking for a double quote on the end of it, which is nonexistent.

There are a couple of other problems with the function as it stands - when you Goto your way back to the top of the function (urgh, that hurts) it reads from the original string again, which resets the separator-pointer and can therefore cause an infinite loop, rather than moving a pointer to the start of the next token or replacing the string with a shortened one (equivalent). You also don't seem to have an actual success condition, so it'll print "ZOMG! ERROR!" regardless.
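As a rough sketch of the "replace the string with a shortened one" approach, the loop below always searches the remainder rather than the original argument, and it ends when nothing is left. PrintWords and rest$ are hypothetical names, and it still splits on single spaces only, so it does not yet cope with spaces inside string literals:

```blitzbasic
; Hypothetical helper: prints each space-separated word of sstring$.
Function PrintWords(sstring$)
	Local rest$ = sstring$   ; the part of the line not yet consumed
	Local word$, firstspace

	While rest$ <> ""
		firstspace = Instr(rest$, " ")   ; search the remainder, not sstring$
		If firstspace <> 0
			word$ = Left$(rest$, firstspace - 1)
			rest$ = Mid$(rest$, firstspace + 1)
		Else
			word$ = rest$
			rest$ = ""   ; nothing left, so the loop ends
		EndIf
		Print word$
	Wend
End Function

PrintWords("one two three")
```

Because rest$ gets shorter on every pass, the loop cannot spin forever the way a Goto back to a search of the original string can.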

To be perfectly honest I would still not attempt to do it in this way (and no, I am not about to recommend compiling it). Even a comparatively simple language is going to have far more variations in the instruction lines than you could reasonably expect to predict in this rather rigid way (explicitly assigning one block to grab the string literal, another to grab the operator, etc). What I would strongly suggest you consider is building an entirely separate tokenising function, that returns to you the "next" token in the current string, file, or whatever form this data is actually in, as well as a token type ID to let you know whether it's a string literal or operator or something else. You'd need to maintain your current position in the list/string somehow, with a counter, so that it always pulls out the next token with a simple helper function call like "NextToken()". I'd recommend this being in a separate function so you can avoid writing the same code multiple times (in the example above you've got essentially the same tokenisation step rewritten three times).

You'd also need to build into that function some way of identifying a string literal and grabbing it all, such as setting a flag when you hit the first quote mark and continuing to copy all characters until you hit the second, at which point you could stop grabbing a token (for other types, you'd stop when a character invalid for that token type was encountered).
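A minimal sketch of such a tokeniser follows. It is not the Lexer.bb code, just an illustration under some assumptions: the only operators are + and =, string literals have no escape sequences, and all names (StartLexer, NextToken$, lexSrc$, lexPos, tokType, the TOK_ constants) are made up for this example. Since a Blitz3D function returns a single value, the token type is passed back through a global.

```blitzbasic
Const TOK_END = 0
Const TOK_STRING = 1
Const TOK_OPERATOR = 2
Const TOK_WORD = 3

Global lexSrc$   ; the text being tokenised
Global lexPos    ; 1-based index of the next unread character
Global tokType   ; type ID of the token NextToken$ last returned

Function StartLexer(src$)
	lexSrc$ = src$
	lexPos = 1
End Function

Function NextToken$()
	Local c$, tok$

	; Skip any run of spaces and tabs before the token
	While lexPos <= Len(lexSrc$)
		c$ = Mid$(lexSrc$, lexPos, 1)
		If (c$ <> " ") And (c$ <> Chr$(9)) Then Exit
		lexPos = lexPos + 1
	Wend

	If lexPos > Len(lexSrc$)
		tokType = TOK_END
		Return ""
	EndIf

	c$ = Mid$(lexSrc$, lexPos, 1)

	If c$ = Chr$(34)
		; String literal: copy characters until the closing quote
		lexPos = lexPos + 1
		While lexPos <= Len(lexSrc$)
			c$ = Mid$(lexSrc$, lexPos, 1)
			lexPos = lexPos + 1
			If c$ = Chr$(34) Then Exit
			tok$ = tok$ + c$
		Wend
		tokType = TOK_STRING
		Return tok$
	EndIf

	If (c$ = "+") Or (c$ = "=")
		lexPos = lexPos + 1
		tokType = TOK_OPERATOR
		Return c$
	EndIf

	; Anything else is a word: stop at whitespace, a quote or an operator
	While lexPos <= Len(lexSrc$)
		c$ = Mid$(lexSrc$, lexPos, 1)
		If (c$ = " ") Or (c$ = Chr$(9)) Or (c$ = Chr$(34)) Or (c$ = "+") Or (c$ = "=") Then Exit
		tok$ = tok$ + c$
		lexPos = lexPos + 1
	Wend
	tokType = TOK_WORD
	Return tok$
End Function

; Usage: list the tokens of  print "Hi " + chip$ + "!"
StartLexer("print " + Chr$(34) + "Hi " + Chr$(34) + " + chip$ + " + Chr$(34) + "!" + Chr$(34))
Repeat
	t$ = NextToken$()
	If tokType = TOK_END Then Exit
	Print Str$(tokType) + ": " + t$
Forever
```

The interpreter can then act on tokType: append a TOK_STRING as-is, look up a TOK_WORD as a variable, and treat a TOK_OPERATOR as the signal that another operand follows.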

The other thing I'd recommend is trying to avoid a reliance on single-spaces as separators; people might use tabs, or many spaces, or none at all, and in most languages that'd be considered perfectly valid. The simplest (extremely primitive) way around this is to make extensive use of the Trim() function, that can simply remove unnecessary whitespace at either end of a string for you; you might also want to Replace() tab characters with spaces.
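The primitive clean-up described above might look like the sketch below. NormaliseLine$ is a hypothetical name. One caveat, stated as an assumption about how it would be used: run over a whole line, it also converts tabs that sit inside string literals.

```blitzbasic
; Hypothetical helper: converts tabs to spaces, then strips the ends.
Function NormaliseLine$(raw$)
	Local s$ = Replace$(raw$, Chr$(9), " ")   ; Chr$(9) is the tab character
	Return Trim$(s$)                          ; drop leading/trailing whitespace
End Function

Print "[" + NormaliseLine$(Chr$(9) + "  print x  ") + "]"   ; prints [print x]
```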

Anyway, a ridiculously overcomplicated example of what I mean can be found here (that's the one in the codebox entitled "Lexer.bb").

Sorry once again to suggest doing something completely different... I hope this one's a little more helpful.


_PJ_(Posted 2009) [#5]
I recommend using the Instr() command. Perhaps Replace() may be handy too.
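As one possible use of those two commands (a sketch only, with FirstLiteral$ as a made-up helper name): Instr() can locate the opening and closing quotes of a literal, and Replace() can strip the quotes from a token.

```blitzbasic
; Hypothetical helper: returns the text between the first pair of
; double quotes in s$, or "" if there is no complete pair.
Function FirstLiteral$(s$)
	Local q1, q2
	q1 = Instr(s$, Chr$(34))           ; position of the opening quote
	If q1 = 0 Then Return ""
	q2 = Instr(s$, Chr$(34), q1 + 1)   ; search again from just after it
	If q2 = 0 Then Return ""
	Return Mid$(s$, q1 + 1, q2 - q1 - 1)
End Function

ln$ = "print " + Chr$(34) + "Hi " + Chr$(34) + " + chip$"
Print "[" + FirstLiteral$(ln$) + "]"

; Replace$ removes every quote character from a token
Print Replace$(Chr$(34) + "Hi" + Chr$(34), Chr$(34), "")
```

Because the second Instr() call starts after the opening quote, spaces inside the literal are not mistaken for separators.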