Code archives/Algorithms/Generic parsing and scanning framework
This code has been declared by its author to be Public Domain code.
This is a port-and-expansion of my previous generic lexer and parser libraries for B3D, to BlitzMax. BlitzMax's enhancements open up a whole array of ways to improve the syntax of such tools, and I think we're more or less at the maximum: the TMetaParser class can now accept input in a syntax that roughly approximates BNF, making this about as light as it's possible to get. (There's also a "simple" TParser class that accepts plain combinators in a similar fashion to the B3D version: lightweight compared to other libraries, positively cumbersome compared to TMetaParser.) It's called MetaParser because the implementation of the BNF layer is more or less metacircular.

Minimal example (demonstrates both TMetaParser and TLexer):

(Subtraction, division, functions etc. are omitted to keep the example short. See the original parser example for a fuller grammar.)

Usage is directly derived from its predecessor libraries. A lexer is built out of rules, each consisting of a regex, an action on match, a result, and potentially a mode constraint. Lexers can scan strings or files, and return an array of TTokens. See the B3D original for more information.

A parser is created by extending TParser or TMetaParser, and listing rules. TMetaParser is more lightweight and is used in the example above: a field named "Grammar" is added to the class and its metadata contains the grammar for the parser, in BNF-style syntax. (Additions: a "!" can be added to the middle of a rule to establish an "error point", and a rule can end with a colon followed by a filter string. Both of these additions are optional and explained in the B3D original entry. Terminal symbols are represented as a % or %! followed by their lexer symbol name.)
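To make the lexer rule description concrete, here is a minimal sketch of a lexer that uses a mode constraint to skip block comments. It assumes the archive code is saved as "metaparser.bmx" (a hypothetical filename); the token names and the "comment" mode name are also made up for illustration. It relies on two behaviours visible in the source: TLexAction.Mode switches the lexer to the mode named by the rule's result, and characters that no active rule matches are silently skipped.

```blitzmax
SuperStrict

Import "metaparser.bmx"    ' hypothetical filename for the archive code

' TLexRule.Create(regex, action, result, mode):
'   - Store rules emit a token whose tType is the result string
'   - Mode rules switch the lexer into the mode named by the result string
'   - a rule is only tried while the lexer's current mode equals its mode argument
Local lexer:TLexer = TLexer.withRules([..
    TLexRule.Create("[0-9]+", TLexAction.Store, "number"),..
    TLexRule.Create("[a-z_][a-z0-9_]*", TLexAction.Store, "ident"),..
    TLexRule.Create("/\*", TLexAction.Mode, "comment"),..
    TLexRule.Create("\*/", TLexAction.Mode, "", "comment")..
])

' Inside the comment only the closing rule is active, so "skipped 7" produces no tokens
For Local t:TToken = EachIn lexer.ScanString("foo 42 /* skipped 7 */ bar")
    Print t.tType + " '" + t.value + "' at line " + t.l + ", col " + t.c
Next
```

Because rules are tried against the current mode only, the same approach should extend to string literals or nested sub-languages by adding more modes.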
The only major advantage of the combinator-based TParser over TMetaParser is that filters can also be applied to sub-rules. If you need to filter sub-rules (and filters are largely an unnecessary extra anyway), consider factoring out the nested expression to its own rule. (TParser is also used in TMetaParser's internal compiler; you can see it in use there.)

To use, instantiate your TMetaParser subtype and call the Parse method on a TToken array produced by the relevant lexer (lexer and parser terminal types need to match, so make sure the two are designed together). The first rule in the list will be treated as the "root", and used to match the whole document.

This should provide a route to rapid language development in BlitzMax. It's finally concise enough to look nice on the page. Enjoy!

(This really ought to be at least three separate files; the natural splitting points are marked. It works OK as one though.)

Requires BaH.Regex.
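The usage steps above can be sketched roughly as follows. This is an illustrative grammar, not the example originally posted with this entry: the filename "metaparser.bmx", the type TCalcParser, the token names and the three rules are all assumptions. It also assumes that the ".." line continuation is accepted inside the metadata braces; if it is not, the metadata can be joined onto one line.

```blitzmax
SuperStrict

Import BRL.Map
Import "metaparser.bmx"    ' hypothetical filename for the archive code

' Lexer: token type names here must match the %terminals used in the grammar
Local lexer:TLexer = TLexer.withRules([..
    TLexRule.Create("[0-9]+", TLexAction.Store, "number"),..
    TLexRule.Create("\+", TLexAction.Store, "add"),..
    TLexRule.Create("\*", TLexAction.Store, "mul"),..
    TLexRule.Create("\(", TLexAction.Store, "lparen"),..
    TLexRule.Create("\)", TLexAction.Store, "rparen")..
])

' Parser: the grammar lives in the metadata of a field called "grammar".
' The first rule (SumExpr) is the root. After the colon comes the filter string:
' "@L" / "@R" name the two parts, "^" lets a lone surviving element fold upwards.
Type TCalcParser Extends TMetaParser
    Field grammar:TMap {..
        SumExpr = "MulExpr (%add MulExpr)* : @L @R ^"..
        MulExpr = "Atom (%mul Atom)* : @L @R ^"..
        Atom = "%number | %lparen SumExpr %!rparen"..
    }
End Type

Local parser:TCalcParser = New TCalcParser
Try
    Local tree:TParseNode = parser.Parse(lexer.ScanString("1 + 2 * (3 + 4)"))
    If tree Then Print tree.ToString() Else Print "no parse"
Catch e:Object
    ' Lexer and parser errors are thrown as objects with a ToString method
    Print e.ToString()
End Try
```

The %!rparen form makes a missing closing parenthesis an error rather than a quiet failure to match, which seems the more useful behaviour once an opening parenthesis has been seen.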
' Generic parsing and lexing system, with BNF-style lightweight input
'=====================================================================

Import BaH.Regex

SuperStrict

Import Brl.Retro
Import Brl.LinkedList
Import Brl.Map
Import Brl.Reflection

Public

' TMetaParser
'=============

Type TMetaParser Extends TParser Abstract
    Field _rules:TMap, _start:String, _linked:Int
    Global pp:PMeta
    
    Method Parse:TParseNode(toks:TToken[])
        If Not _linked Then Link()
        Return Super.Parse(toks)
    End Method
    
    ' Read the grammar from the metadata of the "grammar" field and compile each rule
    Method Link()
        Local metaData:TToken[] = MetaLexer.ScanString(TTypeId.ForObject(Self).FindField("grammar")._meta)
        _rules = CreateMap()
        If pp = Null Then pp = New PMeta    'Not safe to initialize globally
        For Local i:Int = 0 Until metaData.Length Step 2
            Local val:String = metaData[i + 1].value[1..]
            AddRule(metaData[i].value, val[..val.Length - 1])
        Next
        _start = metaData[0].value
        _linked = True
    End Method
    
    ' rs is "rule body" or "rule body : filter string"
    Method AddRule(key:String, rs:String)
        Local rp:String[] = rs.Split(":")
        Local rule:TParseRule = CompileRule(pp.Parse(BNFLexer.ScanString(rp[0])))
        If rp.Length > 1 Then rule.f(rp[1].Trim())
        _rules.Insert(key, rule)
    End Method
    
    Method CompileRule:TParseRule(r:TParseNode)
        If r.term
            If r.term.tType = "err" Then Return Commit()
            If r.term.value[0..2] = "%!" Then Return ExpectTT(r.term.value[2..])
            If r.term.value[0] = "%"[0] Then Return CheckTT(r.term.value[1..])
            Return Self.R(r.term.value)
        EndIf
        Select r.rule
            Case "ralt"
                Local fst:TParseRule = CompileRule(r.elem[0])
                If r.elem[1].elem[0].term
                    r.elem[1] = r.elem[1].elem[1]
                    Return Alt([ fst, CompileRule(r.elem[1]) ])
                Else
                    For Local i:Int = 0 Until r.elem[1].elem.Length
                        r.elem[1].elem[i] = r.elem[1].elem[i].elem[1]
                    Next
                    Local rst:TParseRule[] = MapCompile(r.elem[1].elem)
                    rst = rst[..rst.Length + 1]
                    For Local i:Int = rst.Length - 1 To 1 Step -1
                        rst[i] = rst[i - 1]
                    Next
                    rst[0] = fst
                    Return Alt(rst)
                EndIf
            Case "rcat"
                Return Cat(MapCompile(r.elem))
            Case "unar"
                Local arg:TParseRule = CompileRule(r.elem[0])
                Select r.elem[1].term.tType
                    Case "plus" ; Return Plus([arg])
                    Case "opt" ; Return Opt([arg])
                    Case "star" ; Return Rep([arg])
                End Select
        End Select
    End Method
    
    Method MapCompile:TParseRule[](a:TParseNode[])
        Local ret:TParseRule[] = New TParseRule[a.Length]
        For Local i:Int = 0 Until a.Length
            ret[i] = CompileRule(a[i])
        Next
        Return ret
    End Method
    
    Method startRule:TParseRule()
        Return TParseRule(_rules.ValueForKey(_start))
    End Method
    
    Method GetNamedRule:TParseRule(r:String)
        Return TParseRule(_rules.ValueForKey(r))
    End Method
End Type

Private

Global MetaLexer:TLexer = TLexer.withRules([..
    R("[a-z_][a-z0-9_]*", TLexAction.Store, "key"),..
    R("~q[^~q]*~q", TLexAction.Store, "value")..
])

Global BNFLexer:TLexer = TLexer.withRules([..
    R("[a-z_][a-z0-9_]*", TLexAction.Store, "name"),..
    R("%[a-z_][a-z0-9_]*", TLexAction.Store, "terminal"),..
    R("%![a-z_][a-z0-9_]*", TLexAction.Store, "terminal"),..
    ..
    R("\+", TLexAction.Store, "plus"),..
    R("\|", TLexAction.Store, "or"),..
    R("\?", TLexAction.Store, "opt"),..
    R("\*", TLexAction.Store, "star"),..
    R("=", TLexAction.Store, "eql"),..
    R("!", TLexAction.Store, "err"),..
    R("@", TLexAction.Store, "any"),..
    R("\(", TLexAction.Store, "lparen"),..
    R("\)", TLexAction.Store, "rparen")..
])

Function R:TLexRule(r:String, a(l:TLexer), res:String = "", m:String = "")
    Return TLexRule.Create(r, a, res, m)
End Function

' Parser for the BNF-style grammar strings themselves, built from plain combinators
Type PMeta Extends TParser
    Field root:TParseRule, rcat:TParseRule, ralt:TParseRule, unar:TParseRule, atom:TParseRule, nest:TParseRule
    
    Method startRule:TParseRule()
        Return root
    End Method
    
    Method New()
        root = R("ralt")
        ralt = Cat([ R("rcat"), Rep([ CheckTT("or"), R("rcat") ]) ])
        rcat = Plus([ Alt([ CheckTT("err"), CheckTT("any"), R("unar") ]) ])
        unar = Cat([ R("atom"), Opt([ Alt([ CheckTT("plus"), CheckTT("opt"), CheckTT("star") ]) ]) ])
        atom = Alt([ CheckTT("name"), CheckTT("terminal"), R("nest") ])
        nest = Cat([ CheckTT("lparen"), R("ralt"), CheckTT("rparen") ]).f("- @ - ^")
    End Method
End Type

Public

' TParser
'=========

Type TParseNode
    Field name:String, rule:String, term:TToken
    Field elem:TParseNode[]
    
    Method n:TParseNode(s:String)
        name = s ; Return Self
    End Method
    
    Method ToString:String()
        Function show:String(n:TParseNode, pad:Int)
            Local id:String = LSet("", pad)
            If n.name = "#Nil" Then Return id + "[NIL]~n"
            If n.term Then Return id + n.name + ": { '" + n.term.value + "' : " + n.term.tType + " }~n"
            Local s:String = id + n.name + ": " + n.rule + "~n"
            For Local e:TParseNode = EachIn n.elem
                s :+ show(e, pad + 2)
            Next
            Return s
        End Function
        Return show(Self, 0)
    End Method
    
    Method GetElem:TParseNode(name:String)
        For Local e:Int = 0 Until elem.Length
            If elem[e].name = name Then Return elem[e]
        Next
        Return Null
    End Method
    
    Function Leaf:TParseNode(t:TToken)
        Local n:TParseNode = New Self ; n.term = t ; Return n
    End Function
    
    Function Node:TParseNode(b:TParseNode[])
        Local n:TParseNode = New Self ; n.elem = b ; Return n
    End Function
    
    Function Nil:TParseNode()
        Local n:TParseNode = New Self ; n.name = "#Nil" ; Return n
    End Function
End Type

Type TParser Abstract
    Method Parse:TParseNode(toks:TToken[])
        Self.toks = toks ; ct = 0 ; epstk = Null ; edstk = Null ; rdepth = 0
        Local ret:TParseNode = startRule().run(Self)
        If ret Then ret.name = "@root"
        Return ret
    End Method
    
    Method Cat:TParseRule(rs:TParseRule[]) Final
        Return TParseRule.Make("Cat", Self, "", rs)
    End Method
    Method Alt:TParseRule(rs:TParseRule[]) Final
        Return TParseRule.Make("Alt", Self, "", rs)
    End Method
    Method Opt:TParseRule(rs:TParseRule[]) Final
        Return TParseRule.Make("Opt", Self, "", rs)
    End Method
    Method Rep:TParseRule(rs:TParseRule[]) Final
        Return TParseRule.Make("Rep", Self, "", rs)
    End Method
    Method Err:TParseRule(msg:String) Final
        Return TParseRule.Make("Err", Self, msg, Null)
    End Method
    Method Plus:TParseRule(rs:TParseRule[]) Final
        Return TParseRule.Make("Plus", Self, "", rs)
    End Method
    Method CheckTT:TParseRule(t:String) Final
        Return TParseRule.Make("CTT", Self, t, Null)
    End Method
    Method CheckVal:TParseRule(v:String) Final
        Return TParseRule.Make("CV", Self, v, Null)
    End Method
    Method ExpectTT:TParseRule(t:String) Final
        Return TParseRule.Make("ETT", Self, t, Null)
    End Method
    Method Commit:TParseRule() Final
        Return TParseRule.Make("Commit", Self, "", Null)
    End Method
    Method Any:TParseRule() Final
        Return TParseRule.Make("Any", Self, "", Null)
    End Method
    Method R:TParseRule(n:String) Final
        Return TParseRule.Make("Named", Self, n, Null)
    End Method
    
    Field toks:TToken[], ct:Int, epstk:Int[], edstk:Int[], rdepth:Int
    
    Method _ctok:TToken() Final
        If ct < toks.Length Then Return toks[ct] Else Return Null
    End Method
    
    Method _incr:TParseNode() Final
        Local tok:TToken = toks[ct] ; ct :+ 1
        Return TParseNode.Leaf(tok)
    End Method
    
    ' Backtrack to pos, unless that would cross the most recent error point
    Method _back(pos:Int) Final
        If epstk
            Local ep:Int = epstk[epstk.Length - 1]
            If pos < ep Then Throw ParseError.Make(toks[ep], "error trying to complete '" + toks[ep - 1].value + "'")
        EndIf
        ct = pos
    End Method
    
    Method _popErrs() Final
        Local c:Int = 0
        For Local p:Int = edstk.Length - 1 To 0 Step -1
            If edstk[p] = rdepth Then c :+ 1 Else Exit
        Next
        If c
            edstk = edstk[.. edstk.Length - c]
            epstk = epstk[.. edstk.Length]
        EndIf
        rdepth :- 1
    End Method
    
    Method _pushErr() Final
        edstk :+ [rdepth] ; epstk :+ [ct]
    End Method
    
    Method GetNamedRule:TParseRule(r:String)
        Return TParseRule(TTypeId.ForObject(Self).FindField(r).Get(Self))
    End Method
    
    Method startRule:TParseRule() Abstract
End Type

Type TParseRule Abstract
    Field r:String, args:TParseRule[], _f:String[]
    
    Method run:TParseNode(p:TParser) Abstract
    
    Method f:TParseRule(filt:String)
        ' Forward the filter to the wrapped Cat rule, if there is one
        If args.Length = 1 And CatRule(args[0]) <> Null Then args[0].f filt
        Return Self    ' keeps .f() usable inline, as CatRule.f already is
    End Method
    
    Function Make:TParseRule(subt:String, p:TParser, r:String, args:TParseRule[])
        Local t:TParseRule = TParseRule(TTypeId.ForName(subt + "Rule").NewObject())
        If args And args.Length > 1 And subt <> "Cat" And subt <> "Alt"
            args = [p.Cat(args)]
        EndIf
        t.r = r ; t.args = args
        Return t
    End Function
End Type

Private

Type CatRule Extends TParseRule
    Method run:TParseNode(p:TParser)
        Local pos:Int = p.ct, el:TParseNode[] = New TParseNode[args.Length], cct:Int = 0
        p.rdepth :+ 1
        For Local d:Int = 0 Until args.Length
            el[d - cct] = args[d].run(p)
            If el[d - cct] = Null
                p._back(pos) ; p._popErrs ; Return Null
            ElseIf el[d - cct] = CommitRule.Nil
                cct :+ 1
            EndIf
        Next
        If cct Then el = el[0..(el.Length - cct)]
        p._popErrs
        Return filter(el, p)
    End Method
    
    Method filter:TParseNode(el:TParseNode[], p:TParser)
        Local ls:TList = CreateList(), ct:Int = 0, noFold:Int = False
        If _f
            If _f.Length <= el.Length Then Throw ParseError.Make(p._ctok(), "filter pattern is wrong length for rule")
            For Local e:Int = 0 Until el.Length
                If el[e] <> OptRule.Nil
                    If _f[e][0] = "@"[0]
                        If _f[e].Length > 1 Then el[e].name = _f[e][1..] Else el[e].name = "@" + ct
                        ls.AddLast el[e] ; ct :+ 1
                    ElseIf _f[e][0] = "<"[0]
                        Local prev:TParseNode = el[e - 1]
                        If prev <> OptRule.Nil
                            If prev.elem
                                el[e].name = "@" + prev.elem.Length
                                prev.elem = prev.elem + [el[e]]
                            Else
                                el[e].name = "@1"
                                prev = TParseNode.Node([prev, el[e]])
                                prev.name = prev.elem[0].name ; prev.elem[0].name = "@0"
                                ls.RemoveLast() ; ls.AddLast prev
                            EndIf
                        Else
                            el[e].name = "@0"
                            prev = TParseNode.Node([el[e]])
                            If _f[e - 1].Length > 1 Then prev.name = _f[e - 1][1..] Else prev.name = "@" + ct
                            ls.AddLast prev ; ct :+ 1
                        EndIf
                    EndIf
                EndIf
            Next
            If _f[el.Length] <> "^" Then noFold = True
        Else
            For Local e:Int = 0 Until el.Length
                If el[e] <> OptRule.Nil
                    ls.AddLast(el[e])
                    el[e].name = "@" + ct ; ct :+ 1
                EndIf
            Next
        EndIf
        el = TParseNode[](ls.ToArray())
        If el = Null Then Return OptRule.Nil
        If (el.Length > 1) Or noFold Then Return TParseNode.Node(el) Else Return el[0]
    End Method
    
    Method f:TParseRule(filt:String)
        _f = filt.Split(" ")
        If _f[_f.Length - 1] <> "^"
            _f = _f[.._f.Length + 1] ; _f[_f.Length - 1] = ""
        EndIf
        Return Self
    End Method
End Type

Type AltRule Extends TParseRule
    Method run:TParseNode(p:TParser)
        Local pos:Int = p.ct
        For Local d:TParseRule = EachIn args
            Local n:TParseNode = d.run(p) ; If n Then Return n
            p._back(pos)
        Next
        Return Null
    End Method
    
    Method f:TParseRule(filt:String)
        RuntimeError "Cannot apply a filter string to an Alt rule"
    End Method
End Type

Type OptRule Extends TParseRule
    Global Nil:TParseNode = TParseNode.Nil()
    
    Method run:TParseNode(p:TParser)
        Local n:TParseNode = args[0].run(p)
        If n Then Return n Else Return Nil
    End Method
End Type

Type RepRule Extends TParseRule
    Method run:TParseNode(p:TParser)
        Return RunF(Self, p)
    End Method
    
    Function RunF:TParseNode(s:TParseRule, p:TParser)
        Local match:TParseNode, ls:TList
        Repeat
            Local pos:Int = p.ct
            match = s.args[0].run(p)
            If match = Null Or match = OptRule.Nil
                p._back(pos) ; Exit
            Else
                If ls = Null Then ls = CreateList()
                ls.AddLast(match)
            EndIf
        Forever
        If ls = Null
            Return OptRule.Nil
        Else
            Local ret:TParseNode[] = TParseNode[](ls.ToArray())
        '   If ret.Length > 1
                For Local i:Int = 0 Until ret.Length
                    ret[i].name = "@" + i
                Next
                Return TParseNode.Node(ret)
        '   Else
        '       Return ret[0]
        '   EndIf
        EndIf
    End Function
End Type

Type ErrRule Extends TParseRule
    Method run:TParseNode(p:TParser)
        Throw ParseError.Make(p._ctok(), r)
    End Method
End Type

Type PlusRule Extends TParseRule
    Method run:TParseNode(p:TParser)
        Local ret:TParseNode = RepRule.RunF(Self, p)
        If ret = OptRule.Nil Then Return Null Else Return ret
    End Method
End Type

Type CTTRule Extends TParseRule
    Method run:TParseNode(p:TParser)
        If p._ctok() And p._ctok().tType = r Then Return p._incr() Else Return Null
    End Method
End Type

Type CVRule Extends TParseRule
    Method run:TParseNode(p:TParser)
        If p._ctok() And p._ctok().value = r Then Return p._incr() Else Return Null
    End Method
End Type

Type ETTRule Extends TParseRule
    Method run:TParseNode(p:TParser)
        If p._ctok() And p._ctok().tType = r Then Return p._incr() Else Throw ParseError.Gen(p._ctok(), r)
    End Method
End Type

Type CommitRule Extends TParseRule
    Global Nil:TParseNode = TParseNode.Nil()
    
    Method run:TParseNode(p:TParser)
        p._pushErr ; Return Nil
    End Method
End Type

Type AnyRule Extends TParseRule
    Method run:TParseNode(p:TParser)
        If p._ctok() Then Return p._incr() Else Return Null
    End Method
End Type

Type NamedRule Extends TParseRule
    Method run:TParseNode(p:TParser)
        Local ret:TParseNode = p.GetNamedRule(r).run(p)
        If ret And ret.rule = "" Then ret.rule = r
        Return ret
    End Method
End Type

Type ParseError
    Field msg:String
    
    Method ToString:String()
        Return "TParser: " + msg
    End Method
    
    Function Make:ParseError(t:TToken, msg:String = "")
        Local e:ParseError = New Self
        e.msg = "error in " + t.file + " at line " + t.l + ", col " + t.c + ": " + msg
        Return e
    End Function
    
    Function Gen:ParseError(t:TToken, ex:String)
        Return Make(t, "expecting {" + ex + "} but found {" + t.tType + "}")
    End Function
End Type

Public

' TLexer
'========

Type TToken
    Field value:String, tType:String
    Field file:String, l:Int, c:Int
    
    Function Make:TToken(v:String, ty:String, f:String, l:Int, c:Int)
        Local t:TToken = New TToken
        t.value = v ; t.tType = ty ; t.file = f ; t.l = l ; t.c = c
        Return t
    End Function
End Type

Type TLexer
    Field rules:TLexRule[]
    Field cFile:LexFile
    Field out:TList, matchR:TLexRule, matchS:String
    Field csMode:Int, guardMode:Int, mode:String
    Field istk:TList, prev:TList
    
    Function withRules:TLexer(r:TLexRule[])
        Local l:TLexer = New TLexer
        l.rules = r
        l.Reset
        Return l
    End Function
    
    Method SetCaseSensitivity(cs:Int)
        csMode = cs
    End Method
    
    Method SetGuardMode(gm:Int)
        guardMode = gm
    End Method
    
    Method Reset()
        istk = CreateList() ; prev = CreateList() ; out = Null ; cFile = Null ; mode = ""
    End Method
    
    Method ScanFile:TToken[](name:String)
        Return TToken[](ScanLFile(Self, LexFile.fromFile(name)).ToArray())
    End Method
    
    Method ScanString:TToken[](s:String)
        Return TToken[](ScanLFile(Self, LexFile.fromString(s)).ToArray())
    End Method
    
    Method GuardFileName(f:LexFile)
        If guardMode Then prev.AddLast(f.dir + f.name)
    End Method
End Type

Type TLexRule
    Field rule:TRegEx, pattern:String
    Field action(l:TLexer), result:String, mode:String
    
    Function Create:TLexRule(rs:String, act(l:TLexer), result:String = "", mode:String = "")
        Local r:TLexRule = New TLexRule
        r.rule = TRegEx.Create("\G" + rs, Null) ; r.pattern = rs
        If result = "" And act = TLexAction.Store Then result = rs
        r.action = act ; r.result = result ; r.mode = mode
        Return r
    End Function
End Type

Type TLexAction
    Function Store(l:TLexer)
        Local fname:String = l.cFile.name
        If fname <> "<string>" Then fname :+ " [" + l.cFile.dir + fname + "]"
        l.out.AddLast TToken.Make(l.matchS, l.matchR.result, fname, l.cFile.cLine, l.cFile.cCol)
    End Function
    
    Function Mode(l:TLexer)
        l.mode = l.matchR.result
    End Function
    
    Function Error(l:TLexer)
        Throw LexError.Make(l, l.matchR.result)
    End Function
    
    Function Discard(l:TLexer)
    End Function
    
    Function Incl(l:TLexer)
        l.matchS = FilterIncludeString(l.matchS, l.matchR.result)    'Shorten the token to just the file path
        TryIncludeFile(l, l.matchS)
    End Function
End Type

Private

Type LexFile
    Field dir:String, name:String
    Field stream:String, sLen:Int, cPtr:Int, cLine:Int, cCol:Int
    
    Function fromFile:LexFile(name:String)
        name = RealPath(name)
        Local f:LexFile = New LexFile
        f.name = StripDir(name) ; f.dir = ExtractDir(name) + "/"
        f.stream = LoadText(name) ; f.sLen = Len(f.stream) ; f.cLine = 1 ; f.cCol = 1
        Return f
    End Function
    
    Function fromString:LexFile(s:String)
        Local f:LexFile = New LexFile
        f.name = "<string>" ; f.dir = ""
        f.stream = s ; f.sLen = Len(s) ; f.cLine = 1 ; f.cCol = 1
        Return f
    End Function
    
    Method Increment(count:Int)
        For Local c:Int = 1 To count
            If stream[cPtr] < 32    'Only count printable characters in the column field
                If stream[cPtr] = 10 Then cLine :+ 1 ; cCol = 1
            Else
                cCol :+ 1
            EndIf
            cPtr :+ 1
        Next
    End Method
End Type

Type LexError
    Field msg:String
    
    Function Make:LexError(l:TLexer, msg:String)
        Local e:LexError = New LexError, fname:String = l.cFile.name
        If fname <> "<string>" Then fname :+ " [" + l.cFile.dir + fname + "]"
        e.msg = "error in " + fname + " at line " + l.cFile.cLine + ", col " + l.cFile.cCol + ": "
        If msg <> "" Then e.msg :+ msg Else e.msg :+ "unexpected character '" + l.cFile.stream[l.cFile.cPtr] + "'"
        Return e
    End Function
    
    Method ToString:String()
        Return "TLexer: " + msg
    End Method
End Type

Function ScanLFile:TList(l:TLexer, f:LexFile)
    l.cFile = f
    l.istk.AddLast f
    l.GuardFileName f
    
    TRegEx.options = New TRegExOptions
    TRegEx.options.targetIsMultiline = False
    TRegEx.options.caseSensitive = l.csMode
    l.mode = ""
    l.out = CreateList()
    
    Repeat
        Local token:String, rule:TLexRule
        
        While l.cFile.cPtr < l.cFile.sLen
            Local cf:LexFile = l.cFile
            token = "" ; rule = Null
            
            ' Longest match wins among the rules active in the current mode
            For Local r:TLexRule = EachIn l.rules
                If l.mode = r.mode
                    Local cMatch:TRegExMatch = r.rule.Find(cf.stream, cf.cPtr)
                    If cMatch
                        If Len(cMatch.SubExp()) > Len(token) Then token = cMatch.SubExp() ; rule = r
                    EndIf
                EndIf
            Next
            
            If rule    'Something matched successfully!
                l.matchS = token ; l.matchR = rule
                rule.action(l)
                cf.Increment Len(token)
            Else
                cf.Increment 1
            EndIf
            l.matchR = Null ; l.matchS = ""
        Wend
        
        If l.cFile <> f    'Pop back to the previous file in the Include
            l.istk.RemoveLast
            l.cFile = LexFile(l.istk.Last())
        Else
            Exit    'If it's f, we're done
        EndIf
    Forever
    
    Local out:TList = l.out
    l.Reset
    Return out
End Function

' Use a simple set of filter chars to chop the path out of an Include directive
Function FilterIncludeString:String(inc:String, filter:String)
    For Local i:Int = 1 To Len(filter)
        Local p:Int = Instr(inc, Mid(filter, i, 1))
        If p Then inc = Mid(inc, p + 1) ; Exit
    Next
    For Local i:Int = 1 To Len(filter)
        Local p:Int = Instr(inc, Mid(filter, i, 1))
        If p Then inc = Left(inc, p - 1) ; Exit
    Next
    Return inc
End Function

' Try to include a source file, guards and recursion checks permitting
Function TryIncludeFile(l:TLexer, file:String)
    file = RealPath(file)
    
    If l.guardMode    'Auto-guarded includes: check if it's been used already, if so ignore it
        For Local name:String = EachIn l.prev
            If name = file Then Return    '...return without actually changing it
        Next
    EndIf
    
    For Local f:LexFile = EachIn l.istk    'Check against the currently-open files
        If f.dir + f.name = file
            Throw LexError.Make(l, "Cannot recursively include '" + file + "'")
        EndIf
    Next
    
    l.cFile = LexFile.fromFile(file)
    l.istk.AddLast l.cFile ; l.GuardFileName l.cFile
End Function

Public
Comments
| ||
Any reason grammar is of type TMap? It doesn't appear to get initialized.
| ||
Example seems to work fine here with no changes needed.

Executing:Example
1 +2 *(3 + 1)<< 4 + 5
@root: SumExpr
  L: { '1' : number }
  R:
    @0:
      @0: { '+' : add }
      @1: ShiftExpr
        L: MulExpr
          L: { '2' : number }
          R:
            @0: { '*' : mul }
            @1: Atom
              @0: { '(' : lparen }
              @1: SumExpr
                L: { '3' : number }
                R:
                  @0: { '+' : add }
                  @1: { '1' : number }
              @2: { ')' : rparen }
        R:
          @0: { '<<' : shl }
          @1: { '4' : number }
    @1:
      @0: { '+' : add }
      @1: { '5' : number }
Process complete

I would try Import BRL.Map.
| ||
Yeah there doesn't seem to be much reason... the field has to be called "grammar" so that the constructors know where to look for the metadata block containing the language definition, but it isn't actually used for anything at runtime. Literally all it does is mark that metadata definition of the grammar to load up. I guess TMap makes more sense to look at than e.g. Int, since it kind of represents a keyed association data type? But if the field is unused, its actual type is not really that important (reflection doesn't care about variable types).
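A small standalone sketch of the point about reflection: the metadata attached to a field can be read back regardless of the field's declared type. TDemo and the "rule" key are hypothetical names used only for this illustration; FindField, TypeId and MetaData are taken from BRL.Reflection.

```blitzmax
SuperStrict

Import BRL.Reflection

' The field is never assigned; it only exists to carry the metadata block
Type TDemo
    Field grammar:Int {rule = "Atom (%mul Atom)*"}
End Type

Local fld:TField = TTypeId.ForName("TDemo").FindField("grammar")

Print "declared type: " + fld.TypeId().Name()
Print "metadata value: " + fld.MetaData("rule")
```

Swapping Int for TMap or String in the declaration should leave the metadata lookup unaffected, which is why the field's type is a matter of taste here.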
| ||
@munch, I never had any problems with it running, I just assumed Parse might have used it to store data. After noticing it was Null in the debugger and reading through the code more carefully, I see that only its metadata is used.

** I'm having trouble with Parse.Rep() ignoring filters.

Cat([ R("Expr"), Rep([ CheckTT("comma") , R("Expr") ]).f("- @") ])

If I wrap the contents with Parse.Cat(), I can use the filter on that:

Cat([ R("Expr"), Rep([ Cat([ CheckTT("comma") , R("Expr") ]).f("- @") ]) ])

Is there also an example of using "<"? None of the examples seem to use it.
| ||
Ah, this version is slightly simplified from the Blitz3D one and doesn't actually run filters on rules other than Cat. Sorry about that. Largely this is because all of the other rules can return a list of variable length, or nothing, which can't always match a static filter string. And because TBH I wasn't expecting anyone to use the non-Meta parser form anyway! Although you're welcome to if you prefer it.

(Also: I figured that since filters are a convenience, in that you can always ignore them and process the tree manually later, full support isn't terribly important. In practice, for clarity they should really only go on toplevel named rules anyway.)

Should note that the "drop" operator in filter strings was originally supposed to be a tilde, not a minus. The minus happens to work only because of lazy coding not checking for it (it currently just tests "is it a @name? is it a <? else drop"). I... won't correct this for now, since you're using -, but feel like I should mention that background.

The < operator is used to shove a match onto the end of a preceding list. It's potentially useful for situations like "(Expr %comma)* Expr": it's supposed to shove that last odd element onto the list with all of the others (assuming you filtered out the commas) to make the list completely homogeneous, which presumably matches its intent in the language you're parsing (e.g. the list of arguments to a function should conceptually be "the list", not "most of the list and also that last one").
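A rough sketch of the "(Expr %comma)* Expr" case using the combinator form, with filters placed only on Cat rules. The filename "metaparser.bmx", the type TArgListParser and the token names are assumptions for illustration, and plain numbers stand in for Expr.

```blitzmax
SuperStrict

Import "metaparser.bmx"    ' hypothetical filename for the archive code

' Hypothetical parser for "(item comma)* item"
Type TArgListParser Extends TParser
    Field list:TParseRule, item:TParseRule
    
    Method startRule:TParseRule()
        Return list
    End Method
    
    Method New()
        item = CheckTT("number")
        ' "@" keeps the item, "-" drops the comma, "^" folds the lone survivor upwards
        Local pair:TParseRule = Cat([ R("item"), CheckTT("comma") ]).f("@ - ^")
        ' "@" keeps the repeated items as a list, "<" appends the final item to that list
        list = Cat([ Rep([ pair ]), R("item") ]).f("@ < ^")
    End Method
End Type

Local lexer:TLexer = TLexer.withRules([..
    TLexRule.Create("[0-9]+", TLexAction.Store, "number"),..
    TLexRule.Create(",", TLexAction.Store, "comma")..
])

Local parser:TArgListParser = New TArgListParser
Local tree:TParseNode = parser.Parse(lexer.ScanString("1, 2, 3"))
If tree Then Print tree.ToString()
```

With this arrangement the three numbers should end up as sibling elements of one list node, instead of a two-element list followed by a separate trailing item.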
| ||
The reason for using TParser is the sub-rule filter: in the example I'm using it to remove commas in a list. As for the minus drop operator, the code for PMeta gave the impression it was minus. I figured since ~ (tilde) is BlitzMax's escape character in string literals (but not metadata), the use of minus was deliberate. Thanks for updating and explaining more of the parser.
| ||
Yep, you're absolutely right on both counts; guess I forgot about that not being an oversight at all. Presumably the tilde only works at all because some language quirk is leaving my metadata strings unescaped. I should move away from using it too!
| ||
OK, I think I have updated it so that filters can now be applied to the Opt/Rep/Plus rules as well, where appropriate. Applying filters to Alt is also now explicitly disallowed because that can't work properly.

A note on the < operator: using it will always make the preceding Rep or Plus match into a list item, even if it only contains one element (which would naturally have to be the one appended by <), and would normally be eligible for upfolding. This is intentional and based on the idea that if you're using it, you'll be expecting a list to iterate over in that slot, so the slot should always deliver a list.