Friday, May 11, 2007

TAS Must Die, Chapter 24

Again, not a lot has happened. I need to hit the lottery and become a professional hobbyist.

Most of my work lately has been improving the operator precedence parser (OPP) as I implement more of Iscript's command set. Not surprisingly, most of the mojo is in the expressions, not the rest of the syntax.

One feature of OPPs I don't like is that they provide no mechanism for robust syntax checking. Or at least the 'student' version I started with doesn't. The problem is that I implemented one stack for operators and another for IDs. That's easy to use, but it allows the following:

( 3 2 + )

When the ) is scanned, it will cause the + to be reduced. It works but isn't syntactically correct. In reality, '2' never should have been able to follow '3'. This could probably be solved by using a single stack for IDs and operators at some loss of elegance. You'd have to implement a "can this follow that" sort of routine. The stack itself would handle matching parentheses. The action table would handle "is this token valid at all" checking.
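Here is a rough sketch of what a "can this follow that" routine might look like. It is written in Python purely for illustration and is not the actual parser code. The token kinds, the ALLOWED_NEXT table, and the function names are all made up for this example, and it assumes binary operators only (no unary minus). It checks adjacency and nothing else; matching parentheses is still left to the stack.

```python
# Hypothetical token kinds; a real lexer's categories may differ.
OPERAND, OPERATOR, LPAREN, RPAREN = "operand", "operator", "(", ")"

# For each kind of previous token (None means start of expression),
# the kinds of token that are allowed to come next.
ALLOWED_NEXT = {
    None:     {OPERAND, LPAREN},
    OPERAND:  {OPERATOR, RPAREN},
    OPERATOR: {OPERAND, LPAREN},
    LPAREN:   {OPERAND, LPAREN},
    RPAREN:   {OPERATOR, RPAREN},
}

def classify(tok):
    """Map a token's text to one of the kinds above (binary operators only)."""
    if tok in ("(", ")"):
        return tok
    if tok in {"+", "-", "*", "/"}:
        return OPERATOR
    return OPERAND

def can_follow(prev_kind, kind):
    """True if a token of `kind` may directly follow one of `prev_kind`."""
    return kind in ALLOWED_NEXT[prev_kind]

def first_bad_token(tokens):
    """Index of the first token that cannot follow its predecessor, or -1."""
    prev = None
    for i, tok in enumerate(tokens):
        kind = classify(tok)
        if not can_follow(prev, kind):
            return i
        prev = kind
    return -1
```

With this table, first_bad_token(["(", "3", "2", "+", ")"]) points at the '2', since an operand can't directly follow another operand, while ( 3 + 2 ) passes. In a real parser the same check would run as each token is scanned, before anything gets pushed.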

The other thing that I believe will come back to haunt me is that I push Tokens onto the ID stack. This isn't wrong in itself, but a Token is what comes back from a lexer. By the time I'm parsing, I suspect I need something more high-powered than a Token, which isn't a lot more than a name and a general data type. This became evident when I started dealing with associative arrays. Since all I have to work with in a Token is a textual name, I end up representing v1[v2] as a variable called "v1@v2". That means I'll have to take the name apart again in the Executable, which is ridiculous. I can use "v1@v2" as the variable name, but under the hood I should keep track of the discrete parts (v1, v2, and their relationship) so I don't have to re-parse them later. Basically, instead of pushing Tokens onto the ID stack, I should probably be pushing Pcodes! I'll have to investigate this.
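To make the idea concrete, here is a minimal sketch of a stack entry that keeps the parts of v1[v2] separate while still being able to produce the flat "v1@v2" name. It is Python for illustration only, and the Operand class and its fields are hypothetical. It is not the real Token or Pcode layout, just one shape the richer object could take.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Operand:
    """Hypothetical ID-stack entry: a variable plus an optional subscript."""
    base: str                          # e.g. "v1"
    index: Optional["Operand"] = None  # e.g. Operand("v2") for v1[v2]

    @property
    def name(self):
        # The flat name is derived from the parts, so nothing downstream
        # ever has to split "v1@v2" back apart.
        if self.index is None:
            return self.base
        return f"{self.base}@{self.index.name}"

# v1[v2] as a structured entry rather than a single string
entry = Operand("v1", Operand("v2"))
```

Here entry.name gives "v1@v2", but entry.base and entry.index are still available directly, so the Executable could look them up without any string surgery.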
