Friday, April 20, 2007

TAS Must Die, Chapter 15b

I implemented an operator precedence parser in a more object-oriented way. I am not sure if it's actually better than the common C implementation.

It currently doesn't have the mojo for IScript's syntax. But I could get there simply by defining precedence rules. It would probably take me about 20 minutes.

I ended up with 5 new notable classes and interfaces:

1) A ParserAction interface, instances of which perform shift, accept, reduce, and error processing. A traditional OPP includes a table of actions which are scalars. The scalars are used in some manner of switch or if/else construct to find the code snippet to execute. A more hardcore C implementation would use pointer-to-function. In my OO implementation, the table includes action class to execute. This is analogous to C's pointer-to-function construct. I currently have 4 implementations of this interface. When the final parser is all done, it would be easy to have 15.

2) A map of input-Token-to-action. The key would be the input Token. The value would be an instance of ParserAction.

3) A Map of Token-to-#2. The key is a Token (taken from the top of the operator stack.) The value is the map described in #2 above. When the ParserAction says to shift, I use the input Token to find the appropriate #2 from this map, and I push it onto the operator stack.

4) A Stack of 'operators'. This is the biggest departure from the typical 'student' OOP implementation. In the trivial case, you push operator characters onto the stack. When it comes time to pop the stack and do something, you do a lookup to determine its index into the precedence array. Instead of pushing the operator character (like a "+") or more OO Token, I push the entire map from #2 above. Now when it is time to decide how to treat an input Token that's an operator, I peek() the map on the stack, do a get() from it using the input Token as the key, and invoke the resulting ParserAction. There are no extraneous lookups in this implementation.

5) A Stack of IDs. When we scan an ID, push it onto this stack. When we reduce, we'll pull from this stack and sometimes push back a temp variable.

The code works well.

I didn't get the elegant lean code I was looking for. Perhaps I expect too much. I do have a very nice parser that doesn't feature some god-awful if/else construct or switch statement that's as long as your arm. The actual core of the implementation is perhaps 8 lines of code. The rest includes a lot of setters and getters and other 'non-breaking' code.

No comments:

Post a Comment