I have modified the Pig Grunt code to use ANTLR for some Grunt commands (CAT, HELP and QUIT). I have attached the diff file for your review. Please find more details about the changes below.
I have the basic code working, but I consider it only a first draft and will refine and clean up the code as I proceed. Before I do that, I want to make sure that I am heading in the right direction. Can you please take a look at the code and let me know if you see any issues with my approach?
Enhanced existing grammar: Instead of creating a new grammar as I suggested earlier, I ended up modifying the existing grammars to add the grunt commands, i.e. I have modified Query.g, ASTValidator.g and LogicalPlanGenerator.g to support these commands. I tried various approaches, including a new grammar and an enhanced existing grammar with changes in PigServer to support grunt commands, and I think this is the cleanest approach. You had also suggested this as the preferred option.
Deprecated GruntParser: I have deprecated GruntParser. To replace it, I have created a new class, 'GruntDriver'. Grunt.java now uses this new class instead.
GruntDriver works in interactive as well as batch mode.
The GruntDriver.process method is similar to what GruntParser.parseStopOnError() does.
The process method first uses the grammar to parse the input stream (the parsing code is identical to QueryParserDriver) and creates the tree.
The process method then traverses the tree: every time it comes across a grunt command's node, it executes that command immediately. For all pig query nodes, GruntDriver delegates the work to PigServer by calling its registerQuery method.
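To make the traversal described above easier to review, here is a minimal sketch of the dispatch idea. It is not the code from the diff: the Node class stands in for the ANTLR tree type, QuerySink stands in for PigServer.registerQuery, and the node type names ("HELP", "CAT", "QUIT", "STATEMENT") are assumptions chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

public class GruntDispatchSketch {

    /** Hypothetical stand-in for an AST node; the real code would use ANTLR's tree type. */
    public static final class Node {
        public final String type;   // assumed type names: "HELP", "CAT", "QUIT", "STATEMENT"
        public final String text;   // text carried by the node
        public final List<Node> children = new ArrayList<>();

        public Node(String type, String text) {
            this.type = type;
            this.text = text;
        }

        public Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Hypothetical hook standing in for PigServer.registerQuery(String). */
    public interface QuerySink {
        void registerQuery(String query);
    }

    /**
     * Walks the top-level statements in order. Grunt commands are handled
     * immediately; everything else is handed to the query sink.
     * Returns a log of what was done, so the behaviour is easy to check.
     */
    public static List<String> process(Node root, QuerySink sink) {
        List<String> log = new ArrayList<>();
        for (Node stmt : root.children) {
            switch (stmt.type) {
                case "QUIT":
                    log.add("quit");
                    return log;              // stop processing further statements
                case "HELP":
                    log.add("help");
                    break;
                case "CAT":
                    log.add("cat " + stmt.text);
                    break;
                default:
                    sink.registerQuery(stmt.text);  // delegate pig queries
                    log.add("query");
                    break;
            }
        }
        return log;
    }

    public static void main(String[] args) {
        Node root = new Node("ROOT", "")
                .add(new Node("HELP", "help"))
                .add(new Node("STATEMENT", "A = load 'x';"))
                .add(new Node("QUIT", "quit"));
        List<String> log = process(root, q -> System.out.println("registerQuery: " + q));
        System.out.println(log);
    }
}
```

In this sketch a QUIT node ends processing, so any statements after it are never passed to the sink. Whether the actual patch behaves the same way in batch mode is something to confirm against the diff.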
Retain the original input text:
One caveat I encountered was that PigServer.registerQuery expects the raw pig query string as input, whereas after AST generation GruntDriver no longer has the raw input. I did consider modifying the PigServer code so that it could take the tree as input, but that change seemed way too intrusive. Also, since PigServer is a public interface, I do not feel comfortable giving it an API that takes an AST node.
So, instead, I modified the grammar so that it retains the original input string as one of the children of every statement. For example, general_statement in QueryParser.g now has an additional child TEXT[$general_statement.text]. GruntDriver then uses this child's value to pass the original input to PigServer.registerQuery.
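As a rough illustration of how the retained text could be read back from a statement node, here is a small sketch. Only the TEXT child name comes from the grammar change described above. The Node class is a hypothetical stand-in for the ANTLR tree type, and the other node names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class OriginalTextSketch {

    /** Hypothetical stand-in for an AST node; not the ANTLR tree class used in the patch. */
    public static final class Node {
        public final String type;
        public final String text;
        public final List<Node> children = new ArrayList<>();

        public Node(String type, String text) {
            this.type = type;
            this.text = text;
        }

        public Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /**
     * Returns the text of the first direct child of type TEXT, if any.
     * That child is assumed to carry the original input of the statement.
     */
    public static Optional<String> originalText(Node statement) {
        for (Node child : statement.children) {
            if ("TEXT".equals(child.type)) {
                return Optional.of(child.text);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Node stmt = new Node("STATEMENT", "")
                .add(new Node("TEXT", "A = load 'data' as (x:int);"))
                .add(new Node("LOAD", "load"));
        // The recovered string is what would be passed on to registerQuery.
        System.out.println(originalText(stmt).orElse("<no original text>"));
    }
}
```

Returning an Optional makes the missing-child case explicit, which may be useful if some statement rules do not yet carry a TEXT child.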
Add all commands: So far I have added only some commands in GruntDriver, and I am working on adding many more. I expect many of them, such as cd and cp, to be trivial to add; others, such as explain, run and exec, will require more work.
Secondary Prompt: With this new implementation, the secondary prompt in interactive mode does not work. Existing pig gives a different kind of prompt (">>") if the statement provided through the grunt shell is incomplete. With my changes, it instead gives an error saying that the input was invalid. I have a few ideas about how to support the secondary prompt, but I believe it requires a lot of effort and code changes in the grammar. So, before I start on that, I want to understand how critical it is to retain this feature.