I started a prototype here:
https://github.com/julienledem/pig/compare/trunk...compile_physical_plan
The current physical plan is relatively inefficient at evaluating expressions.
In the context of a better execution engine (Tez, Spark, ...), compiling expressions to bytecode would provide a significant speedup.
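To illustrate where the overhead comes from, here is a minimal sketch, not taken from the prototype. It contrasts a tree-walking evaluation of the condition `$0 + $1 > 10` with a hand-written method standing in for what generated bytecode might be equivalent to. All names (`ExprSketch`, `Expr`, `Column`, `Const`, `Add`, `GreaterThan`, `compiled`) are hypothetical and are not Pig's actual physical operator classes. The sketch does not generate any bytecode itself.

```java
import java.util.List;

public class ExprSketch {
    // Hypothetical expression node; NOT one of Pig's physical operators.
    interface Expr {
        Object eval(List<Object> tuple);
    }

    // Reads one field of the input tuple by position.
    record Column(int index) implements Expr {
        public Object eval(List<Object> tuple) {
            return tuple.get(index);
        }
    }

    // Returns a fixed value.
    record Const(Object value) implements Expr {
        public Object eval(List<Object> tuple) {
            return value;
        }
    }

    // Integer addition: two virtual calls, two casts and a boxing step per tuple.
    record Add(Expr left, Expr right) implements Expr {
        public Object eval(List<Object> tuple) {
            int l = (Integer) left.eval(tuple);
            int r = (Integer) right.eval(tuple);
            return l + r;
        }
    }

    // Integer comparison, again going through the generic Object interface.
    record GreaterThan(Expr left, Expr right) implements Expr {
        public Object eval(List<Object> tuple) {
            int l = (Integer) left.eval(tuple);
            int r = (Integer) right.eval(tuple);
            return l > r;
        }
    }

    // Interpreted form of ($0 + $1 > 10): a tree that is walked for every tuple.
    static final Expr INTERPRETED =
        new GreaterThan(new Add(new Column(0), new Column(1)), new Const(10));

    // Hand-written stand-in for a compiled version of the same condition:
    // one straight-line method, no tree walk and no intermediate boxing.
    static boolean compiled(List<Object> tuple) {
        int a = (Integer) tuple.get(0);
        int b = (Integer) tuple.get(1);
        return a + b > 10;
    }

    public static void main(String[] args) {
        List<Object> tuple = List.of(7, 5);
        System.out.println(INTERPRETED.eval(tuple));
        System.out.println(compiled(tuple));
    }
}
```

Both forms return the same result for a given tuple. The difference is the per-tuple cost of the virtual calls and boxing in the interpreted tree, which is the kind of overhead that compiling expressions is meant to remove.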
This is a candidate project for Google Summer of Code 2014. More information about the program can be found at https://cwiki.apache.org/confluence/display/PIG/GSoc2014
| 1. | Bytecode generation for POFilter and POForeach | Patch Available | Rohini Palaniswamy |
| 2. | Bytecode generation for EvalFunc | Open | Rohini Palaniswamy |
| 3. | Bytecode generation for LoadFunc and StoreFunc | Open | Rohini Palaniswamy |
| 4. | Add line numbers to bytecode for debugging | Open | Rohini Palaniswamy |
| 5. | Bytecode generation optimizer in MR and Spark mode | Open | Unassigned |
| 6. | Fix bytecode generation for parallel runs of python | Open | Unassigned |
| 7. | Add support for accumulator, cross and combiner plans | Open | Unassigned |