Description
Pig-on-Tez supports an edge configuration that pairs a sampled output with a vertex manager to estimate parallelism automatically.
This approach is described in the Pig-on-Tez Hadoop Summit presentation:
http://www.slideshare.net/Hadoop_Summit/pig-on-tez-low-latency-etl-with-big-data/19
Moving that plan model into Tez as a native edge type would allow much more efficient scheduling of the downstream edges, and would effectively turn the auto-parallelism implementation into a runtime skew-correcting mechanism within the edge.
The Tez edge has enough information to sample the data, determine the partitioning order, and correct the parallelism.
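To make the idea concrete, here is a minimal sketch in Python of what such an edge could compute from a sample: an estimated task count, range split points, and the partition for a given key. It is not Tez or Pig code. Every name in it (estimate_parallelism, split_points, partition_for) and every parameter (sample_every, bytes_per_task, max_tasks) is hypothetical and chosen only for illustration.

```python
import bisect


def estimate_parallelism(sample_bytes, sample_every, bytes_per_task, max_tasks):
    """Estimate the downstream task count from a sample.

    Assumes one record in every `sample_every` was sampled, so the full
    output is roughly sample_bytes * sample_every bytes (an assumption).
    """
    estimated_total = sample_bytes * sample_every
    # Integer ceiling division, then clamp to the range [1, max_tasks].
    tasks = -(-estimated_total // bytes_per_task)
    return max(1, min(max_tasks, tasks))


def split_points(sampled_keys, num_partitions):
    """Pick num_partitions - 1 range boundaries at evenly spaced sample ranks."""
    ordered = sorted(sampled_keys)
    if not ordered or num_partitions <= 1:
        return []
    return [ordered[(i * len(ordered)) // num_partitions]
            for i in range(1, num_partitions)]


def partition_for(key, points):
    """Return the index of the range partition that `key` falls into."""
    return bisect.bisect_right(points, key)


# Hypothetical numbers: 100 sampled bytes at a 1-in-100 sampling rate.
tasks = estimate_parallelism(sample_bytes=100, sample_every=100,
                             bytes_per_task=2500, max_tasks=10)
points = split_points(list(range(100)), tasks)
```

A heavily repeated key in the sample yields duplicate split points here, which leaves some partitions empty. Real skew correction would need to handle that case, for example by spreading a hot key across several tasks.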