Description
This is an umbrella JIRA for Pig on Tez. More detailed subtasks will be added.
More information can be found on the following wiki page:
https://cwiki.apache.org/confluence/display/PIG/Pig+on+Tez
How to set up your development environment:
- Check out Tez trunk.
- Install protobuf 2.5.0.
- Build Tez against Hadoop 2.2.0. (By default, it builds against Hadoop trunk, which is 3.0.0.)
- Install the Tez jars in your local Maven repository with "mvn install -DskipTests".
- Check out the Pig tez branch.
- Build Pig by running "ant jar-withouthadoop".
- Set up a single-node (or multi-node) Hadoop 2.2 cluster.
- Install Tez following the instructions on the Tez homepage.
- Run Pig with the "-x tez" option.
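The build steps above can be sketched as a shell script. This is a minimal sketch under stated assumptions, not an official build script: `TEZ_SRC` and `PIG_SRC` are hypothetical paths to source trees you have already checked out, the `-Dhadoop.version=2.2.0` Maven property is an assumption about how the Tez build selects its Hadoop version (check the Tez pom.xml), and the helper names `run_in` and `build_all` are made up for illustration. By default the script only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Sketch of the build steps above. By default each command is only printed;
# set DRY_RUN=0 to actually execute them.

# Hypothetical locations of already checked-out source trees (assumptions).
TEZ_SRC="${TEZ_SRC:-$HOME/src/tez}"
PIG_SRC="${PIG_SRC:-$HOME/src/pig-tez}"
DRY_RUN="${DRY_RUN:-1}"

# Print a command together with its working directory,
# and execute it there unless in dry-run mode.
run_in() {
    dir="$1"; shift
    echo "(cd $dir && $*)"
    if [ "$DRY_RUN" = "0" ]; then
        (cd "$dir" && "$@")
    fi
}

build_all() {
    # Build Tez against Hadoop 2.2.0 and install its jars in the local Maven repo.
    # -Dhadoop.version is an assumed property name, not confirmed by the description.
    run_in "$TEZ_SRC" mvn install -DskipTests -Dhadoop.version=2.2.0 || return 1
    # Build Pig without bundling Hadoop.
    run_in "$PIG_SRC" ant jar-withouthadoop || return 1
}

build_all
```

Installing protobuf 2.5.0, setting up the Hadoop 2.2 cluster, and installing Tez on it are left out because they depend on the local environment.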
How to run Tez tests:
- unit tests
ant test-tez
By default, exectype is tez and hadoopversion is 23 in the tez branch, but you can run the unit tests in mr mode as follows:
ant test -Dexectype=mr -Dhadoopversion=20
- e2e tests
ant -Dharness.old.pig=$PIG_HOME -Dharness.hadoop.home=$HADOOP_HOME -Dharness.cluster.conf=$HADOOP_CONF -Dharness.cluster.bin=$HADOOP_BIN test-e2e-tez -Dhadoopversion=23
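Because the e2e command above depends on four environment variables, a small wrapper can check them before ant is invoked. The following is a minimal sketch: the function names `check_env` and `e2e_command` are hypothetical, and the script only prints the ant command rather than running it.

```shell
#!/bin/sh
# Sketch: verify the environment variables used by the e2e command, then print it.

# Report each required variable that is unset or empty; return non-zero if any is.
check_env() {
    missing=0
    for name in PIG_HOME HADOOP_HOME HADOOP_CONF HADOOP_BIN; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Print the e2e invocation from the description with the variables expanded.
e2e_command() {
    echo "ant -Dharness.old.pig=$PIG_HOME -Dharness.hadoop.home=$HADOOP_HOME" \
         "-Dharness.cluster.conf=$HADOOP_CONF -Dharness.cluster.bin=$HADOOP_BIN" \
         "test-e2e-tez -Dhadoopversion=23"
}

if check_env; then
    e2e_command
else
    echo "set the variables listed above before running the e2e tests" >&2
fi
```

To run the tests instead of printing the command, the output of `e2e_command` could be passed to `sh -c` from the Pig source directory.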
Issue Links
- relates to
  - PIG-3890 Global sort is not working (order by) Pig over Tez (Resolved)
  - PIG-3839 Umbrella jira for Pig on Tez Performance Improvements (Open)
  - PIG-4059 Pig on Spark (Closed)
  - PIG-3840 Umbrella jira for Pig on Tez Unit Test porting (Closed)
- requires
  - PIG-3520 Provide backward compatibility for PigRunner and PPNL after PIG-3419 (Closed)
  - PIG-3525 PigStats.get() and ScriptState.get() shouldn't return MR-specific objects (Closed)
  - PIG-3591 Refactor POPackage to separate MR specific code from packaging (Closed)
  - PIG-3508 'explain' now showing logical plan BEFORE the necessary optimization (ImplicitSplitInserter, DuplicateForEachColumnRewrite,etc) (Closed)
  - PIG-3419 Pluggable Execution Engine (Closed)
  - TEZ-591 Provide mode specific diagnostic information to the Tez client (Closed)
- supercedes
  - PIG-1734 Pig needs a more efficient DAG execution (Resolved)