Uploaded image for project: 'Pig'
  1. Pig
  2. PIG-4059 Pig on Spark
  3. PIG-4191

Make skewed join work with Spark

    XMLWordPrintableJSON

    Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: spark-branch
    • Component/s: spark
    • Labels:
      None

      Description

      Related e2e tests: Union_9, Union_10, SkewedJoin_1 - SkewedJoin_10

      Sample script:
      a = load '/user/pig/tests/data/singlefile/studenttab10k' as (name, age, gpa);
      b = load '/user/pig/tests/data/singlefile/studentcolon10k' using PigStorage(':') as (name, age, gpa);
      c = union a, b;
      d = load '/user/pig/tests/data/singlefile/votertab10k' as (name, age, registration, contributions);
      e = join c by name, d by name using 'skewed' PARALLEL 5;
      store e into '/user/pig/out/praveenr-1411380943-nightly.conf/Union_9.out';

      Pig Stack Trace
      ---------------
      ERROR 0: java.lang.IllegalArgumentException: Spork unsupported PhysicalOperator: (Name: e: SkewedJoin[tuple] - scope-6 Operator Key: scope-6)

      org.apache.pig.backend.executionengine.ExecException: ERROR 0: java.lang.IllegalArgumentException: Spork unsupported PhysicalOperator: (Name: e: SkewedJoin[tuple] - scope-6 Operator Key: scope-6)
      at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:285)
      at org.apache.pig.PigServer.launchPlan(PigServer.java:1378)
      at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1363)
      at org.apache.pig.PigServer.execute(PigServer.java:1352)
      at org.apache.pig.PigServer.executeBatch(PigServer.java:403)
      at org.apache.pig.PigServer.executeBatch(PigServer.java:386)
      at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:170)
      at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:233)
      at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:204)
      at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
      at org.apache.pig.Main.run(Main.java:611)
      at org.apache.pig.Main.main(Main.java:164)
      Caused by: java.lang.IllegalArgumentException: Spork unsupported PhysicalOperator: (Name: e: SkewedJoin[tuple] - scope-6 Operator Key: scope-6)
      at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.physicalToRDD(SparkLauncher.java:239)
      at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.physicalToRDD(SparkLauncher.java:232)
      at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:140)
      at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:279)
      ... 11 more
      ================================================================================

        Attachments

        1. PIG-4191-1.patch
          11 kB
          Praveen Rachabattuni

          Activity

            People

            • Assignee:
              praveenr019 Praveen Rachabattuni
              Reporter:
              praveenr019 Praveen Rachabattuni
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: