Uploaded image for project: 'Pig'
  1. Pig
  2. PIG-4059 Pig on Spark
  3. PIG-4191

Make skewed join work with Spark

    XMLWordPrintableJSON

Details

    • Sub-task
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • spark-branch
    • spark
    • None

    Description

      Related e2e tests: Union_9, Union_10, SkewedJoin_1 - SkewedJoin_10

      Sample script:
      a = load '/user/pig/tests/data/singlefile/studenttab10k' as (name, age, gpa);
      b = load '/user/pig/tests/data/singlefile/studentcolon10k' using PigStorage(':') as (name, age, gpa);
      c = union a, b;
      d = load '/user/pig/tests/data/singlefile/votertab10k' as (name, age, registration, contributions);
      e = join c by name, d by name using 'skewed' PARALLEL 5;
      store e into '/user/pig/out/praveenr-1411380943-nightly.conf/Union_9.out';

      Pig Stack Trace
      ---------------
      ERROR 0: java.lang.IllegalArgumentException: Spork unsupported PhysicalOperator: (Name: e: SkewedJoin[tuple] - scope-6 Operator Key: scope-6)

      org.apache.pig.backend.executionengine.ExecException: ERROR 0: java.lang.IllegalArgumentException: Spork unsupported PhysicalOperator: (Name: e: SkewedJoin[tuple] - scope-6 Operator Key: scope-6)
      at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:285)
      at org.apache.pig.PigServer.launchPlan(PigServer.java:1378)
      at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1363)
      at org.apache.pig.PigServer.execute(PigServer.java:1352)
      at org.apache.pig.PigServer.executeBatch(PigServer.java:403)
      at org.apache.pig.PigServer.executeBatch(PigServer.java:386)
      at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:170)
      at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:233)
      at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:204)
      at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
      at org.apache.pig.Main.run(Main.java:611)
      at org.apache.pig.Main.main(Main.java:164)
      Caused by: java.lang.IllegalArgumentException: Spork unsupported PhysicalOperator: (Name: e: SkewedJoin[tuple] - scope-6 Operator Key: scope-6)
      at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.physicalToRDD(SparkLauncher.java:239)
      at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.physicalToRDD(SparkLauncher.java:232)
      at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:140)
      at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:279)
      ... 11 more
      ================================================================================

      Attachments

        1. PIG-4191-1.patch
          11 kB
          Praveen Rachabattuni

        Activity

          People

            praveenr019 Praveen Rachabattuni
            praveenr019 Praveen Rachabattuni
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: