Uploaded image for project: 'Pig'
  1. Pig
  2. PIG-976

Multi-query optimization throws ClassCastException

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 0.4.0
    • 0.6.0
    • impl
    • None
    • Reviewed

    Description

      Multi-query optimization fails to merge 2 branches when 1 is a result of Group By ALL and another is a result of Group By field1 where field 1 is of type long. Here is the script that fails with multi-query on.

      data = LOAD 'test' USING PigStorage('\t') AS (a:long, b:double, c:double);
      A = GROUP data ALL;
      B = FOREACH A GENERATE SUM(data.b) AS sum1, SUM(data.c) AS sum2;
      C = FOREACH B GENERATE (sum1/sum2) AS rate;
      STORE C INTO 'result1';

      D = GROUP data BY a;
      E = FOREACH D GENERATE group AS a, SUM(data.b), SUM(data.c);
      STORE E into 'result2';

      Here is the exception from the logs

      java.lang.ClassCastException: org.apache.pig.data.DefaultTuple cannot be cast to org.apache.pig.data.DataBag
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:399)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:180)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.processInput(POUserFunc.java:145)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:197)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:235)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:254)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:204)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:240)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PODemux.runPipeline(PODemux.java:264)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PODemux.getNext(PODemux.java:254)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.processOnePackageOutput(PigCombiner.java:196)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:174)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:63)
      at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.combineAndSpill(MapTask.java:906)
      at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:786)
      at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:698)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:228)
      at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2206)

      Attachments

        1. PIG-976.patch
          9 kB
          Richard Ding
        2. PIG-976.patch
          14 kB
          Richard Ding
        3. PIG-976.patch
          24 kB
          Richard Ding
        4. PIG-976.patch
          25 kB
          Richard Ding
        5. PIG-976.patch
          25 kB
          Richard Ding
        6. PIG-976_1.patch
          0.9 kB
          Richard Ding

        Activity

          People

            rding Richard Ding
            ankur Ankur Bansal
            Votes:
            1 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: