  1. Pig
  2. PIG-4322

Enabling SchemaTuple Feature Results in failed jobs

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 0.12.0
    • Fix Version/s: None
    • Component/s: grunt, tools
    • Labels:
      None
    • Environment:

      Amazon AWS Elastic Mapreduce AMI 3.2.1

      • Amazon 2.4.0
      • Pig 0.12.0
      • 1x m1.large Master, 40x m1.large Core, 20x m1.large Task

      Description

      This is the stack trace that causes my jobs to fail:

      Error: java.io.FileNotFoundException: SchemaTuple_21$1.class (No such file or directory)
          at java.io.FileInputStream.open(Native Method)
          at java.io.FileInputStream.<init>(FileInputStream.java:146)
          at org.apache.pig.data.SchemaTupleBackend.copyAllFromDistributedCache(SchemaTupleBackend.java:187)
          at org.apache.pig.data.SchemaTupleBackend.copyAndResolve(SchemaTupleBackend.java:160)
          at org.apache.pig.data.SchemaTupleBackend.initialize(SchemaTupleBackend.java:278)
          at org.apache.pig.data.SchemaTupleBackend.initialize(SchemaTupleBackend.java:268)
          at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.setup(PigGenericMapBase.java:174)
          at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
          at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:775)
          at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
          at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
          at java.security.AccessController.doPrivileged(Native Method)
          at javax.security.auth.Subject.doAs(Subject.java:415)
          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
          at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
      

      Here is a grep of the pig logs that refer to SchemaTuple_21:

      2014-11-11 11:50:00,275 [main] INFO  org.apache.pig.data.SchemaTupleClassGenerator - Compiling class SchemaTuple_21 for Schema: {{(long,long,int,datetime,long,chararray)}}, and appendability: false
      2014-11-11 11:50:00,514 [main] INFO  org.apache.pig.data.SchemaTupleClassGenerator - Successfully compiled class: SchemaTuple_21
      2014-11-11 11:50:02,470 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - File successfully added to the distributed cache: SchemaTuple_21.class
      2014-11-11 11:50:02,551 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - File successfully added to the distributed cache: SchemaTuple_21$1.class
      2014-11-11 11:50:07,378 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize [SchemaTuple_16.class,SchemaTuple_12.class,SchemaTuple_2.class,SchemaTuple_51.class,SchemaTuple_20.class,SchemaTuple_21.class,SchemaTuple_26.class,SchemaTuple_39.class,SchemaTuple_21$1.class,SchemaTuple_19.class,SchemaTuple_53.class,SchemaTuple_10.class,SchemaTuple_27.class,SchemaTuple_25$1.class,SchemaTuple_5.class,SchemaTuple_35$1.class,SchemaTuple_45.class,SchemaTuple_32$1.class,SchemaTuple_50.class,SchemaTuple_33.class,SchemaTuple_64.class,SchemaTuple_54$1.class,SchemaTuple_57.class,SchemaTuple_57$1.class,SchemaTuple_47$1.class,SchemaTuple_35.class,SchemaTuple_7$1.class,SchemaTuple_56.class,SchemaTuple_29.class,SchemaTuple_52$1.class,SchemaTuple_40$1.class,SchemaTuple_55$1.class,SchemaTuple_48$1.class,SchemaTuple_61$1.class,SchemaTuple_0.class,SchemaTuple_46$1.class,SchemaTuple_2$1.class,SchemaTuple_3.class,SchemaTuple_15$1.class,SchemaTuple_28$1.class,SchemaTuple_49$1.class,SchemaTuple_16$1.class,SchemaTuple_60.class,SchemaTuple_7.class,SchemaTuple_9.class,SchemaTuple_44$1.class,SchemaTuple_11$1.class,SchemaTuple_52.class,SchemaTuple_1$1.class,SchemaTuple_13$1.class,SchemaTuple_19$1.class,SchemaTuple_9$1.class,SchemaTuple_56$1.class,SchemaTuple_17$1.class,SchemaTuple_72$1.class,SchemaTuple_25.class,SchemaTuple_55.class,SchemaTuple_30.class,SchemaTuple_69.class,SchemaTuple_62$1.class,SchemaTuple_71$1.class,SchemaTuple_41.class,SchemaTuple_68$1.class,SchemaTuple_72.class,SchemaTuple_49.class,SchemaTuple_26$1.class,SchemaTuple_69$1.class,SchemaTuple_3$1.class,SchemaTuple_65$1.class,SchemaTuple_61.class,SchemaTuple_30$1.class,SchemaTuple_59$1.class,SchemaTuple_66$1.class,SchemaTuple_20$1.class,SchemaTuple_53$1.class,SchemaTuple_24.class,SchemaTuple_70.class,SchemaTuple_66.class,SchemaTuple_60$1.class,SchemaTuple_42.class,SchemaTuple_59.class,SchemaTuple_40.class,SchemaTuple_47.class,SchemaTuple_63.class,SchemaTuple_67.class,SchemaTuple_36$1.class,SchemaTuple_50$1.class,SchemaTuple_71.class,SchemaTuple_38$1.class,SchemaTuple_58$1.class,SchemaTuple_51$1.class,SchemaTuple_41$1.class,SchemaTuple_64$1.class,SchemaTuple_58.class,SchemaTuple_43.class,SchemaTuple_44.class,SchemaTuple_28.class,SchemaTuple_13.class,SchemaTuple_63$1.class,SchemaTuple_29$1.class,SchemaTuple_37.class,SchemaTuple_37$1.class,SchemaTuple_6.class,SchemaTuple_31.class,SchemaTuple_4$1.class,SchemaTuple_68.class,SchemaTuple_14.class,SchemaTuple_32.class,SchemaTuple_14$1.class,SchemaTuple_62.class,SchemaTuple_18$1.class,SchemaTuple_65.class,SchemaTuple_38.class,SchemaTuple_42$1.class,SchemaTuple_33$1.class,SchemaTuple_4.class,SchemaTuple_34$1.class,SchemaTuple_23$1.class,SchemaTuple_34.class,SchemaTuple_6$1.class,SchemaTuple_1.class,SchemaTuple_39$1.class,SchemaTuple_23.class,SchemaTuple_12$1.class,SchemaTuple_17.class,SchemaTuple_8$1.class,SchemaTuple_10$1.class,SchemaTuple_31$1.class,SchemaTuple_67$1.class,SchemaTuple_11.class,SchemaTuple_22.class,SchemaTuple_45$1.class,SchemaTuple_15.class,SchemaTuple_0$1.class,SchemaTuple_24$1.class,SchemaTuple_36.class,SchemaTuple_43$1.class,SchemaTuple_18.class,SchemaTuple_70$1.class,SchemaTuple_46.class,SchemaTuple_54.class,SchemaTuple_22$1.class,SchemaTuple_5$1.class,SchemaTuple_27$1.class,SchemaTuple_48.class,SchemaTuple_8.class]
      

      I can't find anywhere in my script (~1k lines) where the schema referred to above, (long,long,int,datetime,long,chararray), would arise, including within a nested foreach, after a join, or as part of a group by, although I do use all of those data types. Apart from the stack trace thrown by the map tasks, which causes everything to fail, the Pig logging looks good, so I don't know what else I can provide that would help.
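
A quick cross-check that may help narrow this down is to confirm that the class file named in the FileNotFoundException also appears in the front-end `pig.schematuple.classes` list. The following is a minimal Python sketch, not part of Pig: the helper names are hypothetical, the regular expressions assume the log and error formats quoted above, and the sample inputs are shortened copies of those excerpts.

```python
import re


def parse_registered_classes(log_text):
    """Return the class file names listed in a 'pig.schematuple.classes' log entry."""
    match = re.search(
        r"Setting key \[pig\.schematuple\.classes\] "
        r"with classes to deserialize \[([^\]]*)\]",
        log_text,
    )
    if match is None:
        return set()
    # Entries are comma-separated; strip whitespace in case the entry was wrapped.
    return {name.strip() for name in match.group(1).split(",") if name.strip()}


def missing_class_from_error(error_text):
    """Return the file name from a FileNotFoundException message, or None."""
    match = re.search(r"FileNotFoundException: (\S+\.class)", error_text)
    return match.group(1) if match else None


# Shortened, illustrative inputs modeled on the excerpts in this report.
log_line = (
    "2014-11-11 11:50:07,378 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - "
    "Setting key [pig.schematuple.classes] with classes to deserialize "
    "[SchemaTuple_21.class,SchemaTuple_21$1.class,SchemaTuple_19.class]"
)
error_line = (
    "Error: java.io.FileNotFoundException: SchemaTuple_21$1.class "
    "(No such file or directory)"
)

registered = parse_registered_classes(log_line)
missing = missing_class_from_error(error_line)
print(missing, "registered on the front end:", missing in registered)
```

If the class is registered on the front end, as it is in the excerpts here, that would suggest the failure happens when the task copies files from the distributed cache rather than during class generation, although this check alone does not confirm the cause.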

            People

            • Assignee: Unassigned
            • Reporter: Randy Wallace (randywallace)