Pig / PIG-1830

Type mismatch error in key from map, when doing GROUP on PigStorageSchema() variable

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8.1
    • Component/s: None
    • Labels:
      None

      Description

      Pig fails when we try to GROUP data loaded via PigStorageSchema.

      Events = LOAD 'input/PigStorageSchema' USING org.apache.pig.piggybank.storage.PigStorageSchema();
      
      Sessions = GROUP Events BY name;
      
      DUMP Sessions;
      

      Schema file input/PigStorageSchema/.pig_schema:

      {"fields":[{"name":"name","type":55,"schema":null,"description":"autogenerated from Pig Field Schema"},{"name":"val","type":10,"schema":null,"description":"autogenerated from Pig Field Schema"}],"version":0,"sortKeys":[],"sortKeyOrders":[]}
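
To make the numeric "type" codes in the schema file easier to read, here is a minimal sketch that decodes them. It assumes the standard Pig DataType codes: 55 (chararray) and 50 (bytearray) are named in this report, and 10 is taken to be Pig's int code. The helper name describe_schema is hypothetical and not part of Pig.

```python
import json

# Pig DataType codes relevant to this report. 55 and 50 are named in the
# issue text; 10 (int) is an assumption based on Pig's DataType constants.
PIG_TYPE_NAMES = {10: "int", 50: "bytearray", 55: "chararray"}

# Contents of input/PigStorageSchema/.pig_schema as given in the report.
SCHEMA_TEXT = (
    '{"fields":[{"name":"name","type":55,"schema":null,'
    '"description":"autogenerated from Pig Field Schema"},'
    '{"name":"val","type":10,"schema":null,'
    '"description":"autogenerated from Pig Field Schema"}],'
    '"version":0,"sortKeys":[],"sortKeyOrders":[]}'
)


def describe_schema(schema_text):
    """Return (field name, type name) pairs for a .pig_schema JSON string."""
    schema = json.loads(schema_text)
    return [
        (field["name"], PIG_TYPE_NAMES.get(field["type"], "unknown(%d)" % field["type"]))
        for field in schema["fields"]
    ]


if __name__ == "__main__":
    for name, type_name in describe_schema(SCHEMA_TEXT):
        print(name, type_name)
```

Under these assumptions, the GROUP key "name" is declared as chararray, which matches the NullableText key type the map task expected.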
      

      Header file input/PigStorageSchema/.pig_header:

      name    val
      

      Sample input file input/PigStorageSchema/pss.in:

      peter   1
      samir   3
      michael 4
      peter   2
      peter   4
      samir   1
      

      Running the above Pig script produces the following error in the map task:

      2010-12-15 08:07:58,367 WARN org.apache.hadoop.mapred.Child: Error running child
      java.io.IOException: Type mismatch in key from map: expected org.apache.pig.impl.io.NullableText, recieved org.apache.pig.impl.io.NullableBytesWritable
              at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:898)
              at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:600)
              at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
              at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:116)
              at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:238)
              at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231)
              at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
              at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
              at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:674)
              at org.apache.hadoop.mapred.MapTask.run(MapTask.java:335)
              at org.apache.hadoop.mapred.Child$4.run(Child.java:242)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:396)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1062)
              at org.apache.hadoop.mapred.Child.main(Child.java:236)
      

      Workaround: changing the "type" of the "name" field in .pig_schema from
      55 (chararray) to 50 (bytearray) makes the GROUP BY succeed.
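
The workaround described above can be applied with a small script instead of editing the schema file by hand. This is only a sketch of that manual edit, not the fix that went into 0.8.1. The function names retype_field and patch_schema_file are hypothetical, and the type codes are the ones quoted in this report.

```python
import json

CHARARRAY = 55  # type code quoted in the report
BYTEARRAY = 50  # type code quoted in the report


def retype_field(schema_text, field_name, new_type=BYTEARRAY):
    """Return the schema JSON with the given field's type code replaced."""
    schema = json.loads(schema_text)
    for field in schema["fields"]:
        if field["name"] == field_name:
            field["type"] = new_type
    return json.dumps(schema)


def patch_schema_file(path, field_name, new_type=BYTEARRAY):
    """Rewrite a .pig_schema file in place (hypothetical convenience helper)."""
    with open(path) as fh:
        text = fh.read()
    with open(path, "w") as fh:
        fh.write(retype_field(text, field_name, new_type))


# Trimmed-down version of the schema from the report, for illustration.
SAMPLE = '{"fields":[{"name":"name","type":55},{"name":"val","type":10}]}'

if __name__ == "__main__":
    print(retype_field(SAMPLE, "name"))
```

For the layout in this report, the call would be patch_schema_file("input/PigStorageSchema/.pig_schema", "name"). Note that this declares the column as bytearray, so later chararray operations on it may need an explicit cast in the script.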

        Attachments

        1. 1830-testpatch.tgz
          230 kB
          Alan Gates
        2. PIG_1830.patch
          20 kB
          Dmitriy V. Ryaboy

            People

            • Assignee: Dmitriy V. Ryaboy (dvryaboy)
            • Reporter: Mitesh Singh Jat (miteshsjat)