PIG-3661

Piggybank AvroStorage fails if used in more than one load or store statement

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.11.1
    • Fix Version/s: 0.12.1, 0.13.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      To reproduce, store two relations with different schemas using AvroStorage in the same script:
      A = load '/tmp/data' as (a1:int, a2:int, a3:int);
      B = load '/tmp/data1' as (b1:chararray, b2:chararray, b3:chararray);
      store A into '/tmp/out/a' using org.apache.pig.piggybank.storage.avro.AvroStorage();
      store B into '/tmp/out2/b' using org.apache.pig.piggybank.storage.avro.AvroStorage();

      The script either fails in the map job when the two schemas are incompatible, or B is stored with a merged schema of A and B, which produces incorrect results.

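      The cause is that the output schema is stored in and read from a UDFContext property keyed only by class, without a context signature, so every AvroStorage instance in the script shares the same property: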

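      // Properties are looked up by class only, so the second store statement
      // reads the schema that the first one wrote.
      UDFContext context = UDFContext.getUDFContext();
      Properties property = context.getUDFProperties(ResourceSchema.class);
      String prevSchemaStr = property.getProperty(AVRO_OUTPUT_SCHEMA_PROPERTY);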
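
      The collision described above can be illustrated with a small self-contained model. This is only a sketch: PropertyStoreModel is a hypothetical stand-in for a per-UDF property store, not Pig's UDFContext, and the signature strings and the property key value are made up for illustration. In Pig itself, the usual way to get per-statement isolation is the signature passed to StoreFunc.setStoreFuncUDFContextSignature together with the UDFContext.getUDFProperties(Class, String[]) overload; whether the attached patches do exactly that is not stated in this report.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-in for a per-UDF property store; NOT Pig's UDFContext.
final class PropertyStoreModel {
    private final Map<String, Properties> store = new HashMap<>();

    // Keyed by class only: every instance of the class shares one Properties object.
    Properties byClass(Class<?> c) {
        return store.computeIfAbsent(c.getName(), k -> new Properties());
    }

    // Keyed by class plus a per-statement signature: each statement gets its own Properties.
    Properties bySignature(Class<?> c, String signature) {
        return store.computeIfAbsent(c.getName() + "#" + signature, k -> new Properties());
    }
}

final class SchemaCollisionDemo {
    // Assumed key value, chosen for this illustration only.
    static final String AVRO_OUTPUT_SCHEMA_PROPERTY = "avro_output_schema";
    static final String SCHEMA_A = "a1:int,a2:int,a3:int";

    // Class-only key: the store for B finds the schema written by the store for A.
    static String schemaSeenByBWithSharedKey() {
        PropertyStoreModel ctx = new PropertyStoreModel();
        ctx.byClass(SchemaCollisionDemo.class)
           .setProperty(AVRO_OUTPUT_SCHEMA_PROPERTY, SCHEMA_A);
        return ctx.byClass(SchemaCollisionDemo.class)
                  .getProperty(AVRO_OUTPUT_SCHEMA_PROPERTY);
    }

    // Signature key: the store for B has its own properties and sees no previous schema.
    static String schemaSeenByBWithSignature() {
        PropertyStoreModel ctx = new PropertyStoreModel();
        ctx.bySignature(SchemaCollisionDemo.class, "storeA")
           .setProperty(AVRO_OUTPUT_SCHEMA_PROPERTY, SCHEMA_A);
        return ctx.bySignature(SchemaCollisionDemo.class, "storeB")
                  .getProperty(AVRO_OUTPUT_SCHEMA_PROPERTY);
    }
}
```

      With the class-only key, the second lookup returns the schema of A, which matches the merged or incompatible schema seen in the report. With the signature key, the second lookup returns null, so each store statement would start from its own schema.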

      Attachments

        1. PIG-3661-2.patch
          28 kB
          Rohini Palaniswamy
        2. PIG-3661-1.patch
          28 kB
          Rohini Palaniswamy

            People

              Assignee: Rohini Palaniswamy
              Reporter: Rohini Palaniswamy