PIG-2195: AvroStorage fails to STORE when LOADing via PigStorage

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.10.0
    • Component/s: None
    • Labels: None
    • Release Note: AvroStorage support for using a schema from an Avro schema file.

      Description

      Reading data via PigStorage and writing it via AvroStorage fails with an exception like this:

      java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be cast to org.apache.avro.generic.IndexedRecord

      The Pig script in the following section of the documentation is an example that fails this way:

      http://linkedin.jira.com/wiki/display/HTOOLS/AvroStorage+-+Pig+support+for+Avro+data#AvroStorage-PigsupportforAvrodata-A.Howtostoredataindifferentways.

      A workaround currently exists to produce Avro output from TSV input:

      avro = LOAD 'inputPath/' AS (foo);
      STORE avro INTO 'outputPath/' USING org.apache.pig.piggybank.storage.avro.AvroStorage(
        '{"data":"data_file.avro",
          "same":"data_file.avro", "field0":"def:bar"}');
      

      This is redundant, though: data and same appear to indicate the same thing. The approach also requires an existing Avro data file. This patch makes the following alternative constructor syntaxes work as well.

      1. Read schema from an existing data file:
          '{"data":"data_file.avro", "field0":"def:bar"}');
        
      2. Read schema from an existing schema file:
          '{"schema_file":"data_file.avsc", "field0":"def:bar"}');
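
      For context, the two argument strings above could be used in a complete script roughly as follows. This is only a sketch: the REGISTER path, the output directory names, and the relation name records are hypothetical, and further jars that AvroStorage depends on may need to be registered depending on the installation. The argument strings themselves are taken verbatim from the two options listed above.

      ```pig
      -- Hypothetical jar location; adjust for the local installation.
      REGISTER /path/to/piggybank.jar;

      -- Load tab-separated text with PigStorage, as in the workaround above.
      records = LOAD 'inputPath/' USING PigStorage('\t') AS (foo);

      -- Option 1: read the schema from an existing Avro data file.
      STORE records INTO 'outputPath1/' USING org.apache.pig.piggybank.storage.avro.AvroStorage(
        '{"data":"data_file.avro", "field0":"def:bar"}');

      -- Option 2: read the schema from an existing Avro schema file.
      STORE records INTO 'outputPath2/' USING org.apache.pig.piggybank.storage.avro.AvroStorage(
        '{"schema_file":"data_file.avsc", "field0":"def:bar"}');
      ```

      Option 2 is the only form that does not depend on an Avro data file already being present, which is the limitation of the workaround noted above.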
        

        Attachments

        1. expected_testRecordSplitFromText2.avro
          0.3 kB
          Bill Graham
        2. expected_testRecordSplitFromText1.avro
          0.2 kB
          Bill Graham
        3. PIG-2195_1.patch
          20 kB
          Bill Graham

            People

            • Assignee: Bill Graham (billgraham)
            • Reporter: Bill Graham (billgraham)