  1. Pig
  2. PIG-4423

AvroStorage() does not validate schema at storing.

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.12.0
    • Fix Version/s: None
    • Component/s: piggybank
    • Labels:
    • Environment: EMR AMI 3.3.2

      Description

      Pig does not validate the Avro schema when storing with AvroStorage(). I tried to enforce the schema both by passing the schema_file parameter and by providing the schema explicitly. In both cases the output Avro file received the schema of the Pig data set instead of being validated against the supplied Avro schema. When I used the same Avro schema in Hive, the data was validated as expected (an error was thrown if the data's schema differed from the Avro schema).

      STORE data INTO '$TARGET'
      USING AvroStorage(
      'schema', '{
        "type": "record",
        "name": "test",
        "fields": [
          {"name": "partner_name", "type": "string"},
          {"name": "partner_id", "type": "int"},
          {"name": "name", "type": "string"},
          {"name": "id", "type": "int"}
        ]
      }');
      

      or

      STORE data INTO '$TARGET' 
      USING AvroStorage('schema_file','$AVRO_SCHEMA');
      

      I have registered the following jars (downloaded from Maven repo)

      REGISTER piggybank-0.12.0.jar;
      REGISTER avro-1.7.7.jar;
      REGISTER avro-mapred-1.7.7.jar;
      REGISTER json-simple-1.1.1.jar;
      

            People

            • Assignee: Unassigned
            • Reporter: Zoltan Talas (ztalas)
            • Votes: 0
            • Watchers: 2
