Uploaded image for project: 'Pig'
  1. Pig
  2. PIG-4423

AvroStorage() does not validate schema at storing.

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 0.12.0
    • None
    • piggybank
    • EMR AMI 3.3.2

    Description

      Pig does not validate Avro schema when using AvroStorage(). I tried to validate schema both by adding schema_file input parameter and by providing schema explicitly as well. Both cases Avro file received the schema of Pig data set instead of validating schema from Avro file. When i have used the same Avro schema for Hive, it validated data successfully (if data has different schema compared to Avro then threw an error)

      store data into '$TARGET'
      USING AvroStorage(
      'schema', '{
      "type": "record",
      "name": "test",
      "fields": [
      {"name": "partner_name", "type": "string"},
      {"name": "partner_id", "type": "int"},
      {"name": "name", "type": "string"} ,
      {"name": "id", "type": "int"}
      ]
      }');
      

      or

      STORE data INTO '$TARGET' 
      USING AvroStorage('schema_file','$AVRO_SCHEMA');
      

      I have registered the following jars (downloaded from Maven repo)

      REGISTER piggybank-0.12.0.jar;
      REGISTER avro-1.7.7.jar;
      REGISTER avro-mapred-1.7.7.jar;
      REGISTER json-simple-1.1.1.jar;
      

      Attachments

        Activity

          People

            Unassigned Unassigned
            ztalas Zoltan Talas
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated: