Pig / PIG-2909

Add a new option for ignoring corrupted files to AvroStorage load func

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.10.0
    • Fix Version/s: 0.11
    • Component/s: piggybank
    • Labels:
      None
    • Patch Info:
      Patch Available

      Description

      Currently, AvroStorage load fails with AvroRuntimeException when encountering corrupted input files. For example,

      ERROR 2997: Unable to recreate exception from backed error: java.io.IOException: org.apache.avro.AvroRuntimeException: java.io.IOException: Invalid sync!
      	at org.apache.pig.piggybank.storage.avro.AvroStorage.getNext(AvroStorage.java:283)
      

      But it is not always desirable to fail the Pig job for bad files. It is sometimes more useful to skip them and continue.
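The requested behavior can be illustrated with a minimal, self-contained sketch. It is not the code from the attached patches: the class name, the `ignoreBadFiles` flag, and the string-based stand-in readers are all hypothetical, and no Avro or Pig classes are used. The sketch assumes that corruption surfaces as a `RuntimeException` while iterating over a file, as in the stack trace above, and that skipping means abandoning the rest of that file while keeping records already read.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: none of these names come from AvroStorage or the patch.
class SkipCorruptedSketch {

    // Stand-in for a per-file record reader. It throws a RuntimeException
    // when it reaches index corruptAt (use -1 for a clean file).
    static Iterator<String> fileReader(final List<String> records, final int corruptAt) {
        return new Iterator<String>() {
            private int pos = 0;

            public boolean hasNext() {
                return pos < records.size();
            }

            public String next() {
                if (pos == corruptAt) {
                    throw new RuntimeException("java.io.IOException: Invalid sync!");
                }
                return records.get(pos++);
            }

            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }

    // Reads every file in order. If ignoreBadFiles is false, the first
    // corruption is rethrown as an IOException and the whole read fails.
    // If it is true, the rest of the corrupted file is skipped and reading
    // continues with the next file.
    static List<String> readAll(List<Iterator<String>> files, boolean ignoreBadFiles)
            throws IOException {
        List<String> out = new ArrayList<String>();
        for (Iterator<String> reader : files) {
            try {
                while (reader.hasNext()) {
                    out.add(reader.next());
                }
            } catch (RuntimeException e) {
                if (!ignoreBadFiles) {
                    throw new IOException(e);
                }
                System.err.println("Skipping rest of corrupted file: " + e.getMessage());
            }
        }
        return out;
    }

    // Three made-up files; the second one is corrupted at its second record.
    static List<Iterator<String>> sampleFiles() {
        List<Iterator<String>> files = new ArrayList<Iterator<String>>();
        files.add(fileReader(Arrays.asList("a1", "a2"), -1));
        files.add(fileReader(Arrays.asList("b1", "b2", "b3"), 1));
        files.add(fileReader(Arrays.asList("c1"), -1));
        return files;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readAll(sampleFiles(), true));
    }
}
```

With the flag set, `main` prints `[a1, a2, b1, c1]`: the record read before the corruption point is kept and the remainder of the second file is dropped. With the flag unset, the same input fails with an `IOException`, which corresponds to the current behavior described above. Whether partially read records should be kept or discarded is a design choice that this sketch does not settle for the real load function.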

      1. PIG-2909-avro_test_files.tar.gz
        0.4 kB
        Cheolsoo Park
      2. PIG-2909.patch
        10 kB
        Cheolsoo Park
      3. PIG-2909-2.patch
        10 kB
        Cheolsoo Park

          Activity

          Field | Original Value | New Value

          Bill Graham made changes -
          Status | Resolved [ 5 ] | Closed [ 6 ]
          Cheolsoo Park made changes -
          Link | | This issue is related to PIG-2614 [ PIG-2614 ]
          Alan Gates made changes -
          Status | Patch Available [ 10002 ] | Resolved [ 5 ]
          Fix Version/s | | 0.11 [ 12318878 ]
          Resolution | | Fixed [ 1 ]
          Cheolsoo Park made changes -
          Attachment | | PIG-2909-2.patch [ 12544946 ]
          Cheolsoo Park made changes -
          Issue Type | Bug [ 1 ] | Improvement [ 4 ]
          Cheolsoo Park made changes -
          Status | Open [ 1 ] | Patch Available [ 10002 ]
          Cheolsoo Park made changes -
          Attachment | | PIG-2909-avro_test_files.tar.gz [ 12543986 ]
          Attachment | | PIG-2909.patch [ 12543987 ]
          Cheolsoo Park created issue -

            People

            • Assignee:
              Cheolsoo Park
              Reporter:
              Cheolsoo Park
            • Votes:
              0
              Watchers:
              4
