Pig / PIG-2614

AvroStorage crashes on LOAD when it hits a single bad record

Details

    Description

      AvroStorage fails the whole load when it encounters a single bad record, such as one with missing fields. This is a serious problem on 'big data,' where bad records are inevitable. See the discussion at http://www.quora.com/Big-Data/In-Big-Data-ETL-how-many-records-are-an-acceptable-loss for more background.
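
To make the requested behavior concrete, here is a minimal sketch of a loader that skips bad records up to a configurable budget instead of dying on the first one. It is not AvroStorage code and not the approach taken in the attached patches; `load_tolerantly`, `parse_record`, `max_bad`, and `TooManyBadRecords` are hypothetical names invented for illustration, and plain dicts stand in for Avro records.

```python
from typing import Callable, Iterable, Iterator, Tuple


class TooManyBadRecords(RuntimeError):
    """Raised when more records fail to parse than the caller allows."""


def parse_record(raw: dict) -> Tuple[str, int]:
    # Hypothetical parser: requires both "name" and "age".
    # A missing field raises KeyError; a non-numeric age raises ValueError.
    return (raw["name"], int(raw["age"]))


def load_tolerantly(
    raw_records: Iterable[dict],
    parse: Callable[[dict], Tuple[str, int]],
    max_bad: int = 0,
) -> Iterator[Tuple[str, int]]:
    """Yield parsed records, skipping at most max_bad unparseable ones."""
    bad = 0
    for raw in raw_records:
        try:
            parsed = parse(raw)
        except (KeyError, ValueError) as err:
            bad += 1
            if bad > max_bad:
                # Budget exhausted: fail loudly rather than drop data silently.
                raise TooManyBadRecords(
                    f"{bad} bad records exceeds the limit of {max_bad}"
                ) from err
            continue  # Skip this record and keep loading.
        yield parsed
```

With `max_bad=0` the sketch fails on the first bad record, which mirrors the behavior reported here; a positive `max_bad` lets the load continue while still bounding how much data can be lost.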

      Attachments

        1. test_avro_files.tar.gz
          0.6 kB
          Cheolsoo Park
        2. PIG-2614_2.patch
          19 kB
          Cheolsoo Park
        3. PIG-2614_1.patch
          29 kB
          Jonathan Coveney
        4. PIG-2614_0.patch
          29 kB
          Jonathan Coveney

          People

            jcoveney Jonathan Coveney
            russell.jurney Russell Jurney
            Votes: 0
            Watchers: 5