Pig / PIG-3141

Giving CSVExcelStorage an option to handle header rows

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.11
    • Fix Version/s: 0.12.0
    • Component/s: piggybank
    • Labels: None
    • Patch Info: Patch Available

    Description

      Adds an argument to CSVExcelStorage that skips the header row when loading. Header skipping works correctly in both split scenarios: when multiple small files, each with its own header, are combined into one split, and when a large file with a single header is divided across multiple splits.
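
The two split scenarios above can be illustrated with a small sketch. This is not the code from the patch; it is a simplified Python model that assumes the header is always the line at offset 0 of a file and that every record fits on one line (no multiline quoted fields). The names `Chunk`, `lines_with_offsets`, and `read_split` are hypothetical and exist only for this illustration.

```python
from typing import Iterator, List, Tuple

# Hypothetical model: a chunk is (file_text, start_offset, end_offset).
# A split is a list of chunks: one chunk for a slice of a large file,
# several chunks when small files are combined into a single split.
Chunk = Tuple[str, int, int]


def lines_with_offsets(text: str) -> Iterator[Tuple[int, str]]:
    """Yield (offset, line) for every line in text, without the line ending."""
    offset = 0
    for raw in text.splitlines(keepends=True):
        yield offset, raw.rstrip("\r\n")
        offset += len(raw)


def read_split(chunks: List[Chunk], skip_header: bool) -> List[str]:
    """Return the records one split would produce under this simplified model."""
    records = []
    for text, start, end in chunks:
        for offset, line in lines_with_offsets(text):
            # Simplification: a line belongs to the chunk that contains
            # its first character.
            if not (start <= offset < end):
                continue
            # Only the line at offset 0 of a file can be that file's header,
            # so chunks that start mid-file never drop a data row.
            if skip_header and offset == 0:
                continue
            records.append(line)
    return records
```

Under this model, a chunk that begins in the middle of a large file never sees offset 0 and therefore keeps all of its rows, while each small file in a combined split contributes its own offset 0 line, which is dropped once per file.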

      Also fixes a few bugs in CSVExcelStorage, including PIG-2470 and a bug in which quoted fields at the end of a line were not escaped properly.

      Attachments

        1. csv.patch
          88 kB
          Jonathan Packer
        2. csv_updated.patch
          89 kB
          Jonathan Packer
        3. PIG-3141_update_3.diff
          75 kB
          Jonathan Packer
        4. PIG-3141_update_4.diff
          76 kB
          Jonathan Packer

            People

              Assignee: Jonathan Packer (jpacker)
              Reporter: Jonathan Packer (jpacker)
              Votes: 0
              Watchers: 6