PIG-3141: Giving CSVExcelStorage an option to handle header rows

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.11
    • Fix Version/s: 0.12.0
    • Component/s: piggybank
    • Labels:
      None
    • Patch Info:
      Patch Available

      Description

      Adds an argument to CSVExcelStorage to skip the header row when loading. Header skipping works correctly in both split layouts: when multiple small files, each with its own header, are combined into one split, and when a large file with a single header is divided across multiple splits.

      Also fixes a few bugs in CSVExcelStorage, including PIG-2470 and a bug in which quoted fields at the end of a line were not escaped properly.
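
      The two split layouts mentioned above can be illustrated with a small model. This is only a sketch under stated assumptions, not the code in the attached patches: `Chunk`, `start_offset`, and `read_split` are hypothetical names, and the rule "drop the first line of any chunk that begins at offset 0 of its file" is one plausible way to get the described behavior.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class Chunk:
    """Hypothetical model: a contiguous piece of one file assigned to a split."""
    start_offset: int  # offset of this chunk within its source file
    lines: List[str]   # the lines that fall inside this chunk


def read_split(chunks: List[Chunk], skip_header: bool = True) -> Iterator[str]:
    """Yield the data lines of one split, dropping per-file header rows.

    Assumption: a header can only be the first line of a chunk that starts
    at offset 0, i.e. at the very beginning of a file.
    """
    for chunk in chunks:
        for index, line in enumerate(chunk.lines):
            if skip_header and chunk.start_offset == 0 and index == 0:
                continue  # first line of a file: treat it as the header
            yield line


# Case 1: two small files, each with a header, combined into one split.
combined = [
    Chunk(0, ["id,name", "1,a"]),
    Chunk(0, ["id,name", "2,b"]),
]

# Case 2: one large file with a single header, cut into two splits.
first_split = [Chunk(0, ["id,name", "1,a"])]
second_split = [Chunk(64, ["2,b", "3,c"])]

print(list(read_split(combined)))
print(list(read_split(first_split)))
print(list(read_split(second_split)))
```

      In this model the second split of the large file keeps its first line, because that chunk does not start at the beginning of the file, while every file folded into a combined split loses exactly one header line.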

        Attachments

        1. csv_updated.patch
          89 kB
          Jonathan Packer
        2. csv.patch
          88 kB
          Jonathan Packer
        3. PIG-3141_update_3.diff
          75 kB
          Jonathan Packer
        4. PIG-3141_update_4.diff
          76 kB
          Jonathan Packer

              People

              • Assignee:
                jpacker Jonathan Packer
              • Reporter:
                jpacker Jonathan Packer
              • Votes:
                0
              • Watchers:
                6