  1. Pig
  2. PIG-4397

CSVExcelStorage incorrect output if last field value is null

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.15.0
    • Component/s: None
    • Labels: None
    • Environment: Running the Pig version bundled with HDP 2.1.2: 0.12.1.2.1.2.0-402

    • Hadoop Flags: Reviewed

    Description

      I have the following input:

      one two
      three
       four
      

      I run this code:

      Lines =
          LOAD 'test.log' USING PigStorage(' ') 
          AS ( First:chararray , Second:chararray );
      
      DUMP Lines;
      
      STORE Lines INTO 'Lines'
      USING org.apache.pig.piggybank.storage.CSVExcelStorage('\t', 'NO_MULTILINE', 'WINDOWS', 'WRITE_OUTPUT_HEADER');
      

      The output from the DUMP is correct:

      (one,two)
      (three,)
      (,four)
      

      The output from the CSVExcelStorage is incorrect:

      First   Second
      one     two
      three   three
              four
      

      The problem is that when the last field of a record is null, the value of the preceding field is written in its place (in this case 'three').
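
      For comparison, here is a minimal Java sketch of the behaviour the report expects: a null field becomes an empty column and no earlier value is carried over. The class NullSafeRowWriter and the method formatRow are hypothetical names used only for illustration; this is not the piggybank CSVExcelStorage code and not the attached patch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the CSVExcelStorage implementation.
public class NullSafeRowWriter {

    // Joins the fields with the delimiter, writing an empty string for
    // every null field, then appends the line terminator.
    static String formatRow(List<String> fields, String delimiter, String eol) {
        StringBuilder row = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                row.append(delimiter);
            }
            String value = fields.get(i);
            // Decide per field; never fall back to a previously seen value.
            row.append(value == null ? "" : value);
        }
        return row.append(eol).toString();
    }

    public static void main(String[] args) {
        // The three records from the report, tab-delimited with Windows line endings.
        System.out.print(formatRow(Arrays.asList("one", "two"), "\t", "\r\n"));
        System.out.print(formatRow(Arrays.asList("three", null), "\t", "\r\n"));
        System.out.print(formatRow(Arrays.asList(null, "four"), "\t", "\r\n"));
    }
}
```

      With this sketch the second record is written as 'three' followed by an empty column, which matches the DUMP output above.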

      Attachments

        1. PIG-4397-1.patch
          3 kB
          Daniel Dai

          People

            Assignee: Daniel Dai (daijy)
            Reporter: Niels Basjes (nielsbasjes)
