  1. Apache Arrow
  2. ARROW-16926

csv reader errors clobbered by subsequent reads


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 9.0.0
    • Component/s: Go

    Description

      To reproduce this issue, read a CSV file that contains garbage string values where float64 values are expected. If the bad data is in the first part of the file, a subsequent call to r.r.Read() clobbers the parse error that was set inside r.read(rec).

      At the bottom of the loop body, r.read(rec) is called, which ends up in func (r *Reader) parseFloat64(field array.Builder, str string). That function encounters an error and sets err on the reader:
      v, err := strconv.ParseFloat(str, 64)
      if err != nil && r.err == nil {
          r.err = err
          field.AppendNull()
          return
      }

      However, when r.read(rec) returns, the for loop advances without checking r.err, and the next call to r.r.Read() overwrites it.

      This means that if the last chunk has no error, calls to r.Err() after reading the CSV return nil, even though an error occurred during parsing.
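
The pattern described above can be reproduced without Arrow. The following is a minimal sketch using only the Go standard library; toyReader, readAll, and run are hypothetical names invented for illustration, and this is not the Arrow reader's actual code nor necessarily the fix that was merged. It assumes a single-column input whose first row is not a valid float64.

```go
package main

import (
	"encoding/csv"
	"errors"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// toyReader is a hypothetical stand-in for the reader in this issue.
// It only keeps the pieces needed to show the error being overwritten.
type toyReader struct {
	r   *csv.Reader
	err error
}

// parseFloat64 mirrors the snippet in the description:
// it records a parse error only if no error is stored yet.
func (t *toyReader) parseFloat64(str string) {
	if _, err := strconv.ParseFloat(str, 64); err != nil && t.err == nil {
		t.err = err
	}
}

// readAll consumes every row. With checkErr == false it mimics the
// reported loop: assigning the result of Read to t.err overwrites any
// parse error stored by the previous iteration. With checkErr == true
// it returns as soon as a parse error is recorded (one possible fix).
func (t *toyReader) readAll(checkErr bool) error {
	for {
		var rec []string
		rec, t.err = t.r.Read() // overwrites an earlier parse error
		if t.err != nil {
			break
		}
		t.parseFloat64(rec[0])
		if checkErr && t.err != nil {
			return t.err // stop before the next Read can clobber it
		}
	}
	// End of input is not an error for the caller.
	if errors.Is(t.err, io.EOF) {
		t.err = nil
	}
	return t.err
}

// run reads a small input whose first row is garbage and whose
// remaining rows are valid floats.
func run(checkErr bool) error {
	t := &toyReader{r: csv.NewReader(strings.NewReader("garbage\n1.5\n2.5\n"))}
	return t.readAll(checkErr)
}

func main() {
	fmt.Println("unchecked:", run(false))
	fmt.Println("checked:  ", run(true))
}
```

In this sketch the unchecked run ends with a nil error because the later successful reads replace the stored parse error, while the checked run returns the strconv error from the first row. Returning early is only one option; saving the parse error in a separate field, or skipping the assignment when an error is already stored, would preserve it as well.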


          People

            Assignee: Unassigned
            Reporter: wwhispell (Whispell Whispell)
            Votes: 0
            Watchers: 2


              Time Tracking

                Original Estimate: 168h
                Remaining Estimate: 167h 20m
                Time Logged: 40m
