[ARROW-16926] csv reader errors clobbered by subsequent reads - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 9.0.0
Component/s: Go
Labels:
- pull-request-available

External issue URL:
https://github.com/apache/arrow/issues/32245
Language:
- Go

Description

This can be reproduced by reading a CSV file that contains garbage string values where float64 values are expected. If the bad data sits in the first part of the file, a subsequent call to r.r.Read() clobbers the parse error that was set inside r.read(rec).

At the bottom of the loop body, r.read(rec) is called and we end up in func (r *Reader) parseFloat64(field array.Builder, str string). It encounters an error and sets err on the reader:

    v, err := strconv.ParseFloat(str, 64)
    if err != nil && r.err == nil {
        r.err = err
        field.AppendNull()
        return
    }

However, when we return from that call into the loop, the for loop advances without checking r.err, and the next call to r.r.Read() overwrites it.

This means that if the last chunk has no error, r.Err() returns nil after the CSV has been read, even though an error occurred during parsing.
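The clobbering pattern described above can be sketched in isolation. This is a simplified model, not the Arrow code: the `reader` type, `readAll`, the `checkErr` flag, and `readErr` are hypothetical names, chunking and array builders are left out, and the `checkErr` branch is only one possible way to preserve the error (it is not claimed to match the linked pull request).

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// reader is a hypothetical stand-in for the Arrow CSV reader; it models
// only the error-handling shape described in this issue.
type reader struct {
	r    *csv.Reader
	err  error
	vals []float64
}

// parseFloat64 records the first parse error on the reader, as in the
// snippet from the description.
func (r *reader) parseFloat64(str string) {
	v, err := strconv.ParseFloat(str, 64)
	if err != nil {
		if r.err == nil {
			r.err = err
		}
		return
	}
	r.vals = append(r.vals, v)
}

// readAll consumes every record. With checkErr == false the loop advances
// without looking at r.err, so the next Read overwrites a parse error.
// With checkErr == true the loop stops as soon as a parse error is set.
func (r *reader) readAll(checkErr bool) {
	for {
		var rec []string
		rec, r.err = r.r.Read() // assignment replaces any earlier r.err
		if r.err != nil {
			if r.err == io.EOF {
				r.err = nil // end of input is not reported as an error
			}
			return
		}
		for _, field := range rec {
			r.parseFloat64(field)
		}
		if checkErr && r.err != nil {
			return
		}
	}
}

// readErr runs the loop over data and returns the reader's final error.
func readErr(data string, checkErr bool) error {
	r := &reader{r: csv.NewReader(strings.NewReader(data))}
	r.readAll(checkErr)
	return r.err
}

func main() {
	data := "1.5\nabc\n2.5\n" // bad value in the middle, good value last
	fmt.Println("unchecked loop:", readErr(data, false))
	fmt.Println("checked loop:", readErr(data, true))
}
```

In this model the unchecked loop reports a nil error because the reads after "abc" overwrite it, while the checked loop returns the strconv parse error.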

Issue Links

links to

GitHub Pull Request #13451

People

Assignee: Unassigned

Reporter: Whispell Whispell

Votes: 0

Watchers: 3

Dates

Created: 28/Jun/22 15:25

Updated: 11/Jan/23 11:47

Resolved: 29/Jun/22 17:49

Time Tracking

Estimated: 168h

Remaining: 167h 20m

Logged: 40m