Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects versions: 7.0.0, 8.0.0
Description
`pyarrow.csv.open_csv` raises `ArrowInvalid` when the CSV input does not end with a newline and has more than 16384 lines. Reproduced with both pyarrow 7.0.0 and 8.0.0, in a production app and on a developer laptop.
Here's a minimal case for reproducing the issue:
```python
import pyarrow as pa
import pyarrow.csv
from io import BytesIO
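
# Header plus 16385 data rows, joined with '\n' so the input has no trailing newline.
rows = ['review_id,filter_outcome'] + ['62593aaec7628b203bad4c6e,fabrication'] * 16385
data = '\n'.join(rows).encode()
for _ in pa.csv.open_csv(BytesIO(data)):
    pass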
```
Iterating over the reader raises:
ArrowInvalid: CSV parse error: Expected 2 columns, got 1:
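A possible workaround on affected versions, not confirmed anywhere in this report, is to append the missing final newline before handing the bytes to `open_csv`. The sketch below assumes the trigger really is the absent trailing newline, as described above. Both `ensure_trailing_newline` and `count_rows` are hypothetical helper names introduced for illustration; they are not part of pyarrow.

```python
from io import BytesIO


def ensure_trailing_newline(data: bytes) -> bytes:
    """Return data unchanged if it is empty or already ends with a newline; otherwise append one."""
    if not data or data.endswith(b"\n"):
        return data
    return data + b"\n"


def count_rows(data: bytes) -> int:
    """Stream the CSV through pyarrow and count its rows (requires pyarrow)."""
    # Imported lazily so the helper above has no pyarrow dependency.
    import pyarrow.csv as pa_csv

    reader = pa_csv.open_csv(BytesIO(ensure_trailing_newline(data)))
    return sum(batch.num_rows for batch in reader)
```

This copies the whole payload in memory, so it only suits inputs that are already held as bytes; for large files on disk, fixing the file itself or upgrading pyarrow is likely the better route.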