Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Duplicate
Affects Version/s: 4.0.0
Fix Version/s: None
Environment:
R version 4.0.5 (2021-03-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Description
While the following snippet works with arrow 3.0.0, it fails after updating to arrow 4.0.0.
An example CSV that can be used to replicate this can be found here
.
├── data
│   └── 2021-04-25-Karlen-pypm.csv
└── test.R
library(arrow)
library(tidyverse)
sch <- schema(forecast_date=string(), target=string(), target_end_date=string(), location=string(), type=string(), quantile=string(), value=string())
ds = open_dataset("data", format = "csv", schema = sch)
ds %>% select(target) %>% collect()
The error is:
Error: Invalid: In CSV column #3: CSV conversion error to int64: invalid value 'US'
However, both of the following run without error and return a data frame with the expected schema:
ds %>% collect()
ds %>% select(target, location) %>% collect()
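Since the report states that `ds %>% collect()` succeeds, one possible stopgap is to collect every column first and select afterwards in dplyr. The following is only a minimal sketch of that idea, not a confirmed fix: the temporary directory, the file name `sample.csv`, and the single data row are made up for illustration so that the snippet runs on its own, and whether the header line is treated as data when a schema is supplied may differ between arrow versions.

```r
library(arrow)
library(dplyr)

# Hypothetical stand-in for the "data" directory in the report:
# a temporary folder holding one tiny CSV with the same seven columns.
demo_dir <- file.path(tempdir(), "arrow-csv-demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(
  c(
    "forecast_date,target,target_end_date,location,type,quantile,value",
    # Made-up row, for illustration only
    "2021-04-25,1 wk ahead inc case,2021-05-01,US,point,NA,1000"
  ),
  file.path(demo_dir, "sample.csv")
)

# Same all-string schema as in the report
sch <- schema(
  forecast_date = string(), target = string(), target_end_date = string(),
  location = string(), type = string(), quantile = string(), value = string()
)
ds <- open_dataset(demo_dir, format = "csv", schema = sch)

# Collect all columns first (the call reported to work),
# then select the single column on the in-memory data frame.
targets <- ds %>% collect() %>% select(target)
```

This reads every column into memory before dropping the unwanted ones, so it gives up the benefit of column projection at scan time and may be impractical for large datasets.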
Issue Links
duplicates
ARROW-12500 [C++][Dataset] Consolidate similar tests for file formats (Resolved)