Details
- Improvement
- Status: Closed
- Major
- Resolution: Won't Fix
- 0.16.0
- None
- Linux 64-bit 5.4.15
Description
This is the same issue as reported for feather::read_feather (https://issues.apache.org/jira/browse/ARROW-7823).
In the R arrow package, the "read_parquet()" function currently does not respect "options(stringsAsFactors = FALSE)", which leads to unexpected behavior that is inconsistent with other readers.
Example:
library(arrow)
library(readr)

options(stringsAsFactors = FALSE)

write_tsv(head(iris), 'test.tsv')
write_parquet(head(iris), 'test.parquet')

head(read.delim('test.tsv', sep='\t')$Species)
# [1] "setosa" "setosa" "setosa" "setosa" "setosa" "setosa"

head(read_tsv('test.tsv', col_types = cols())$Species)
# [1] "setosa" "setosa" "setosa" "setosa" "setosa" "setosa"

head(read_parquet('test.parquet')$Species)
# [1] setosa setosa setosa setosa setosa setosa
# Levels: setosa versicolor virginica
Versions:
- R 3.6.2
- arrow_0.15.1.9000
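Since the issue was closed as Won't Fix, one possible workaround is to convert factor columns to character after reading. The sketch below uses base R only. The helper name factors_to_character is hypothetical and not part of arrow, and head(iris) stands in for the data frame that read_parquet('test.parquet') would return in the example above.

```r
# Hypothetical helper (not part of the arrow package): return a copy of `df`
# in which every factor column has been converted to a character vector.
factors_to_character <- function(df) {
  df[] <- lapply(df, function(col) {
    if (is.factor(col)) as.character(col) else col
  })
  df
}

# head(iris) stands in for the result of read_parquet('test.parquet');
# its Species column is a factor.
df <- factors_to_character(head(iris))

class(df$Species)
# [1] "character"
```

Non-factor columns are returned unchanged, so the helper can be applied to any data frame regardless of whether it contains factors.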
Issue Links
- is related to ARROW-7823 Have feather::read_feather respect options(stringsAsFactors = FALSE) (Closed)