Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 3.0.0
- None
- Windows 10, R 4.0.0, arrow 3.0.0
Description
On Windows 10, repeatedly reading a large Feather file in R with read_feather eventually hangs.
The issue has been reproduced on 3 different Windows machines, all running Windows 10 and R 4.0.0 (or later).
read_feather does not hang if the file was written with version = 1, or with version = 2 and no compression.
The issue does not occur in tests on Linux (Ubuntu 20.04 at least).
Example:
library(arrow)
m <- data.frame(x = rnorm(7e6), y = rnorm(5), b = rnorm(5), n = rnorm(5), c = c("a", "n"))
write_feather(m, "test.feather4", version = 2, compression = "lz4") # does not hang with uncompressed, but does with lz4 and zstd
for (j in 1:50) {
  # hangs after an unpredictable number of reads, but only on Windows
  y <- read_feather("test.feather4")
  print(paste0("feather read ", j, "..."))
}
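For comparison, the two variants reported above as not hanging can be sketched as follows. This is only an illustration: the data frame is shrunk to a few rows, and the temporary file paths and the names v1_path, v2_path, y1 and y2 are assumptions made for the sketch, not part of the original report.

```r
library(arrow)

# Small stand-in for the large data frame in the report
# (assumption: size reduced so the sketch runs quickly).
m <- data.frame(x = rnorm(10), c = rep(c("a", "n"), 5))

# Hypothetical temporary paths, one per variant.
v1_path <- tempfile(fileext = ".feather")
v2_path <- tempfile(fileext = ".feather")

# Variant 1: Feather version 1.
write_feather(m, v1_path, version = 1)

# Variant 2: Feather version 2 with compression turned off.
write_feather(m, v2_path, version = 2, compression = "uncompressed")

# Both files are read back with the same call as in the report.
y1 <- read_feather(v1_path)
y2 <- read_feather(v2_path)
```

Only the write call differs between the hanging and non-hanging cases; the read_feather call is unchanged.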
Interestingly, a workaround is to call read_feather for just one column at a time. This has not hung so far.
For example, y below holds every column of the data frame (as a list of single-column data frames), and this does not hang on repeated reads:
cols <- names(m) # column names to read (assumed to be all columns of m)
y <- lapply(cols, function(col) {
  read_feather("test.feather4", col_select = all_of(col))
})
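The column-at-a-time workaround could be wrapped in a small helper that also recombines the pieces into one data frame. The sketch below is a minimal illustration under stated assumptions: read_feather_by_column is a hypothetical name, not an arrow function, the caller supplies the column names, and the example file is tiny and written with write_feather's default compression.

```r
library(arrow)

# Hypothetical helper (not part of arrow): read the requested columns
# one at a time, then bind them back into a single data frame.
read_feather_by_column <- function(path, cols) {
  parts <- lapply(cols, function(col) {
    read_feather(path, col_select = tidyselect::all_of(col))
  })
  do.call(cbind, parts)
}

# Small stand-in data (assumption: the report used 7e6 rows).
m <- data.frame(x = rnorm(10), c = rep(c("a", "n"), 5))
path <- tempfile(fileext = ".feather")
write_feather(m, path)

y <- read_feather_by_column(path, names(m))
```

Reading columns separately costs one file open per column, so this is a stopgap for the hang rather than a replacement for a single read_feather call.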
Issue Links
- is duplicated by
  - ARROW-11413 [R] Windows multithreading error: filtering datasets (Resolved)
  - ARROW-13293 [R] open_dataset followed by collect hangs (while compute works) (Resolved)
- relates to
  - ARROW-8379 [R] Investigate/fix thread safety issues (esp. Windows) (Resolved)