Details
- Type: Bug
- Status: Closed
- Priority: Critical
- Resolution: Duplicate
- Affects Version/s: 5.0.0
- Fix Version/s: None
OS = Ubuntu 20.04
I use the Architect IDE (an IDE based on Eclipse), but the crash also happens in the plain R console. R = 3.6.3
See attached files for session info output and an R crash report.
Description
R (3.6.3) crashes when querying a dataset using the "?arrow::Dataset" functionality when all of the following conditions are met:
- The dataset to query contains a date-time/time column.
- An empty selection is made with dplyr::filter() on the Dataset object.
- dplyr::collect() is called (this is the point at which the crash happens).
The crash happens whether the dataset is stored locally or in an S3 bucket.
Here is a minimal example to reproduce the bug:
library(dplyr)
library(lubridate)
# If you remove the dateTime column, no crash occurs.
df <- tibble(
  time = seq(5, 10, length.out = 10000),
  dateTime = as_datetime(1511870400) + time  # the dateTime column causes the crash
)
file <- tempdir()
arrow::write_dataset(df, file)
testdf <- arrow::open_dataset(file) %>%
  # filter(time > 5 & time < 6) %>%  # a non-empty selection does not crash
  filter(time < 5) %>%  # an empty selection crashes
  collect()  # the crash happens at collect()
R crashes with the following message:
 *** caught segfault ***
address 0x8, cause 'memory not mapped'
The full R console output from running the above code is included in the attachments.
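When re-running the example, it may help to execute it in a child process so that a segfault does not take down the interactive session. The following is a minimal sketch using only base R. The helper name `run_isolated` is hypothetical and not part of arrow, and the non-zero exit status on a crash is an assumption about how the platform reports a killed process.

```r
# Hypothetical helper (not part of arrow): run R code in a separate
# Rscript process and return that process's exit status.
run_isolated <- function(code_lines) {
  script <- tempfile(fileext = ".R")
  on.exit(unlink(script), add = TRUE)
  writeLines(code_lines, script)
  rscript <- file.path(R.home("bin"), "Rscript")
  # system2() waits for the child and returns its exit status
  # when stdout/stderr are not captured.
  system2(rscript, shQuote(script))
}

# Harmless demonstration: a child that exits cleanly.
status <- run_isolated("quit(status = 0)")
print(status)
```

Passing the lines of the minimal example as a character vector to `run_isolated()` should leave the calling session alive. A non-zero return value would then indicate that the child process failed or crashed.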
Attachments
Issue Links
- duplicates ARROW-13761 [R] arrow::filter() crashes (aborts R session) (Resolved)