Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 6.0.0
Description
thisisnic found that this example crashes:
library(arrow)
library(dplyr)
write_dataset(group_by(iris, Species), "iris_data")
open_dataset("iris_data") %>%
  group_by(Species) %>%
  summarise(mean(Sepal.Length)) %>%
  collect()
There are two bugs here:
- StopProducing is written in a way that causes a future to be finished twice, triggering a DCHECK.
- Consume() doesn't set the length of the key column batch, causing a spurious error because the group ID datum and the values datum will have different lengths.
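The two bugs above can be sketched in isolation. This is a minimal illustration, not Arrow's actual code: `OneShotFuture`, `SketchNode`, `SketchBatch`, `MakeKeyBatch`, `LengthsAgree` and `RunSketch` are all hypothetical names, and the throw in `MarkFinished` merely stands in for the DCHECK. It assumes the double finish comes from `StopProducing` being reached more than once, and that the key batch should carry the same row count as the input batch.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for a future that may be completed only once.
// This is NOT Arrow's Future; the throw mimics the DCHECK failure.
class OneShotFuture {
 public:
  void MarkFinished() {
    if (finished_) throw std::logic_error("future finished twice");
    finished_ = true;
  }
  bool is_finished() const { return finished_; }

 private:
  bool finished_ = false;
};

// Hypothetical node whose StopProducing may be called more than once.
class SketchNode {
 public:
  void StopProducing() {
    // exchange() returns the previous value, so only the first caller
    // gets past this line and finishes the future.
    if (stopped_.exchange(true)) return;
    finished_.MarkFinished();
  }
  const OneShotFuture& finished() const { return finished_; }

 private:
  std::atomic<bool> stopped_{false};
  OneShotFuture finished_;
};

// Hypothetical batch: column data is elided, only the row count is kept.
struct SketchBatch {
  int64_t length = 0;
};

// Build the key batch for the same rows as the input batch.
SketchBatch MakeKeyBatch(const SketchBatch& input) {
  assert(input.length >= 0);
  SketchBatch keys;
  keys.length = input.length;  // the step the description says was missing
  return keys;
}

// The consistency check that fails when the key batch length is unset.
bool LengthsAgree(const SketchBatch& keys, const SketchBatch& values) {
  return keys.length == values.length;
}

// Exercise both guards; returns true when each behaves as described.
bool RunSketch() {
  SketchNode node;
  node.StopProducing();
  node.StopProducing();  // second call is a no-op and does not throw

  SketchBatch values;
  values.length = 150;  // arbitrary row count for the illustration
  SketchBatch keys = MakeKeyBatch(values);

  return node.finished().is_finished() && LengthsAgree(keys, values);
}
```

Calling `RunSketch()` returns true with both guards in place. Dropping the `exchange` guard would make the second `StopProducing()` throw, and dropping the length assignment in `MakeKeyBatch` would make `LengthsAgree` return false for any non-empty batch.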
Issue Links
- is related to ARROW-14583 [R][C++] Crash when summarizing after filtering to no rows on partitioned data (Resolved)