Details
- Type: Bug
- Status: Closed
- Priority: Critical
- Resolution: Duplicate
Description
Hi -
I recently upgraded to Arrow 6.0.1 and am using it in R.
When reading a large file (~10 GB) on Windows, it sometimes freezes at random. I can see memory being allocated in the first 10-20 seconds, but then nothing happens and R stops responding (the R process becomes idle too).
I'm using options(arrow.use_threads = FALSE).
I didn't have this issue with the previous version I was using (0.15.1), and the file reads fine under Linux.
I would post a reproducible example, but the problem occurs randomly. I even tried reading large files in pieces by first getting all the distinct sections of a specific column (with compute() and then collect()), but that hangs too.
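The piecewise idea was roughly the sketch below (the path is a placeholder, and the column name section stands in for the real one):

library(arrow)
library(dplyr)

options(arrow.use_threads = FALSE)

# Open the file lazily as a Dataset instead of reading it into memory all at once
ds <- arrow::open_dataset('.../file.arrow5', format = "arrow")

# Get the distinct values of the one column (collect just that column, dedupe in R)
sections <- ds %>% select(section) %>% collect() %>% distinct()

# Then read the file one section at a time
pieces <- lapply(sections$section, function(s) {
  ds %>% filter(section == s) %>% collect()
})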
Any ideas would be appreciated.
Edit
Not sure if it makes sense to anyone, but after a few tries it seems the issue only happens in RStudio. In the R console the file loads fine. All I'm executing is the code below.
options(arrow.use_threads = FALSE)           # disable multithreaded reads
aa <- arrow::read_arrow('.../file.arrow5')   # read the ~10 GB Arrow IPC file
One thing I want to point out is that the underlying Rscript process under RStudio definitely seems to use more than one core when executing the above.
Edit 2
Using arrow::set_cpu_count(1) seems to solve the issue.
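In other words, the combination that works for me is roughly (same placeholder path as above):

arrow::set_cpu_count(1)              # limit Arrow's internal thread pool to one core
options(arrow.use_threads = FALSE)   # and keep threaded execution off
aa <- arrow::read_arrow('.../file.arrow5')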
Issue Links
- is duplicated by ARROW-15730 [R] Memory usage in R blows up (Closed)