Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Duplicate
- Affects versions: 8.0.0, 8.0.1, 9.0.0
- None
- None
- Environment: Windows 10; R 4.2.1; RStudio 22.07.1; Arrow 9.0 (fails on Arrow 8 as well)
Description
Hello,
I encountered this issue because it breaks my tests when I run `rhub::check_for_cran()`. Because of this, I know it only affects Windows; all other OS checks pass.
If you write files to a directory using arrow's `write_*` functions and then call `collect(open_dataset(directory))`, you cannot delete a file in that directory; you get an error instead. This is best demonstrated in a reprex:
library(arrow)
library(dplyr)  # provides %>% and collect()

# setup ------------------------------------------------------------------------
local_prefix <- tempfile()
df <- data.frame(a = 1:5, b = letters[1:5])

# works fine -------------------------------------------------------------------
fs <- LocalFileSystem$create()
fs$CreateDir(local_prefix)
fsdir <- fs$cd(local_prefix)
write_parquet(df, fsdir$path("1.parquet"))
#open_dataset(local_prefix) %>% collect()
fsdir$DeleteFile("1.parquet")
unlink(local_prefix, recursive = TRUE)

# doesn't work -----------------------------------------------------------------
fs <- LocalFileSystem$create()
fs$CreateDir(local_prefix)
fsdir <- fs$cd(local_prefix)
write_parquet(df, fsdir$path("1.parquet"))
open_dataset(local_prefix) %>% collect()  # <-- ERROR IS CAUSED BY THIS
fsdir$DeleteFile("1.parquet")             # <-- HERE IS WHERE YOU GET AN ERROR
unlink(local_prefix, recursive = TRUE)
Here is the error I keep getting:
Error: IOError: Cannot delete file 'C:/Users/riaz/AppData/Local/Temp/Rtmp8qUlcx/file233c22f923d0/1.parquet'. Detail: [Windows error 32] The process cannot access the file because it is being used by another process.
Note that:
- I do not create an object from the `open_dataset` function; I simply call it.
- I also call `collect` to pull the data, so I cannot see why the connection to the file should still exist after `collect` is called.
- As mentioned above, the other OSes do not exhibit this behaviour.
- My environment pane looks identical in both instances.
- I do not need to restart R to delete the file. I can simply clear all objects from the workspace (`rm(list = ls())`) and then it works fine.
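The last observation suggests that the file is released once the objects still holding it are cleaned up. As a possible workaround, here is a minimal sketch in base R, assuming that forcing a garbage collection is enough to release the handle (this is an assumption, not something confirmed in this report). The helper name `delete_with_gc` and its `attempts` parameter are hypothetical and not part of arrow:

```r
# Hypothetical helper: try to delete a file, forcing a garbage collection
# between attempts so that unreferenced objects which may still hold the
# file open get finalized.
delete_with_gc <- function(path, attempts = 3L) {
  for (i in seq_len(attempts)) {
    if (!file.exists(path)) {
      return(TRUE)  # nothing left to delete
    }
    ok <- suppressWarnings(file.remove(path))
    if (isTRUE(ok)) {
      return(TRUE)
    }
    gc()            # assumption: this releases the lingering file handle
    Sys.sleep(0.1)  # brief pause before retrying
  }
  !file.exists(path)
}

# Usage on an ordinary temporary file
tmp <- tempfile(fileext = ".parquet")
writeLines("placeholder", tmp)
deleted <- delete_with_gc(tmp)
```

In the reprex, this would stand in for the `fsdir$DeleteFile("1.parquet")` call, using the full path to the file. Whether it actually avoids Windows error 32 there would need to be checked on a Windows machine.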
Issue Links
- is superseded by ARROW-16421 [R] Permission error on Windows when deleting file previously accessed with open_dataset (Open)