Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 5.0.0
- fsspec 2021.4.0
Description
It appears that files opened for reading using pyarrow.parquet.read_table (and therefore pyarrow.parquet.ParquetDataset) are not explicitly closed.
This seems to be the case for both use_legacy_dataset=True and use_legacy_dataset=False. The files don't remain open at the OS level (verified using lsof). They do, however, seem to rely on the Python garbage collector to close them.
My use case is a custom fsspec filesystem that interfaces with an S3-like API. It handles the remote download of the Parquet file and passes pyarrow a handle to a temporary file downloaded locally. It then waits for an explicit close() or __exit__() call so that it can clean up the temp file.
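To illustrate the cleanup contract described above, here is a minimal sketch using only the standard library. The class name `TempDownloadFile`, the helper `read_and_clean_up`, and the in-memory `payload` standing in for the remote download are all hypothetical and not part of fsspec or pyarrow. A plain `read()` stands in for the reader.

```python
import os
import tempfile


class TempDownloadFile:
    """Hypothetical handle that deletes its local temp copy on close()."""

    def __init__(self, payload: bytes):
        # Stand-in for the remote download: write the bytes to a temp file.
        fd, self.path = tempfile.mkstemp(suffix=".parquet")
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
        self._file = open(self.path, "rb")
        self.closed = False

    def read(self, size: int = -1) -> bytes:
        return self._file.read(size)

    def seek(self, offset: int, whence: int = 0) -> int:
        return self._file.seek(offset, whence)

    def tell(self) -> int:
        return self._file.tell()

    def close(self) -> None:
        # Cleanup happens only here, so it depends on the reader calling
        # close() (or leaving a `with` block) rather than on the GC.
        if self.closed:
            return
        self._file.close()
        os.remove(self.path)
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()


def read_and_clean_up(payload: bytes):
    # The `with` block guarantees close() runs as soon as reading is done.
    with TempDownloadFile(payload) as handle:
        data = handle.read()
        path = handle.path
    return data, os.path.exists(path)
```

If the caller opens the handle itself, the `handle.read()` call could be replaced by `pyarrow.parquet.read_table(handle)`, which accepts file-like objects, so the temp file is removed when the `with` block exits. That workaround does not help when pyarrow opens the file through the filesystem on its own, which is the case this issue is about.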
Attachments
Issue Links
- relates to: ARROW-16421 [R] Permission error on Windows when deleting file previously accessed with open_dataset (Open)
- links to