Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 2.0.0
- None
Description
The FileStreamSource used by Structured Streaming first resolves globs, then creates a ListingFileCatalog, which lists the files under the resolved paths. If a folder is deleted after glob resolution but before the ListingFileCatalog lists its files, the listing can fail with a FileNotFoundException.
This should not be a fatal exception for a streaming job. However, we should log a warning when it happens.
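The race described above can be illustrated outside Spark. The following is a minimal sketch in plain Python, not the actual Spark code or the fix that was merged: the names `list_files_tolerant` and `demo` are hypothetical, and local directories stand in for the paths a glob would resolve to. It shows a two-step "resolve, then list" flow in which a directory that disappears between the steps is logged as a warning and skipped rather than failing the whole listing.

```python
import glob
import logging
import os
import shutil
import tempfile

logger = logging.getLogger("listing_sketch")


def list_files_tolerant(directories):
    """List files under each directory, skipping directories that have vanished.

    Hypothetical helper for illustration only; it is not part of Spark.
    """
    files = []
    for directory in directories:
        try:
            entries = sorted(os.listdir(directory))
        except FileNotFoundError:
            # The directory was deleted after glob resolution:
            # warn and continue instead of failing the whole listing.
            logger.warning(
                "Directory %s was removed before it could be listed; skipping",
                directory,
            )
            continue
        files.extend(os.path.join(directory, name) for name in entries)
    return files


def demo():
    """Reproduce the race deterministically and return the surviving files."""
    root = tempfile.mkdtemp()
    try:
        for part in ("a", "b"):
            os.makedirs(os.path.join(root, part))
            with open(os.path.join(root, part, "data.txt"), "w") as handle:
                handle.write("x")

        # Step 1: resolve the glob to concrete directories.
        resolved = sorted(glob.glob(os.path.join(root, "*")))

        # Simulate a concurrent delete that lands between the two steps.
        shutil.rmtree(os.path.join(root, "b"))

        # Step 2: list files under the previously resolved directories.
        listed = list_files_tolerant(resolved)
        return [os.path.relpath(path, root) for path in listed]
    finally:
        shutil.rmtree(root, ignore_errors=True)
```

Calling `demo()` returns only the file under the surviving directory. Whether skipping is the right policy depends on the workload: for a streaming source a missing folder is often benign, while a batch query may prefer to fail loudly.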
Issue Links
- causes
  - SPARK-27676 InMemoryFileIndex should hard-fail on missing files instead of logging and continuing (Resolved)
- is duplicated by
  - SPARK-19187 querying from parquet partitioned table throws FileNotFoundException when some partitions' hdfs locations do not exist (Resolved)
- is required by
  - SPARK-24364 Files deletion after globbing may fail StructuredStreaming jobs (Resolved)