Details
- Bug
- Status: Resolved
- Blocker
- Resolution: Fixed
- None
Description
org.apache.hudi.utilities.sources.helpers.DFSPathSelector#listEligibleFiles filters the input files based on the last saved checkpoint, which is the modification time of the last file read. However, several files can share that modification time, and some of them are skipped when a read stops at the source limit. An illustration is shown in the attached picture.
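The behavior described above can be reproduced with a small standalone sketch. This is not the Hudi code: the class CheckpointSketch, the FileInfo and Batch types, the select method, and the keepTies flag are all hypothetical names, and the strictly-greater-than filter and byte-based source limit are assumptions made for illustration. With keepTies set to false, a file that shares the checkpoint's modification time is never read. With keepTies set to true, the batch is extended past the limit to cover every file sharing the last modification time, which is one possible way to avoid the skip.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class CheckpointSketch {
    // Hypothetical stand-in for a listed file.
    static final class FileInfo {
        final String name;
        final long modTime;
        final long size;
        FileInfo(String name, long modTime, long size) {
            this.name = name;
            this.modTime = modTime;
            this.size = size;
        }
    }

    // Hypothetical result of one read: the selected files and the new checkpoint.
    static final class Batch {
        final List<String> names;
        final long checkpoint;
        Batch(List<String> names, long checkpoint) {
            this.names = names;
            this.checkpoint = checkpoint;
        }
    }

    static Batch select(List<FileInfo> files, long checkpoint, long sourceLimit, boolean keepTies) {
        // Assumed filter: only files modified strictly after the checkpoint are eligible.
        List<FileInfo> eligible = files.stream()
                .filter(f -> f.modTime > checkpoint)
                .sorted(Comparator.comparingLong((FileInfo f) -> f.modTime))
                .collect(Collectors.toList());
        List<String> picked = new ArrayList<>();
        long bytes = 0;
        long newCheckpoint = checkpoint;
        for (FileInfo f : eligible) {
            // A tie is a file with the same modification time as the last picked file.
            boolean tie = keepTies && !picked.isEmpty() && f.modTime == newCheckpoint;
            if (bytes + f.size > sourceLimit && !tie) {
                break; // source limit reached
            }
            picked.add(f.name);
            bytes += f.size;
            newCheckpoint = f.modTime;
        }
        return new Batch(picked, newCheckpoint);
    }

    public static void main(String[] args) {
        List<FileInfo> files = Arrays.asList(
                new FileInfo("a", 100L, 10L),
                new FileInfo("b", 200L, 10L),
                new FileInfo("c", 200L, 10L), // same modification time as b
                new FileInfo("d", 300L, 10L));

        // Without tie handling: c is skipped by both reads.
        Batch first = select(files, 0L, 20L, false);
        Batch second = select(files, first.checkpoint, 20L, false);
        System.out.println(first.names + " then " + second.names); // [a, b] then [d]

        // With tie handling: the first read also takes c.
        Batch fixedFirst = select(files, 0L, 20L, true);
        Batch fixedSecond = select(files, fixedFirst.checkpoint, 20L, true);
        System.out.println(fixedFirst.names + " then " + fixedSecond.names); // [a, b, c] then [d]
    }
}
```

In the first scenario the checkpoint advances to 200 after reading b, so the next read only considers files modified after 200 and c is lost. Extending the batch to include ties trades a slightly oversized read for completeness.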
Attachments
Issue Links
- relates to HUDI-1896 HudiStreamer Source for cloud object stores (Open)