Apache NiFi / NIFI-512

Allow GetFile to pull in data without deleting the local file

Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Extensions
    • Labels: None

    Description

      Several people have asked for this capability. Currently, when we list a directory, the results are placed into a HashSet, so the order in which we pull files in is undefined. I propose that we instead sort the listing so that we pull the oldest file first, and keep track of the latest timestamp that we have pulled in. That way, on restart we can resume where we left off.
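
A minimal sketch of the ordering idea, assuming plain `java.io.File` listings rather than GetFile's actual internals. The class `OldestFirstListing` and the method `listNewerThan` are hypothetical names used only for illustration:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of NiFi: lists files oldest-first and
// skips anything at or before the last timestamp already pulled in.
public class OldestFirstListing {

    /** Returns regular files in dir modified after lastSeenMillis, oldest first. */
    public static List<File> listNewerThan(File dir, long lastSeenMillis) {
        List<File> result = new ArrayList<>();
        File[] children = dir.listFiles();
        if (children == null) {
            return result; // not a directory, or an I/O error occurred
        }
        for (File f : children) {
            if (f.isFile() && f.lastModified() > lastSeenMillis) {
                result.add(f);
            }
        }
        // Sort by last-modified time so the oldest file is pulled first.
        result.sort(Comparator.comparingLong(File::lastModified));
        return result;
    }
}
```

One caveat of this sketch: a file whose modification time equals the stored timestamp is skipped, so files that share a timestamp with the last one pulled before a restart could be missed. Using `>=` instead would trade that for possible duplicates.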

      I would create a FileOutputStream and keep it open, writing out the timestamp each time we pull data in and periodically flushing the data to disk, perhaps every second or so (maybe this should be configurable). We need to trade off how much duplication is possible after a restart against how much time we spend persisting the timestamp.
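
The persistence scheme could be sketched roughly as follows. This is an illustration under assumptions, not the processor's implementation: the class `TimestampCheckpoint`, its methods, and the append-only record layout (one 8-byte long per pull) are all hypothetical. The flush interval is checked on each write instead of by a background thread, to keep the sketch short.

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical sketch: keeps one FileOutputStream open, appends each
// timestamp, and forces the data to disk at most once per flush interval.
public class TimestampCheckpoint implements AutoCloseable {
    private final FileOutputStream fileOut;
    private final DataOutputStream out;
    private final long flushIntervalMillis; // assumed configurable, e.g. 1000
    private long lastFlushMillis = System.currentTimeMillis();

    public TimestampCheckpoint(File stateFile, long flushIntervalMillis) throws IOException {
        this.fileOut = new FileOutputStream(stateFile, true); // append mode
        this.out = new DataOutputStream(new BufferedOutputStream(fileOut));
        this.flushIntervalMillis = flushIntervalMillis;
    }

    /** Buffers the timestamp; flushes only if the interval has elapsed. */
    public synchronized void record(long timestampMillis) throws IOException {
        out.writeLong(timestampMillis);
        if (System.currentTimeMillis() - lastFlushMillis >= flushIntervalMillis) {
            flush();
        }
    }

    /** Pushes buffered records to the file and asks the OS to sync to disk. */
    public synchronized void flush() throws IOException {
        out.flush();
        fileOut.getFD().sync();
        lastFlushMillis = System.currentTimeMillis();
    }

    /** Reads the last complete record on restart; -1 if none was persisted. */
    public static long readLast(File stateFile) throws IOException {
        if (!stateFile.exists()) {
            return -1L;
        }
        try (RandomAccessFile raf = new RandomAccessFile(stateFile, "r")) {
            // Ignore a trailing partial record left by a crash mid-write.
            long complete = (raf.length() / Long.BYTES) * Long.BYTES;
            if (complete == 0) {
                return -1L;
            }
            raf.seek(complete - Long.BYTES);
            return raf.readLong();
        }
    }

    @Override
    public synchronized void close() throws IOException {
        flush();
        out.close();
    }
}
```

A longer interval means fewer syncs but more files re-pulled after a crash, since any timestamps still in the buffer are lost. The state file in this sketch grows without bound; a real implementation would need to truncate or overwrite it.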

            People

              Assignee: Unassigned
              Reporter: Mark Payne (markap14)
