  Apache NiFi / NIFI-6465

ListHDFS: skip last should be optional

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Not A Problem
    • Affects Version/s: 1.9.2
    • Fix Version/s: None
    • Component/s: Core Framework
    • Labels: None

    Description

      Current Situation

      From official documentation

      • Each time a listing is performed, the files with the latest timestamp will be excluded and picked up during the next execution of the processor. This is done to ensure that we do not miss any files, or produce duplicates, in the cases where files with the same timestamp are written immediately before and after a single execution of the processor.
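
The race that this rule guards against can be illustrated with a small simulation. This is a simplified sketch, not NiFi's actual implementation: the function names `list_new` and `simulate`, the two-field state model, and the file names and timestamps are all assumptions made for illustration.

```python
def list_new(files, state, skip_latest):
    """One listing run over `files` (hypothetical model: name -> modification timestamp).

    `state` holds 'emitted' (highest timestamp already emitted) and
    'listed' (newest timestamp seen by the previous run); both start as None.
    Returns the sorted names emitted by this run and updates `state` in place.
    """
    candidates = {n: t for n, t in files.items()
                  if state["emitted"] is None or t > state["emitted"]}
    if not candidates:
        return []
    newest = max(candidates.values())
    # Hold back the newest timestamp the first time it is seen; emit it once
    # a later run sees the same newest timestamp again.
    if skip_latest and newest != state["listed"]:
        candidates = {n: t for n, t in candidates.items() if t < newest}
    state["listed"] = newest
    if candidates:
        state["emitted"] = max(candidates.values())
    return sorted(candidates)


def simulate(skip_latest):
    """Run two listings with a same-timestamp file arriving in between."""
    state = {"emitted": None, "listed": None}
    files = {"a": 100, "b": 200}
    first = list_new(files, state, skip_latest)
    files["c"] = 200  # written right after the first run, same timestamp as "b"
    second = list_new(files, state, skip_latest)
    return first, second
```

In this model, `simulate(False)` emits `a` and `b` in the first run and nothing in the second, so `c` is never listed. `simulate(True)` emits only `a` in the first run and then `b` and `c` together in the second.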

      Improvement Proposal

      • If ListHDFS is invoked only after the operation that populates an HDFS directory has finished, skipping the files with the latest timestamp is pointless, and working around this behavior is tricky.
      • A mandatory "skip last" property should be added so that users can actively decide, based on the use case, whether this behavior is needed.
      • This is also particularly useful in combination with NIFI-6462.
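
A rough sketch of what the proposed switch would change for a directory that is already complete. The `skip_last` flag here is hypothetical and only named after the proposal; it is not an existing ListHDFS property, and `list_once` with its file names is an illustrative stand-in for a single processor execution.

```python
def list_once(files, skip_last=True):
    """Single listing of `files` (hypothetical model: name -> modification timestamp).

    With skip_last=True the files carrying the newest timestamp are held back,
    mirroring the documented behavior; with skip_last=False everything is listed.
    """
    if not files:
        return []
    newest = max(files.values())
    return sorted(n for n, t in files.items() if not skip_last or t < newest)


# A directory fully written by an upstream job before the listing starts.
batch = {"part-0": 100, "part-1": 101, "part-2": 102}
held_back = list_once(batch)                    # newest file is left for a later run
complete = list_once(batch, skip_last=False)    # all files listed in one run
```

With the default, `part-2` is left out of the single run even though nothing else will ever be written; with `skip_last=False` all three files are listed at once.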

            People

              Assignee: Unassigned
              Reporter: Alessandro D'Armiento (AxelSync)
              Votes: 0
              Watchers: 1

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h 10m