  1. Apache NiFi
  2. NIFI-2489

GetFile inability to remove source file results in duplicate files (PutFile) and dataloss (Site2Site)

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.7.0, 0.6.1
    • Fix Version/s: None
    • Component/s: Core Framework
    • Labels: None
    • Environment: Tested with CentOS 6 and 7.
      Cifs-utils 4.8.1-20.el6 (CentOS 6) and Cifs-utils 6.2-7.el7 (CentOS 7).
      Windows Server 2003 and Windows Server 2008 as CIFS sources.

    Description

      If GetFile is unable to remove a source file from a Windows CIFS mount (for example, because the file is locked by another application), it also fails to remove the other files in the same batch (reason unknown). On the next run it ingests those same files into NiFi again, and again fails to remove them. If the destination is a PutFile processor with Conflict Resolution Strategy set to 'fail', the failure queue builds up at an alarming rate.

      (0.6.1 and 0.7.0 on CentOS 6) If the destination is not a PutFile processor but a Site2Site output port, the files can be dropped due to missing content.
      Example log extract: http://pastebin.com/dJ8UibwR

      Environment
      I have since replicated the GetFile behaviour on both CentOS 6 and 7, with CIFS mounts from a couple of different Windows servers. In the original case GetFile was CRON driven and sourced files from the Windows server folders. A timer-driven GetFile is even worse, since it builds up duplicates even faster.

      Trying to remove the file manually with rm in bash gives:
      rm: cannot remove 'Filename': Text file busy
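
      The repeated re-ingestion described above could in principle be avoided by remembering which files have already been read. The following is only a minimal sketch of that idea, not GetFile's actual code: the class SeenFileTrackerSketch and its methods are hypothetical names, the fingerprint (absolute path, size, last-modified time) is an assumption, and the state is held in memory only, whereas a real processor would have to persist it across restarts.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: remembers which file versions were already ingested. */
public class SeenFileTrackerSketch {

    // In-memory only; an assumption made to keep the sketch short.
    private final Set<String> seen = new HashSet<>();

    /** Builds an identity string from absolute path, size and last-modified time. */
    static String fingerprint(Path file) throws IOException {
        BasicFileAttributes attrs = Files.readAttributes(file, BasicFileAttributes.class);
        return file.toAbsolutePath() + "|" + attrs.size() + "|"
                + attrs.lastModifiedTime().toMillis();
    }

    /**
     * Returns true the first time a given file version is offered and false on
     * later calls, so an undeletable file is not ingested again on every run.
     */
    public boolean markIfNew(Path file) throws IOException {
        return seen.add(fingerprint(file));
    }
}
```

      With a check like this in front of the ingest step, a file that cannot be removed would be picked up once and then skipped on later runs until its size or modification time changes.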

      The most troubling part is that one locked file affects many others, depending on batch size; it is not only the locked file that is affected. The first occurrence of this with Site2Site dropped 9 files out of 10, because the last of those 10 had a lock.

      1) What is the expected behavior of GetFile in an edge case where it is unable to remove a source file? (Revert? Remember in state that the file has been read?)
      2) Why does a delete lock on one file prevent the other files in the same batch from being deleted? (They are loaded into NiFi as flowfiles, but are not deleted either.)
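
      To make question 2 concrete, here is a rough sketch of the behaviour I would have expected: each file in a batch is handled on its own, so a failed delete affects only that file. This is not NiFi's implementation. PerFileIngestSketch, Deleter and Result are hypothetical names, and the Deleter hook exists only so the delete step can be replaced, for example to simulate a locked file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of per-file failure isolation; not GetFile's code. */
public class PerFileIngestSketch {

    /** Hypothetical hook for the delete step, so it can be swapped out. */
    @FunctionalInterface
    public interface Deleter {
        void delete(Path file) throws IOException;
    }

    /** Outcome of one batch: ingested content, and files left in place. */
    public static final class Result {
        public final Map<Path, byte[]> ingested = new LinkedHashMap<>();
        public final List<Path> skipped = new ArrayList<>();
    }

    public static Result ingestBatch(List<Path> batch, Deleter deleter) {
        Result result = new Result();
        for (Path file : batch) {
            try {
                byte[] content = Files.readAllBytes(file);
                // May throw if the file is locked by another application.
                deleter.delete(file);
                // Content is kept only once the source file is really gone.
                result.ingested.put(file, content);
            } catch (IOException e) {
                // Only this file is skipped; its content is discarded and the
                // rest of the batch is still processed.
                result.skipped.add(file);
            }
        }
        return result;
    }
}
```

      Calling ingestBatch(batch, Files::delete) on a batch of ten files where the last one is locked should then ingest nine files and leave only the locked one behind, rather than affecting the whole batch.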

          People

            Assignee: Unassigned
            Reporter: Kefevs Pirkibo (kefevs)