[NIFI-7352] Improve PutFile State Handling - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: Core Framework
Labels:
- Processor
- PutFile

Flags: Important

Description

Currently PutFile has three conflict resolution strategies: REPLACE, IGNORE, and FAIL. REPLACE writes the new file to disk over the old file and transfers the flow file to SUCCESS. FAIL does not replace the file on disk and transfers the flow file to FAIL. IGNORE does not replace the file on disk and transfers the flow file to SUCCESS.

This breakdown is worse than unhelpful: it actively invites misunderstanding and misuse. It is very easy to assume that IGNORE instead writes to disk but keeps both the original and the new file by appending a distinguishing suffix to the filename, similar to how filename conflicts are handled in other programs. I have personal experience with this misinterpretation causing a project to drop data for an extended period of time without realizing it. Additionally, the FAIL strategy is less useful than it could be, because its output is indistinguishable from other failure causes, such as the target folder not existing or a lack of write permissions.
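To make the behavior described above concrete, here is a minimal sketch in plain Java. It is not NiFi's implementation and uses no NiFi classes; `PutFileConflictSketch`, `Result`, and `resolve` are hypothetical names, and the output names follow the wording of this ticket.

```java
public class PutFileConflictSketch {

    /** The three conflict resolution options named in this ticket. */
    public enum Strategy { REPLACE, IGNORE, FAIL }

    /** Hypothetical result type: whether bytes reached the disk, and which output the flow file goes to. */
    public static final class Result {
        public final boolean written;
        public final String output;

        Result(boolean written, String output) {
            this.written = written;
            this.output = output;
        }

        @Override
        public String toString() {
            return "written=" + written + ", output=" + output;
        }
    }

    /** Models only the filename-conflict decision; other failure causes are out of scope here. */
    public static Result resolve(Strategy strategy, boolean targetExists) {
        if (!targetExists) {
            // No conflict: the file is written regardless of strategy.
            return new Result(true, "SUCCESS");
        }
        switch (strategy) {
            case REPLACE:
                return new Result(true, "SUCCESS");
            case IGNORE:
                // Nothing is written, yet the flow file is still reported as a success.
                return new Result(false, "SUCCESS");
            case FAIL:
                return new Result(false, "FAIL");
            default:
                throw new IllegalArgumentException("Unknown strategy: " + strategy);
        }
    }

    public static void main(String[] args) {
        // Print the outcome of each strategy when the target file already exists.
        for (Strategy s : Strategy.values()) {
            System.out.println(s + ": " + resolve(s, true));
        }
    }
}
```

The IGNORE branch is the one at issue: downstream of SUCCESS, a flow file whose content was never written looks identical to one that was.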

Desired result: there should be a way to get a greater degree of detail about the outcome from a PutFile processor. The easiest option from a user perspective would be to change the output relationships to include a "FAIL_DUPLICATE" output, as opposed to a single generic "FAIL" output. This would remove the need for "IGNORE", since the same effect could be achieved by handling "FAIL_DUPLICATE" as desired, most likely by auto-terminating that relationship. Barring that, an attribute added to the flow file on output could give a better indication of what happened in the processor: was the file ignored, or written to disk? If it failed, what was the cause: a duplicate filename, a missing write permission, or a folder that does not exist?
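The two options above could look roughly like the following sketch, again in plain Java with no NiFi dependency. Everything here is an assumption made for illustration: the `Outcome` values other than FAIL_DUPLICATE, the order of the checks, and the attribute name `putfile.outcome` are hypothetical and are not part of any existing NiFi API.

```java
import java.util.Collections;
import java.util.Map;

public class PutFileOutcomeSketch {

    /** Hypothetical outcomes, each mapped to the output it would be routed to. */
    public enum Outcome {
        WRITTEN("SUCCESS"),
        FAIL_DUPLICATE("FAIL_DUPLICATE"),   // the new output proposed in this ticket
        FAIL_MISSING_DIRECTORY("FAIL"),     // other causes keep the generic output
        FAIL_NO_WRITE_PERMISSION("FAIL");

        public final String output;

        Outcome(String output) {
            this.output = output;
        }
    }

    /** Hypothetical attribute name for the attribute-based alternative. */
    public static final String OUTCOME_ATTRIBUTE = "putfile.outcome";

    /** Classifies one write attempt; the check order is an assumption of this sketch. */
    public static Outcome classify(boolean directoryExists, boolean writable, boolean targetExists) {
        if (!directoryExists) {
            return Outcome.FAIL_MISSING_DIRECTORY;
        }
        if (!writable) {
            return Outcome.FAIL_NO_WRITE_PERMISSION;
        }
        if (targetExists) {
            return Outcome.FAIL_DUPLICATE;
        }
        return Outcome.WRITTEN;
    }

    /** Attribute a flow file could carry so downstream processors can tell the causes apart. */
    public static Map<String, String> attributesFor(Outcome outcome) {
        return Collections.singletonMap(OUTCOME_ATTRIBUTE, outcome.name());
    }

    public static void main(String[] args) {
        Outcome outcome = classify(true, true, true);
        System.out.println(outcome.output + " " + attributesFor(outcome));
    }
}
```

With a dedicated output, auto-terminating FAIL_DUPLICATE would reproduce what IGNORE does today, but as an explicit choice in the flow. With only the attribute, the same distinction is available to downstream routing without changing the existing outputs.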

A note on backwards compatibility: I think the NiFi team is more likely to take the attribute route, since it avoids breaking backwards compatibility. However, I would caution that this also means teams that are using "IGNORE" with an incorrect understanding of what that option does will continue to be unaware that they are dropping data.

People

Assignee: Unassigned

Reporter: Frederick Pletz

Votes: 0

Watchers: 2

Dates

Created: 10/Apr/20 16:15

Updated: 10/Apr/20 16:33