[NIFI-5204] When node joins cluster, if a processor is stopping but cluster says the state is disabled, node ends up in inconsistent state - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.7.0
Component/s: None
Labels: None

Description

In order to make this "easy" to replicate, I did the following:

1) Create a 2-node cluster.
2) On both nodes, update nifi.properties to set "nifi.variable.registry.properties" to "1.properties".
3) On both nodes, create 1.properties in $NIFI_HOME. For the first node, set "sleep=2 mins" and for the second node, set "sleep=0 millis".
4) Update DebugFlow to support Expression Language for the "@OnStopped Pause Time" property.
5) Configure a flow with a DebugFlow processor. The relationships can be auto-terminated and the run period set to "10 secs". Set "@OnStopped Pause Time" to "${sleep}".
6) Disable the DebugFlow processor.
7) Disconnect Node 1.
8) Go to Node 1 in the browser and start DebugFlow.
9) Stop DebugFlow.
10) While the processor is still "stopping", go back to Node 2 in the browser and request that Node 1 re-join the cluster.
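
For reference, steps 2 and 3 amount to roughly the following file contents. This is only a sketch: the comment lines are annotations, "Node 1" and "Node 2" are assumed to be the first and second node respectively, and the rest of each file is assumed to be left unchanged.

```properties
# nifi.properties (both nodes)
nifi.variable.registry.properties=1.properties

# $NIFI_HOME/1.properties on Node 1
sleep=2 mins

# $NIFI_HOME/1.properties on Node 2
sleep=0 millis
```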

Now, when Node 1 re-joins the cluster, it will attempt to disable the processor but won't be able to because the processor is still stopping. The following will be in the logs:

2018-05-16 15:21:50,986 WARN [Reconnect to Cluster] org.apache.nifi.controller.ProcessorNode Processor cannot be disabled because its state is set to STOPPING

So we now have a node in an inconsistent state.

Additionally, if we now go to Node 1 in the browser, deselect all components, and attempt to stop the process group, the replicated request attempts to stop the DebugFlow processor. Node 2 will fail to stop the processor because the processor is disabled on that node. As a result, Node 2 will be kicked out of the cluster.
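
The sequence described above can be illustrated with a small, self-contained model. This is a hypothetical sketch and not NiFi's actual ProcessorNode code: the class name `ProcessorStateSketch`, its methods, and the transition rules (disable succeeds only from STOPPED, stop is rejected only when DISABLED) are assumptions chosen to mirror the behavior reported in this issue.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical, simplified model of the scenario; not NiFi's ProcessorNode code.
class ProcessorStateSketch {
    enum State { DISABLED, STOPPED, RUNNING, STOPPING }

    private final AtomicReference<State> state;

    ProcessorStateSketch(State initial) {
        this.state = new AtomicReference<>(initial);
    }

    State getState() {
        return state.get();
    }

    // Assumption: disabling succeeds only from STOPPED, so a STOPPING processor refuses.
    boolean disable() {
        return state.compareAndSet(State.STOPPED, State.DISABLED);
    }

    // Assumption: a stop request is rejected only when the processor is DISABLED.
    boolean stop() {
        if (state.get() == State.DISABLED) {
            return false;
        }
        state.compareAndSet(State.RUNNING, State.STOPPING);
        return true;
    }

    // Models the moment the @OnStopped pause has elapsed.
    void finishStopping() {
        state.compareAndSet(State.STOPPING, State.STOPPED);
    }

    public static void main(String[] args) {
        // Node 1 is still stopping when it rejoins; the cluster (Node 2) says DISABLED.
        ProcessorStateSketch node1 = new ProcessorStateSketch(State.STOPPING);
        ProcessorStateSketch node2 = new ProcessorStateSketch(State.DISABLED);

        System.out.println("Node 1 disable on rejoin succeeded: " + node1.disable());
        node1.finishStopping();
        System.out.println("Node 1 state: " + node1.getState()
                + ", Node 2 state: " + node2.getState());

        // A replicated "stop" request is accepted by one node and rejected by the other.
        System.out.println("Replicated stop accepted by Node 1: " + node1.stop());
        System.out.println("Replicated stop accepted by Node 2: " + node2.stop());
    }
}
```

In this model the rejoining node's disable attempt is silently lost, so once the pause elapses the two nodes disagree on the processor's state, and a later replicated stop request succeeds on one node while failing on the other.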

Issue Links

links to

GitHub Pull Request #2713

People

Assignee: Mark Payne
Reporter: Mark Payne
Votes: 0
Watchers: 3

Dates

Created: 16/May/18 19:39
Updated: 17/May/18 17:46
Resolved: 17/May/18 17:46