Apache NiFi / NIFI-7059

Upon restart of a NiFi node in a clustered environment, prioritizers may not be properly set

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.8.0, 1.9.0, 1.10.0, 1.9.1, 1.9.2, 1.11.0
    • Fix Version/s: 1.12.0, 1.11.1
    • Component/s: Core Framework
    • Labels: None

    Description

      When a NiFi node is restarted, the queue prioritizers sometimes do not take effect, even though the UI shows that they are selected. This can lead to data being improperly ordered.
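The symptom can be illustrated with a minimal sketch. Every class and method name below is hypothetical and none of this is NiFi's code; it only assumes, following the commit message on this issue, that a queue forwards prioritizers to the partitions it currently knows about and that the local partition may be registered late.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical partition: holds the prioritizers it would use to order its data.
class SketchPartition {
    private List<String> priorities = new ArrayList<>();

    void setPriorities(final List<String> newPriorities) {
        priorities = new ArrayList<>(newPriorities);
    }

    List<String> getPriorities() {
        return new ArrayList<>(priorities);
    }
}

// Hypothetical queue that records prioritizers and forwards them to its partitions.
class SketchPartitionedQueue {
    private final SketchPartition localPartition = new SketchPartition();
    private final List<SketchPartition> knownPartitions = new ArrayList<>();
    private final boolean alwaysUpdateLocal;
    private List<String> priorities = new ArrayList<>();

    SketchPartitionedQueue(final boolean alwaysUpdateLocal) {
        this.alwaysUpdateLocal = alwaysUpdateLocal;
    }

    // Assumed to run only once the node has learned its own identity.
    synchronized void onLocalNodeIdentified() {
        if (!knownPartitions.contains(localPartition)) {
            knownPartitions.add(localPartition);
        }
    }

    synchronized void setPriorities(final List<String> newPriorities) {
        // The queue itself always remembers the selection (what a UI would display).
        priorities = new ArrayList<>(newPriorities);
        // Forwarding reaches only the partitions registered so far.
        for (final SketchPartition partition : knownPartitions) {
            partition.setPriorities(newPriorities);
        }
        // One possible remedy (an assumption, not the actual patch): always update the local partition.
        if (alwaysUpdateLocal) {
            localPartition.setPriorities(newPriorities);
        }
    }

    synchronized List<String> getPriorities() {
        return new ArrayList<>(priorities);
    }

    SketchPartition getLocalPartition() {
        return localPartition;
    }
}

class PrioritizerRaceDemo {
    public static void main(final String[] args) {
        final List<String> selected = Arrays.asList("PriorityAttributePrioritizer");

        // Prioritizers are set before the local node is identified.
        final SketchPartitionedQueue racy = new SketchPartitionedQueue(false);
        racy.setPriorities(selected);
        racy.onLocalNodeIdentified();
        System.out.println("queue: " + racy.getPriorities()
                + ", local partition: " + racy.getLocalPartition().getPriorities());

        final SketchPartitionedQueue direct = new SketchPartitionedQueue(true);
        direct.setPriorities(selected);
        direct.onLocalNodeIdentified();
        System.out.println("queue: " + direct.getPriorities()
                + ", local partition: " + direct.getLocalPartition().getPriorities());
    }
}
```

In the first case the queue reports the selected prioritizer while the local partition's list stays empty, which mirrors the reported behavior. In the second case both agree.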

          Activity

            joewitt Joe Witt added a comment -

            +1 will merge to master

            jira-bot ASF subversion and git services added a comment -

            Commit be34767c8a0670bce564f6aac4490e3250817675 in nifi's branch refs/heads/master from Mark Payne
            [ https://gitbox.apache.org/repos/asf?p=nifi.git;h=be34767 ]

            NIFI-7059: This closes #4007. Fixed bug that results in priorities not properly being set in the SocketLoadBalancedFlowFileQueue. Even though the queue's setPriorities method was called, the underlying data structure may not have contained the localPartition. As a result, when setPriorities() was called, it did not properly delegate that call to the local partition, so the queue knew that the Prioritizers were set but the local queue did not apply them. This happened due to a race condition between queue creation and NodeClusterCoordinator learning the Local Node Identifier.

            Additionally, updated NodeClusterCoordinator so that it persists its state when it learns the Local Node Identifier, as that was not being persisted.

            In testing this, also encountered a deadlock in a particular Integration Test, around AbstractFlowFileQueue and its synchronization. Because the 'synchronized' keyword synchronizes on 'this', and the concrete implementation also uses 'synchronized' on the same 'this', a deadlock can occur that is difficult to understand. As a result, refactored AbstractFlowFileQueue to instead use read/write locks.

            Signed-off-by: Joe Witt <joewitt@apache.org>
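The move from 'synchronized' to read/write locks described in the commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual AbstractFlowFileQueue code: the class names, fields, and methods are all hypothetical. The point it shows is that a base class guarding its state with a private ReentrantReadWriteLock no longer shares the 'this' monitor with a subclass that uses 'synchronized'.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical base class standing in for an abstract queue.
abstract class SketchQueue {
    // Private lock objects: subclasses cannot acquire them by accident,
    // unlike the monitor on 'this' that 'synchronized' methods use.
    private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();
    private final Lock readLock = rwLock.readLock();
    private final Lock writeLock = rwLock.writeLock();

    private List<String> priorities = new ArrayList<>();

    public void setPriorities(final List<String> newPriorities) {
        writeLock.lock();
        try {
            priorities = new ArrayList<>(newPriorities);
        } finally {
            writeLock.unlock();
        }
    }

    public List<String> getPriorities() {
        // Multiple readers may hold the read lock at the same time.
        readLock.lock();
        try {
            return new ArrayList<>(priorities);
        } finally {
            readLock.unlock();
        }
    }
}

// Hypothetical concrete queue that still uses 'synchronized' for its own state.
// Its monitor on 'this' is now independent of the base class's locks.
class SketchLoadBalancedQueue extends SketchQueue {
    private int partitionCount;

    public synchronized void setPartitionCount(final int count) {
        partitionCount = count;
    }

    public synchronized int getPartitionCount() {
        return partitionCount;
    }
}
```

With this layout, a thread holding the subclass monitor does not block callers of getPriorities(), and the locking done by the base class is visible in one place rather than being implied by a shared 'this'.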

            People

              Assignee: markap14 Mark Payne
              Reporter: markap14 Mark Payne
              Votes: 0
              Watchers: 3

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 0.5h