Apache NiFi / NIFI-9217

Possible deadlock when node is disconnected

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.15.0
    • Component/s: Core Framework
    • Labels: None

    Description

      When offloading a node, I encountered a deadlock. Grabbing a thread dump shows the following two threads are in a deadlock:

      "Disconnect from Cluster" Id=167 BLOCKED  on org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener@1f9bfb51 ** DEADLOCKED THREAD ** ** MONITOR-DEADLOCKED THREAD **
              at org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener.setLeader(CuratorLeaderElectionManager.java:530)
              at org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener.disable(CuratorLeaderElectionManager.java:497)
              at org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager.unregister(CuratorLeaderElectionManager.java:182)
              - waiting on org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager@7aa4e8dc
              at org.apache.nifi.controller.FlowController.onClusterDisconnect(FlowController.java:2456)
              at org.apache.nifi.controller.FlowController.setClustered(FlowController.java:2434)
              at org.apache.nifi.controller.StandardFlowService.disconnect(StandardFlowService.java:771)
              at org.apache.nifi.controller.StandardFlowService.handleDisconnectionRequest(StandardFlowService.java:752)
              at org.apache.nifi.controller.StandardFlowService.access$400(StandardFlowService.java:112)
              at org.apache.nifi.controller.StandardFlowService$3.run(StandardFlowService.java:425)
              at java.lang.Thread.run(Thread.java:748)
              Number of Locked Synchronizers: 2
              - java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@3d3a1057
              - java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@3d3cb514
      
      
      "Process Cluster Protocol Request-3" Id=165 BLOCKED  on org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager@7aa4e8dc ** DEADLOCKED THREAD ** ** MONITOR-DEADLOCKED THREAD **
              at org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager.registerPollTime(CuratorLeaderElectionManager.java:304)
              at org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager.getLeader(CuratorLeaderElectionManager.java:293)
              at org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener.verifyLeader(CuratorLeaderElectionManager.java:556)
              at org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener.isLeader(CuratorLeaderElectionManager.java:510)
              - waiting on org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener@1f9bfb51
              at org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$LeaderRole.isLeader(CuratorLeaderElectionManager.java:451)
              at org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager.isLeader(CuratorLeaderElectionManager.java:261)
              at org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.isActiveClusterCoordinator(NodeClusterCoordinator.java:823)
              at org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.handleNodeStatusChange(NodeClusterCoordinator.java:1168)
              at org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.handle(NodeClusterCoordinator.java:1097)
              at org.apache.nifi.cluster.protocol.impl.SocketProtocolListener.dispatchRequest(SocketProtocolListener.java:176)
              at org.apache.nifi.io.socket.SocketListener$2$1.run(SocketListener.java:131)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:748)
              Number of Locked Synchronizers: 1
              - java.util.concurrent.ThreadPoolExecutor$Worker@c4ecdba
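
Reading the dump, this looks like a lock-order inversion between two monitors: thread 167 appears to have entered unregister() holding the CuratorLeaderElectionManager monitor and is blocked on the ElectionListener monitor, while thread 165 appears to be inside ElectionListener.isLeader() holding the listener monitor and is blocked on the manager monitor in registerPollTime(). A minimal Java sketch of one common way to break such a cycle follows. The Manager and Listener classes are hypothetical stand-ins, not NiFi's code, and this is not necessarily how NIFI-9217 was actually fixed. The idea is to avoid calling into the listener while the manager monitor is held, so no thread ever waits for the listener while owning the manager.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LockOrderSketch {

    // Hypothetical stand-in for the outer manager object (not NiFi's class).
    static final class Manager {
        private final Listener listener = new Listener(this);
        private long pollCount;

        // Invoked by the listener while the listener monitor is held,
        // so the order on this path is: Listener monitor -> Manager monitor.
        synchronized void registerPollTime() {
            pollCount++;
        }

        // Deadlock-prone shape would be a synchronized unregister() that calls
        // listener.disable(), giving the opposite order: Manager -> Listener.
        // Here the reference is read under the lock and the call is made
        // after the Manager monitor has been released.
        void unregister() {
            final Listener toDisable;
            synchronized (this) {
                toDisable = listener;
            }
            toDisable.disable();
        }

        boolean isLeader() {
            return listener.isLeader();
        }
    }

    // Hypothetical stand-in for the per-role listener (not NiFi's class).
    static final class Listener {
        private final Manager manager;
        private boolean leader = true;

        Listener(Manager manager) {
            this.manager = manager;
        }

        synchronized void disable() {
            leader = false;
        }

        synchronized boolean isLeader() {
            manager.registerPollTime(); // takes the Manager monitor second
            return leader;
        }
    }

    // Runs both code paths concurrently and reports whether both threads
    // finished within the timeout.
    static boolean runConcurrently(int iterations, long timeoutSeconds) {
        final Manager manager = new Manager();
        final CountDownLatch done = new CountDownLatch(2);

        Thread disconnect = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                manager.unregister();
            }
            done.countDown();
        }, "disconnect");

        Thread protocol = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                manager.isLeader();
            }
            done.countDown();
        }, "protocol-request");

        // Daemon threads so a stuck run cannot keep the JVM alive.
        disconnect.setDaemon(true);
        protocol.setDaemon(true);
        disconnect.start();
        protocol.start();

        try {
            return done.await(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("completed=" + runConcurrently(100_000, 10));
    }
}
```

With this shape the only nested acquisition is Listener then Manager, so the two paths cannot form a cycle. The trade-off is that unregister() is no longer atomic with respect to other manager operations, which may or may not be acceptable in the real code.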
      

            People

              pgrey Paul Grey
              markap14 Mark Payne

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1.5h