FLINK-34672

HA deadlock between JobMasterServiceLeadershipRunner and DefaultLeaderElectionService


Details

    Description

      We recently observed a deadlock in the JobManager (JM) within the HA system; the thread dump is included below.

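      mapohl and I looked into it, and there appears to be a race condition when leadership is revoked while a JobMaster is being started.
      It appears to be caused by JobMasterServiceLeadershipRunner#createNewJobMasterServiceProcess forwarding futures while holding a lock; depending on whether the forwarded future is already complete, the next stage may or may not run while holding that same lock.
      In the thread dump, jobmanager-io-thread-1 holds the runner's lock (<0x00000000f1c0e088>) in forwardIfValidLeader and waits for the leader election service's lock (<0x00000000f0e3f4d8>) in hasLeadership, while the leadershipOperationExecutor thread holds the election service's lock and waits for the runner's lock in runIfStateRunning.
      We have not yet determined whether that lock needs to be held at that point.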
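
The dependence on whether the forwarded future is already complete can be shown outside of Flink. The following is a minimal sketch, not the Flink code: the class ForwardUnderLock, the method ranUnderLock, and the LOCK object are hypothetical names chosen for illustration. It registers a whenComplete callback while holding a lock and records whether the callback itself ran with that lock held.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;

public class ForwardUnderLock {
    // Hypothetical stand-in for the runner's lock; not the Flink field.
    private static final Object LOCK = new Object();

    /**
     * Registers a callback on {@code source} while holding LOCK, then runs
     * {@code completeSource} after releasing it. Returns whether the
     * callback executed while the executing thread held LOCK.
     */
    static boolean ranUnderLock(CompletableFuture<String> source, Runnable completeSource) {
        AtomicBoolean heldLock = new AtomicBoolean();
        CompletableFuture<String> forwarded;
        synchronized (LOCK) {
            // If source is already complete, the callback runs right here,
            // on this thread, inside the synchronized block.
            forwarded = source.whenComplete(
                    (value, error) -> heldLock.set(Thread.holdsLock(LOCK)));
        }
        completeSource.run();  // no-op when source was already complete
        forwarded.join();      // wait until the callback has run
        return heldLock.get();
    }

    public static void main(String[] args) {
        // Already complete: the callback runs on the registering thread, under LOCK.
        CompletableFuture<String> done = CompletableFuture.completedFuture("x");
        System.out.println(ranUnderLock(done, () -> {}));  // prints "true"

        // Completed later by another thread: the callback runs there, without LOCK.
        CompletableFuture<String> pending = new CompletableFuture<>();
        System.out.println(ranUnderLock(pending,
                () -> CompletableFuture.runAsync(() -> pending.complete("x")).join()));  // prints "false"
    }
}
```

The same callback therefore either inherits the caller's lock or runs lock-free on the completing thread, which makes any lock acquired inside the callback order-sensitive.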

      "DefaultLeaderElectionService-leadershipOperationExecutor-thread-1" #131 daemon prio=5 os_prio=0 cpu=157.44ms elapsed=78749.65s tid=0x00007f531f43d000 nid=0x19d waiting for monitor entry  [0x00007f53084fd000]
         java.lang.Thread.State: BLOCKED (on object monitor)
              at org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.runIfStateRunning(JobMasterServiceLeadershipRunner.java:462)
              - waiting to lock <0x00000000f1c0e088> (a java.lang.Object)
              at org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.revokeLeadership(JobMasterServiceLeadershipRunner.java:397)
              at org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService.notifyLeaderContenderOfLeadershipLoss(DefaultLeaderElectionService.java:484)
              at org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService$$Lambda$1252/0x0000000840ddec40.accept(Unknown Source)
              at java.util.HashMap.forEach(java.base@11.0.22/HashMap.java:1337)
              at org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService.onRevokeLeadershipInternal(DefaultLeaderElectionService.java:452)
              at org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService$$Lambda$1251/0x0000000840dcf840.run(Unknown Source)
              at org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService.lambda$runInLeaderEventThread$3(DefaultLeaderElectionService.java:549)
              - locked <0x00000000f0e3f4d8> (a java.lang.Object)
              at org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService$$Lambda$1075/0x0000000840c23040.run(Unknown Source)
              at java.util.concurrent.CompletableFuture$AsyncRun.run(java.base@11.0.22/CompletableFuture.java:1736)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.22/ThreadPoolExecutor.java:1128)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.22/ThreadPoolExecutor.java:628)
              at java.lang.Thread.run(java.base@11.0.22/Thread.java:829)
      
      "jobmanager-io-thread-1" #636 daemon prio=5 os_prio=0 cpu=125.56ms elapsed=78699.01s tid=0x00007f5321c6e800 nid=0x396 waiting for monitor entry  [0x00007f530567d000]
         java.lang.Thread.State: BLOCKED (on object monitor)
              at org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService.hasLeadership(DefaultLeaderElectionService.java:366)
              - waiting to lock <0x00000000f0e3f4d8> (a java.lang.Object)
              at org.apache.flink.runtime.leaderelection.DefaultLeaderElection.hasLeadership(DefaultLeaderElection.java:52)
              at org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.isValidLeader(JobMasterServiceLeadershipRunner.java:509)
              at org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.lambda$forwardIfValidLeader$15(JobMasterServiceLeadershipRunner.java:520)
              - locked <0x00000000f1c0e088> (a java.lang.Object)
              at org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner$$Lambda$1320/0x0000000840e1a840.accept(Unknown Source)
              at java.util.concurrent.CompletableFuture.uniWhenComplete(java.base@11.0.22/CompletableFuture.java:859)
              at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(java.base@11.0.22/CompletableFuture.java:837)
              at java.util.concurrent.CompletableFuture.postComplete(java.base@11.0.22/CompletableFuture.java:506)
              at java.util.concurrent.CompletableFuture.complete(java.base@11.0.22/CompletableFuture.java:2079)
              at org.apache.flink.runtime.jobmaster.DefaultJobMasterServiceProcess.registerJobMasterServiceFutures(DefaultJobMasterServiceProcess.java:124)
              at org.apache.flink.runtime.jobmaster.DefaultJobMasterServiceProcess.lambda$new$0(DefaultJobMasterServiceProcess.java:114)
              at org.apache.flink.runtime.jobmaster.DefaultJobMasterServiceProcess$$Lambda$1319/0x0000000840e1a440.accept(Unknown Source)
              at java.util.concurrent.CompletableFuture.uniWhenComplete(java.base@11.0.22/CompletableFuture.java:859)
              at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(java.base@11.0.22/CompletableFuture.java:837)
              at java.util.concurrent.CompletableFuture.postComplete(java.base@11.0.22/CompletableFuture.java:506)
              at java.util.concurrent.CompletableFuture$AsyncSupply.run(java.base@11.0.22/CompletableFuture.java:1705)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.22/ThreadPoolExecutor.java:1128)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.22/ThreadPoolExecutor.java:628)
              at java.lang.Thread.run(java.base@11.0.22/Thread.java:829)
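
The two stacks above form a lock-order inversion: each thread holds the monitor the other one is waiting for. As a rough illustration only, the sketch below reproduces that shape with two plain monitors and asks the JVM to report the cycle. The class LockOrderInversion, its method names, and the thread names are hypothetical and unrelated to the Flink classes; the two local lock objects merely play the roles of the monitors in the dump.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.LockSupport;

public class LockOrderInversion {

    /**
     * Starts two daemon threads that take two monitors in opposite order and
     * returns how many threads the JVM reports as monitor-deadlocked
     * (0 if nothing is reported within five seconds).
     */
    static int reproduce() {
        Object electionLock = new Object(); // role of <0x00000000f0e3f4d8>
        Object runnerLock = new Object();   // role of <0x00000000f1c0e088>
        CountDownLatch bothHoldFirst = new CountDownLatch(2);

        startDaemon("election-thread", electionLock, runnerLock, bothHoldFirst);
        startDaemon("io-thread", runnerLock, electionLock, bothHoldFirst);

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long deadline = System.nanoTime() + 5_000_000_000L;
        while (System.nanoTime() < deadline) {
            long[] ids = threads.findMonitorDeadlockedThreads();
            if (ids != null) {
                return ids.length;
            }
            LockSupport.parkNanos(1_000_000L); // brief pause before polling again
        }
        return 0;
    }

    private static void startDaemon(
            String name, Object first, Object second, CountDownLatch bothHoldFirst) {
        Thread thread = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                awaitQuietly(bothHoldFirst); // ensure both threads hold their first lock
                synchronized (second) {
                    // Never reached: the other thread holds 'second' and waits for 'first'.
                }
            }
        }, name);
        thread.setDaemon(true); // allow the JVM to exit despite the deadlock
        thread.start();
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + reproduce());
    }
}
```

The latch only makes the interleaving deterministic for the demonstration; in the reported incident the same ordering arose from timing between the leadership revocation and the JobMaster start.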
      

            People

              mapohl Matthias Pohl
              chesnay Chesnay Schepler