  1. HBase
  2. HBASE-23613

ProcedureExecutor check StuckWorkers blocked by DeadServerMetricRegionChore

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.2
    • Fix Version/s: 3.0.0-alpha-1, 2.3.0, 2.2.3
    • Component/s: None
    • Labels: None

      Description

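      After debugging, I found that the WorkerMonitor in ProcedureExecutor did not execute for a while because it was blocked by DeadServerMetricRegionChore.
      The TimeoutExecutorThread executes not only WorkerMonitor but also DeadServerMetricRegionChore, RegionInTransitionChore... In the thread dump below, "ProcExecTimeout" is parked waiting for a RegionStateNode lock inside DeadServerMetricRegionChore, while "PEWorker-1" is waiting in an HTable.put retry while persisting a region location to meta.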

      "ProcExecTimeout" #1052 daemon prio=5 os_prio=0 tid=0x00007f5c98cc4000 nid=0x229 waiting on condition [0x00007f5c2f857000]
         java.lang.Thread.State: WAITING (parking)
              at sun.misc.Unsafe.park(Native Method)
              - parking to wait for  <0x00000005c312ad80> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
              at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
              at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
              at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
              at org.apache.hadoop.hbase.master.assignment.RegionStateNode.lock(RegionStateNode.java:313)
              at org.apache.hadoop.hbase.master.assignment.AssignmentManager$DeadServerMetricRegionChore.periodicExecute(AssignmentManager.java:1186)
              at org.apache.hadoop.hbase.master.assignment.AssignmentManager$DeadServerMetricRegionChore.periodicExecute(AssignmentManager.java:1163)
              at org.apache.hadoop.hbase.procedure2.TimeoutExecutorThread.executeInMemoryChore(TimeoutExecutorThread.java:120)
              at org.apache.hadoop.hbase.procedure2.TimeoutExecutorThread.execDelayedProcedure(TimeoutExecutorThread.java:99)
              at org.apache.hadoop.hbase.procedure2.TimeoutExecutorThread.run(TimeoutExecutorThread.java:66)
      
      "PEWorker-1" #1053 daemon prio=5 os_prio=0 tid=0x00007f5c98cc5800 nid=0x22a in Object.wait() [0x00007f5c2f756000]
         java.lang.Thread.State: TIMED_WAITING (on object monitor)
              at java.lang.Object.wait(Native Method)
              at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:168)
              - locked <0x00000005839f18b0> (a java.util.concurrent.atomic.AtomicBoolean)
              at org.apache.hadoop.hbase.client.HTable.put(HTable.java:540)
              at org.apache.hadoop.hbase.master.assignment.RegionStateStore.updateRegionLocation(RegionStateStore.java:209)
              at org.apache.hadoop.hbase.master.assignment.RegionStateStore.updateUserRegionLocation(RegionStateStore.java:203)
              at org.apache.hadoop.hbase.master.assignment.RegionStateStore.updateRegionLocation(RegionStateStore.java:141)
              at org.apache.hadoop.hbase.master.assignment.AssignmentManager.persistToMeta(AssignmentManager.java:1742)
              at org.apache.hadoop.hbase.master.assignment.RegionRemoteProcedureBase.execute(RegionRemoteProcedureBase.java:298)
              at org.apache.hadoop.hbase.master.assignment.RegionRemoteProcedureBase.execute(RegionRemoteProcedureBase.java:58)
              at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:962)
              at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1648)
              at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1395)
              at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1100(ProcedureExecutor.java:78)
              at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1965)
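
      The starvation described above can be reproduced outside HBase with a minimal sketch. It is not HBase code: the class SharedTimeoutThreadDemo, the method monitorRunsWhileLockHeld, and the variable names are hypothetical, and a single-threaded ScheduledExecutorService stands in for TimeoutExecutorThread. The sketch also assumes that a worker thread holds the region lock during a slow operation, which the thread dump suggests but does not prove.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class SharedTimeoutThreadDemo {

  /** Returns how often the monitor task ran while another thread held the lock. */
  static int monitorRunsWhileLockHeld(long holdMillis) throws InterruptedException {
    // Hypothetical stand-in for the lock taken via RegionStateNode.lock().
    ReentrantLock regionLock = new ReentrantLock();
    AtomicInteger monitorRuns = new AtomicInteger();
    CountDownLatch lockHeld = new CountDownLatch(1);
    CountDownLatch release = new CountDownLatch(1);

    // Assumed stand-in for a PEWorker that keeps the lock during a slow meta update.
    Thread worker = new Thread(() -> {
      regionLock.lock();
      try {
        lockHeld.countDown();
        release.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        regionLock.unlock();
      }
    }, "worker");
    worker.start();
    lockHeld.await();

    // One thread runs every chore, like the shared timeout thread in the report.
    ScheduledExecutorService timeoutThread = Executors.newSingleThreadScheduledExecutor();

    // The "chore" is queued first and blocks on the lock with an unbounded lock().
    timeoutThread.execute(() -> {
      regionLock.lock();
      regionLock.unlock();
    });
    // The "monitor" is queued behind it and cannot run until the chore returns.
    timeoutThread.scheduleAtFixedRate(
        monitorRuns::incrementAndGet, 0, 10, TimeUnit.MILLISECONDS);

    Thread.sleep(holdMillis);
    int runsWhileBlocked = monitorRuns.get();

    release.countDown(); // let the worker finish so the chore can acquire the lock
    worker.join();
    timeoutThread.shutdown();
    timeoutThread.awaitTermination(1, TimeUnit.SECONDS);
    return runsWhileBlocked;
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("monitor runs while chore was blocked: "
        + monitorRunsWhileLockHeld(200));
  }
}
```

      Because the chore calls lock() with no timeout, every task queued behind it on the same thread waits for as long as the lock is held. In this sketch, a non-blocking tryLock() in the chore, or a separate thread for the monitor, would keep the monitor running. Whether either matches the fix committed for this issue is not stated here.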
      

              People

              • Assignee: Lijin Bin (binlijin)
              • Reporter: Lijin Bin (binlijin)