  1. HBase
  2. HBASE-22665

RegionServer abort fails when AbstractFSWAL.shutdown hangs


    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Environment:

      HBase 2.1.2

      Hadoop 3.1.x

      CentOS 7.4

      Description

      We run HBase 2.1.2. Under heavy QPS, the RegionServer aborts with an error like "Caused by: org.apache.hadoop.hbase.exceptions.TimeoutIOException: Failed to get sync result after 300000 ms for txid=36380334, WAL system stuck?"

       

      The abort itself then never completes, because AbstractFSWAL.shutdown hangs.

       

      jstack output consistently shows the RegionServer thread blocked in AbstractFSWAL.shutdown:

      "regionserver/hbase-slave-216-99:16020" #25 daemon prio=5 os_prio=0 tid=0x00007f204282c600 nid=0x34aa waiting on condition [0x00007f0fe044d000]
         java.lang.Thread.State: WAITING (parking)
              at sun.misc.Unsafe.park(Native Method)
              - parking to wait for <0x00007f18a49b2bb8> (a java.util.concurrent.locks.ReentrantLock$FairSync)
              at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
              at java.util.concurrent.locks.ReentrantLock$FairSync.lock(ReentrantLock.java:224)
              at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
              at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.shutdown(AbstractFSWAL.java:815)
              at org.apache.hadoop.hbase.wal.AbstractFSWALProvider.shutdown(AbstractFSWALProvider.java:168)
              at org.apache.hadoop.hbase.wal.RegionGroupingProvider.shutdown(RegionGroupingProvider.java:221)
              at org.apache.hadoop.hbase.wal.WALFactory.shutdown(WALFactory.java:239)
              at org.apache.hadoop.hbase.regionserver.HRegionServer.shutdownWAL(HRegionServer.java:1445)
              at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1117)
              at java.lang.Thread.run(Thread.java:745)
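
The trace shows the RegionServer thread parked in an untimed ReentrantLock.lock() call on a fair lock, reached from AbstractFSWAL.shutdown. Which thread holds that lock is not visible in this excerpt. As an illustration only, here is a minimal Java sketch of the general pattern: if another thread holds the lock and never releases it, an untimed lock() waits forever, whereas a timed tryLock() lets the shutdown path give up and report failure. This is not HBase code and not a proposed fix. The class ShutdownLockSketch, its field rollLock, and its methods are all hypothetical names made up for this example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a component whose shutdown must take a lock.
// Not HBase code; names are invented for illustration.
public class ShutdownLockSketch {
    // Fair lock, matching the ReentrantLock$FairSync seen in the trace.
    private final ReentrantLock rollLock = new ReentrantLock(true);

    /**
     * Tries to shut down, waiting at most the given time for the lock.
     * Returns true if the lock was acquired and shutdown work ran,
     * false if the wait timed out (the caller can then escalate).
     */
    public boolean shutdown(long timeout, TimeUnit unit) throws InterruptedException {
        if (!rollLock.tryLock(timeout, unit)) {
            return false; // lock holder appears stuck; do not park forever
        }
        try {
            // Resource cleanup would go here.
            return true;
        } finally {
            rollLock.unlock();
        }
    }

    /** Simulates a stuck holder: keeps the lock until 'release' is counted down. */
    public Thread startStuckHolder(CountDownLatch held, CountDownLatch release) {
        Thread t = new Thread(() -> {
            rollLock.lock();
            try {
                held.countDown();   // signal that the lock is now held
                release.await();    // stay "stuck" until told otherwise
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                rollLock.unlock();
            }
        }, "stuck-holder");
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) throws Exception {
        ShutdownLockSketch wal = new ShutdownLockSketch();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread holder = wal.startStuckHolder(held, release);
        held.await();

        // While the holder is stuck, a timed attempt gives up instead of hanging.
        System.out.println("while held: " + wal.shutdown(200, TimeUnit.MILLISECONDS));

        release.countDown();
        holder.join();
        // Once the lock is free, shutdown proceeds.
        System.out.println("after release: " + wal.shutdown(200, TimeUnit.MILLISECONDS));
    }
}
```

With a plain lock() in place of tryLock(), the first shutdown call in main would park indefinitely, which is the same thread state (WAITING, parking on a ReentrantLock$FairSync) that the jstack output reports.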

       

       

       

       

        Attachments

        1. rs.log.part1
          7.57 MB
          Yechao Chen
        2. rs.log_part2.zip
          1.63 MB
          Yechao Chen
        3. jstack_20190704_2
          300 kB
          Yechao Chen
        4. jstack_20190704_1
          300 kB
          Yechao Chen
        5. jstack_20190625
          296 kB
          Yechao Chen
        6. image-2019-07-08-16-14-43-455.png
          66 kB
          Yechao Chen
        7. image-2019-07-08-16-08-26-777.png
          156 kB
          Yechao Chen
        8. image-2019-07-08-16-07-37-664.png
          69 kB
          Yechao Chen
        9. HBASE-22665-UT.patch
          7 kB
          Duo Zhang


              People

              • Assignee:
                Unassigned
                Reporter:
                chenyechao Yechao Chen
              • Votes:
                0
                Watchers:
                8
