HBase / HBASE-2545

Unresponsive region server, potential deadlock

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.20.4
    • Fix Version/s: 0.20.5, 0.90.0
    • Component/s: regionserver
    • Labels: None
    • Environment: Ubuntu 8.04.4 LTS, Hadoop 0.20.2, Amazon EC2 x-large cluster
    • Hadoop Flags: Reviewed

    Description

      We have a 15-node HBase cluster (14 region servers plus 1 master) that we recently upgraded from 0.20.3 to 0.20.4. The cluster has colocated Hadoop MapReduce, but we mostly use a separate MapReduce cluster to hit it. After startup, the cluster runs jobs fine for about an hour, and then one region server appears to lock up. A get for a row in a region served by that region server hangs (we cannot even Ctrl+C out of the HBase shell). The thread dump is attached. We verified in the UI that the affected server is running 0.20.4 and not 0.20.3.
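
      For readers who want to check a suspected deadlock from inside a JVM rather than by reading a thread dump, the following is a minimal sketch using the standard java.lang.management API. It is not taken from this issue or from HBase; the class name DeadlockCheck and the method report() are hypothetical names chosen for illustration. It only finds threads blocked in a lock cycle within the JVM it runs in, so a hang with another cause (for example, a thread waiting on I/O) would not be reported.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of HBase: reports threads the JVM considers deadlocked.
public class DeadlockCheck {

    /** Returns a description of deadlocked threads, or an empty string if none are found. */
    public static String report() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // findDeadlockedThreads() returns null when no lock cycle is detected.
        long[] ids = mx.findDeadlockedThreads();
        if (ids == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        // Request locked monitors and ownable synchronizers so the output shows who holds what.
        for (ThreadInfo info : mx.getThreadInfo(ids, true, true)) {
            if (info != null) { // a thread may have exited since the check
                sb.append(info);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String r = report();
        System.out.println(r.isEmpty() ? "no deadlock detected" : r);
    }
}
```

      To inspect a separate process such as a region server, the same ThreadMXBean would have to be reached over JMX, or a thread dump taken externally as the reporter did.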

      Attachments

        1. hbase-hadoop-regionserver-mi-prod-hbase05.ec2.biz360.com.out
          221 kB
          Kris Jirapinyo
        2. hbase-2545.txt
          0.3 kB
          Todd Lipcon
        3. hbase-2545.txt
          0.8 kB
          Todd Lipcon
        4. hbase-2545.txt
          0.8 kB
          Todd Lipcon
        5. hbase-2545.txt
          5 kB
          Todd Lipcon
        6. 2545-trunk.txt
          5 kB
          Michael Stack

          People

            Assignee: Todd Lipcon (tlipcon)
            Reporter: Kris Jirapinyo (kjirapinyo)
            Votes: 0
            Watchers: 2
