HBASE-9268: Client doesn't recover from a stalled region server

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.95.2
    • Fix Version/s: 0.98.0, 0.96.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Got this testing the 0.95.2 RC.

      I suspended a region server with kill -STOP and left it that way while running PE (PerformanceEvaluation). The clients never found the new region locations, and jstack showed them stuck in RPC calls. Eventually I resumed the server with kill -CONT, and the clients printed this:

      Exception in thread "TestClient-6" java.lang.RuntimeException: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 128 actions: IOException: 90 times, SocketTimeoutException: 38 times,
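
      To illustrate the failure mode in isolation, here is a minimal sketch that uses plain Java sockets rather than any HBase API. It assumes that a process frozen by kill -STOP behaves roughly like a listener that never calls accept(): the kernel still completes the TCP handshake, but nothing ever replies, so a client read blocks until its socket timeout fires. The class name StalledServerDemo and the method readTimesOut are hypothetical names chosen for this example, and whether the SocketTimeoutException entries in the report come from this exact mechanism is an assumption.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class: not part of HBase.
public class StalledServerDemo {

    /**
     * Connects to a listener that never answers and reports whether the
     * blocking read gave up with a SocketTimeoutException.
     */
    public static boolean readTimesOut(int readTimeoutMs) throws IOException {
        // The listener is bound but accept() is never called, so the kernel
        // completes the handshake while no application code ever replies.
        // Assumption: this loosely mimics a server frozen by kill -STOP.
        try (ServerSocket stalled =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket()) {
            client.connect(
                new InetSocketAddress(stalled.getInetAddress(),
                                      stalled.getLocalPort()),
                1000);
            // Without a read timeout, the read below would block indefinitely.
            client.setSoTimeout(readTimeoutMs);
            InputStream in = client.getInputStream();
            try {
                in.read(); // blocks: nothing will ever be sent
                return false;
            } catch (SocketTimeoutException e) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("timed out: " + readTimesOut(500));
    }
}
```

      The point of the sketch is that a suspended peer looks alive at the TCP level, so a client only notices the stall if its reads are bounded by a timeout and it reacts to that timeout, for example by looking up the region location again.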

        Attachments

        1. 9268-hack.patch
          0.8 kB
          Nicolas Liochon
        2. 9268.v1.patch
          2 kB
          Nicolas Liochon

            People

            • Assignee: Nicolas Liochon (nkeywal)
            • Reporter: Jean-Daniel Cryans (jdcryans)
            • Votes: 0
            • Watchers: 9
