HBase / HBASE-2575

Fault scenario of dead root drive on RS causes cluster lockup

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Cannot Reproduce
    • Affects Version/s: 0.90.0
    • Fix Version/s: None
    • Component/s: regionserver
    • Labels: None

    Description

      We performed a fault test in which we physically pulled the root drive out of a machine while it was running. The regionserver continued to serve existing clients without problems, but new clients that tried to connect to it for RPC did not work correctly: when I started a new client, it made no progress. Despite this, the RS kept heartbeating to the master, so the master did not remove it from the cluster. Note that in this case we were logging to NFS; the logs continued to be written, but no exceptions were shown.

          People

            Assignee: Unassigned
            Reporter: Todd Lipcon (tlipcon)
            Votes: 1
            Watchers: 6
