HBase / HBASE-9480

Regions are unexpectedly made offline in certain failure conditions

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.98.0, 0.96.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Came across this issue (HBASE-9338 test):
      1. A client issues a request to move a region from ServerA to ServerB.
      2. ServerA is compacting that region and doesn't close it immediately. In fact, the close request takes a while to complete.
      3. In the meantime, the master sends another close request.
      4. ServerA responds with a NotServingRegionException.
      5. The master handles the exception, deletes the znode, and invokes regionOffline for that region.
      6. ServerA fails to operate on ZK in the CloseRegionHandler because the znode has been deleted.

      The region is permanently offline.

      There are potentially other situations with the same outcome: when a RegionServer is offline and the client asks to move a region off that server, the master marks the region offline.
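
The six-step sequence above can be replayed in a small in-memory model. This is only a sketch for reasoning about the race: `RegionCloseRace`, `FakeZk`, `replay`, and the `guardInFlightClose` flag are hypothetical names, not HBase APIs, and the guard shown is one possible mitigation assumed for illustration, not necessarily what the attached patches do.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy replay of the race in the description; all names are hypothetical, not HBase APIs. */
class RegionCloseRace {
    enum State { OPEN, PENDING_CLOSE, OFFLINE, CLOSED }

    /** Stand-in for ZooKeeper: region name -> znode payload. */
    static final class FakeZk {
        final Map<String, String> znodes = new HashMap<>();
    }

    /**
     * Replays steps 1-6. When guardInFlightClose is true, the master ignores a
     * NotServingRegionException for a region whose close it already requested
     * (an assumed guard, not the confirmed fix).
     * Returns the region's final state as seen by the master.
     */
    static State replay(boolean guardInFlightClose) {
        FakeZk zk = new FakeZk();
        String region = "region-1";
        State masterView = State.OPEN;

        // Step 1: a move is requested; the master creates a znode and asks ServerA to close.
        zk.znodes.put(region, "CLOSING");
        masterView = State.PENDING_CLOSE;

        // Step 2: ServerA is compacting, so the first close is still in flight.
        boolean closeInFlightOnServerA = true;

        // Steps 3-4: the master sends another close; ServerA answers NotServingRegionException.
        boolean gotNotServing = closeInFlightOnServerA;

        // Step 5: the master handles the exception by deleting the znode and offlining the region.
        boolean skipBecauseCloseInFlight = guardInFlightClose && masterView == State.PENDING_CLOSE;
        if (gotNotServing && !skipBecauseCloseInFlight) {
            zk.znodes.remove(region);
            masterView = State.OFFLINE;
        }

        // Step 6: ServerA finishes compacting and its close handler tries to update the znode.
        if (zk.znodes.containsKey(region)) {
            zk.znodes.put(region, "CLOSED");
            masterView = State.CLOSED; // the master can now reassign the region to ServerB
        }
        // Otherwise the znode is gone, the handler fails, and nothing reassigns the region.
        return masterView;
    }

    public static void main(String[] args) {
        System.out.println("naive master:   " + replay(false));
        System.out.println("guarded master: " + replay(true));
    }
}
```

In this model the unguarded path leaves the region `OFFLINE` with no znode for the close handler to update, which matches the stuck state described above; with the assumed guard the close completes and the region ends `CLOSED`, ready for reassignment.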

      Attachments

        1. 9480-1.txt (1 kB, Devaraj Das)
        2. trunk-9480_v1.1.patch (17 kB, Jimmy Xiang)
        3. trunk-9480_v1.2.patch (19 kB, Jimmy Xiang)
        4. trunk-9480_v2.patch (22 kB, Jimmy Xiang)
        5. trunk-9480.patch (6 kB, Jimmy Xiang)

          People

            Assignee: Jimmy Xiang (jxiang)
            Reporter: Devaraj Das (ddas)
            Votes: 0
            Watchers: 11