HBase / HBASE-6070

AM.nodeDeleted and SSH races creating problems for regions under SPLIT

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.92.1, 0.94.0
    • Fix Version/s: 0.94.1, 0.95.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      As part of HBASE-5806 we tried to address the problems that occur when the Master or a RegionServer (RS) restarts while a region SPLIT is in progress.
      While investigating further we found that one race condition still remains:
      1. A split has just started and the znode is in the RS_SPLIT state.
      2. The RS goes down.
      3. The first callback for the ServerShutdownHandler (SSH) arrives.
      4. Because of the fix for HBASE-5806, SSH knows that a region is in transition (RIT).
      5. The nodeDeleted event now arrives for the SPLIT node, and in handling it we remove the region from RIT.
      6. After this, SSH checks whether any region is in RIT. Because the region is no longer found in RIT, it is never assigned.

      When we fixed HBASE-5806, step 6 happened before step 5, so we missed this case. A patch will follow shortly.
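
The ordering problem described above can be illustrated with a small, deliberately simplified model. This is only a sketch under stated assumptions, not HBase code: the class `SplitRaceSketch` and the methods `startSplit`, `onNodeDeleted`, `sshWouldAssign`, and `run` are hypothetical names, and the RIT bookkeeping is reduced to a plain set. It only shows why the outcome depends on whether the nodeDeleted handling or the SSH check runs first.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model of the race; none of these names are HBase APIs.
class SplitRaceSketch {
    // Stand-in for the master's regions-in-transition (RIT) bookkeeping.
    private final Set<String> regionsInTransition = new HashSet<>();

    // A split starts: the region is recorded as being in transition.
    void startSplit(String region) {
        regionsInTransition.add(region);
    }

    // Stand-in for the nodeDeleted handling (step 5): the region is dropped from RIT.
    void onNodeDeleted(String region) {
        regionsInTransition.remove(region);
    }

    // Stand-in for the SSH check (step 6): in this model SSH only
    // assigns a region that it still finds in RIT.
    boolean sshWouldAssign(String region) {
        return regionsInTransition.contains(region);
    }

    // Replays the two callbacks in the chosen order and reports
    // whether the SSH check would have assigned the region.
    static boolean run(boolean nodeDeletedFirst) {
        SplitRaceSketch master = new SplitRaceSketch();
        String region = "region-A"; // illustrative region name
        master.startSplit(region);
        if (nodeDeletedFirst) {
            master.onNodeDeleted(region);          // step 5 first
            return master.sshWouldAssign(region);  // then step 6
        }
        boolean assigned = master.sshWouldAssign(region); // step 6 first
        master.onNodeDeleted(region);                      // then step 5
        return assigned;
    }

    public static void main(String[] args) {
        System.out.println("SSH check first: assigned = " + run(false));
        System.out.println("nodeDeleted first: assigned = " + run(true));
    }
}
```

In this model the region is picked up only when the SSH check runs before the nodeDeleted handling, which matches the ordering seen while fixing HBASE-5806 and explains why the other ordering went unnoticed.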

      Attachments

        1. HBASE-6070_trunk.patch
          10 kB
          ramkrishna.s.vasudevan
        2. HBASE-6070_trunk_1.patch
          10 kB
          ramkrishna.s.vasudevan
        3. HBASE-6070_0.94.patch
          9 kB
          ramkrishna.s.vasudevan
        4. HBASE-6070_0.94_1.patch
          9 kB
          ramkrishna.s.vasudevan
        5. HBASE-6070_0.92.patch
          3 kB
          ramkrishna.s.vasudevan
        6. HBASE-6070_0.92_1.patch
          3 kB
          ramkrishna.s.vasudevan

          People

            Assignee: ramkrishna.s.vasudevan (ram_krish)
            Reporter: ramkrishna.s.vasudevan (ram_krish)
            Votes: 0
            Watchers: 6
