HBase / HBASE-3159

Double play of OpenedRegionHandler for a single region; fails second time through and aborts Master

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.90.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Here is master log with annotations: http://people.apache.org/~stack/master.txt

      Region in question is:

      b8827a67a9d446f345095d25e1f375f7

      The running code is doctored: I've added a bit of logging (zk in particular), and I've also removed what I thought was provoking this condition, namely the reassign inside an assign when the server has gone away by the time we try the open RPC. (It turns out we hit the condition even without that code in place.)

      The log starts where the region in question times out in RIT (regions in transition).

      We assign it to 186.

      Notice how we see 'Handling transition' for this region TWICE. This means two OpenedRegionHandlers will be scheduled, which explains the failure: the second handler tries to delete a znode that is already gone.

      As best I can tell, the watcher for this region is triggered only once. That is odd: how then is OpenedRegionHandler scheduled twice? And why am I seeing only what I presume is an OPENED, rather than OPENING, OPENING, OPENED?
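
      To make the failure mode above concrete, here is a minimal sketch of why running the opened handling twice for one region fails on the second pass, and of one possible guard. It is not HBase code and not the fix that went into this issue: the class `DoubleOpenedSketch`, its methods, and the in-memory sets standing in for ZooKeeper znodes and the master's regions-in-transition bookkeeping are all hypothetical.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical illustration only; none of these names exist in HBase. */
class DoubleOpenedSketch {
    // In-memory stand-in for the per-region znodes in ZooKeeper.
    private final Set<String> znodes = ConcurrentHashMap.newKeySet();
    // In-memory stand-in for the master's regions-in-transition (RIT) set.
    private final Set<String> regionsInTransition = ConcurrentHashMap.newKeySet();

    /** Marks a region as in transition and creates its znode. */
    void startTransition(String region) {
        znodes.add(region);
        regionsInTransition.add(region);
    }

    /** Strict delete: throws if the znode is already gone. */
    private void deleteZnode(String region) {
        if (!znodes.remove(region)) {
            throw new IllegalStateException("znode already deleted for " + region);
        }
    }

    /** Unguarded handler: a second run for the same region throws. */
    void handleOpenedUnguarded(String region) {
        deleteZnode(region);
    }

    /**
     * Guarded handler: only the caller that removes the region from RIT
     * goes on to delete the znode; a duplicate call is ignored.
     * Returns true if this call did the work, false if it was a duplicate.
     */
    boolean handleOpenedGuarded(String region) {
        if (!regionsInTransition.remove(region)) {
            return false; // already handled, or never in transition
        }
        deleteZnode(region);
        return true;
    }

    public static void main(String[] args) {
        String region = "b8827a67a9d446f345095d25e1f375f7";
        DoubleOpenedSketch master = new DoubleOpenedSketch();
        master.startTransition(region);
        System.out.println("first guarded call handled: " + master.handleOpenedGuarded(region));
        System.out.println("second guarded call handled: " + master.handleOpenedGuarded(region));
    }
}
```

      The guard relies on `Set.remove` on a concurrent set returning true for exactly one caller, so the check and the claim happen in a single step. Whether dropping the duplicate silently is acceptable, rather than finding out why the transition was handled twice, is a separate question that this sketch does not answer.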

        Attachments

        1. hbase-meta-dupe-opened-master-only.txt
          8 kB
          Jonathan Gray
        2. hbase-meta-dupe-opened.txt
          15 kB
          Jonathan Gray
        3. TestRollingRestart-v4.patch
          25 kB
          Jonathan Gray
        4. master-root-assign-abort.log
          15 kB
          Jonathan Gray
        5. rs_death_on_meta_open_no_root.txt
          14 kB
          Jonathan Gray
        6. HBASE-3159-FINAL.patch
          30 kB
          Jonathan Gray

            People

            • Assignee: Jonathan Gray (streamy)
            • Reporter: stack
