Uploaded image for project: 'HBase'
  1. HBase
  2. HBASE-3368

Split message can come in before region opened message; results in 'Region has been PENDING_CLOSE for too long' cycle

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Critical
    • Resolution: Duplicate
    • None
    • 0.92.0
    • None
    • None
    • Reviewed

    Description

      Another good one. Look at these excerpts from master log:

      2010-12-16 00:49:45,749 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: TestTable,0078922610,1292373363753.490b382bae33642d12cd717b5785698b.: Daughters; TestTable,0078922610,1292460584999.c8b95dfc9a671083bafdaa0341279777., TestTable,0078933586,  
      1292460584999.7cc636c9a7274eec4e784df2efebbca3. from XXX185,60020,1292460570976
      ....
      2010-12-16 00:49:46,132 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Opened region TestTable,0078922610,1292373363753.490b382bae33642d12cd717b5785698b. on XXX185,60020,1292460570976
      

      ... so the split will have cleared the parent from in-memory data structures and then the open handler will add them back (though region is offlined, split).

      Then the balancer runs....... only no one is holding the region thats being balanced.

      Over on XXX185 I see the open and then split at these times:

      2010-12-16 00:49:43,740 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opened TestTable,0078922610,1292373363753.490b382bae33642d12cd717b5785698b.
      .....
      2010-12-16 00:49:45,003 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of region TestTable,0078922610,1292373363753.490b382bae33642d12cd717b5785698b.
      
      

      So, the fact that it takes the Master a while to get around to the zk watcher processing messes us up.

      Root problem is that we're using two different message buses, zk and then heartbeat. Intent is to do all over zk and remove hearbeat but looking at what to do for 0.90.0.

      Attachments

        1. 3368-v2.txt
          18 kB
          Michael Stack
        2. 3368.txt
          15 kB
          Michael Stack

        Issue Links

          Activity

            People

              stack Michael Stack
              stack Michael Stack
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: