HBase / HBASE-20671

Merged region brought back to life causing RS to be killed by Master


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 2.0.0
    • Fix Version/s: None
    • Component/s: amv2
    • Labels: None

      Description

      Another bug coming out of a master restart and replay of the Procedure v2 (pv2) logs.

      The master successfully merged two regions into one and was then restarted. After the restart, it assigned the two original, pre-merge regions back out to the cluster. A log message emitted while replaying the pv2 WAL appears to show RegionStates acknowledging that it has no state for these regions; however, it incorrectly assumes each region is simply OFFLINE and needs to be assigned.

      2018-05-30 04:26:00,055 INFO  [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=20000] master.HMaster: Client=hrt_qa//172.27.85.11 Merge regions a7dd6606dcacc9daf085fc9fa2aecc0c and 4017a3c778551d4d258c785d455f9c0b
      2018-05-30 04:28:27,525 DEBUG [master/ctr-e138-1518143905142-336066-01-000003:20000] procedure2.ProcedureExecutor: Completed pid=4368, state=SUCCESS; MergeTableRegionsProcedure table=tabletwo_merge, regions=[a7dd6606dcacc9daf085fc9fa2aecc0c, 4017a3c778551d4d258c785d455f9c0b], forcibly=false
      
      2018-05-30 04:29:20,263 INFO  [master/ctr-e138-1518143905142-336066-01-000003:20000] assignment.AssignmentManager: a7dd6606dcacc9daf085fc9fa2aecc0c regionState=null; presuming OFFLINE
      2018-05-30 04:29:20,263 INFO  [master/ctr-e138-1518143905142-336066-01-000003:20000] assignment.RegionStates: Added to offline, CURRENTLY NEVER CLEARED!!! rit=OFFLINE, location=null, table=tabletwo_merge, region=a7dd6606dcacc9daf085fc9fa2aecc0c
      2018-05-30 04:29:20,266 INFO  [master/ctr-e138-1518143905142-336066-01-000003:20000] assignment.AssignmentManager: 4017a3c778551d4d258c785d455f9c0b regionState=null; presuming OFFLINE
      2018-05-30 04:29:20,266 INFO  [master/ctr-e138-1518143905142-336066-01-000003:20000] assignment.RegionStates: Added to offline, CURRENTLY NEVER CLEARED!!! rit=OFFLINE, location=null, table=tabletwo_merge, region=4017a3c778551d4d258c785d455f9c0b
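
      The presumption visible in these log lines can be modeled with a minimal sketch. This is not the HBase code and not a confirmed fix: every class, method, and field name below is hypothetical, and the "merge check" variant only illustrates the kind of guard that would stop an already-merged region from being treated as assignable.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of the replay behaviour; none of these names are HBase APIs. */
public class MergedRegionReplaySketch {
    enum State { OFFLINE, OPEN, MERGED }

    // Stand-in for the master's in-memory region state map (assumption).
    private final Map<String, State> regionStates = new HashMap<>();
    // Stand-in for whatever bookkeeping records already-merged regions (assumption).
    private final Set<String> mergedRegions;

    MergedRegionReplaySketch(Set<String> mergedRegions) {
        this.mergedRegions = mergedRegions;
    }

    /** Behaviour seen in the log: a region with no known state is presumed OFFLINE. */
    State presumeOffline(String region) {
        return regionStates.computeIfAbsent(region, r -> State.OFFLINE);
    }

    /** Illustrative guard: consult the merge bookkeeping before presuming OFFLINE. */
    State resolveWithMergeCheck(String region) {
        if (mergedRegions.contains(region)) {
            return regionStates.computeIfAbsent(region, r -> State.MERGED);
        }
        return presumeOffline(region);
    }

    /** In this model, only OFFLINE regions are handed back out to the cluster. */
    static boolean wouldAssign(State state) {
        return state == State.OFFLINE;
    }

    public static void main(String[] args) {
        String a = "a7dd6606dcacc9daf085fc9fa2aecc0c";
        String b = "4017a3c778551d4d258c785d455f9c0b";
        Set<String> merged = Set.of(a, b);

        // Without the guard, the merged region is presumed OFFLINE and becomes assignable.
        System.out.println(wouldAssign(new MergedRegionReplaySketch(merged).presumeOffline(a)));
        // With the guard, the same region resolves to MERGED and is not assigned.
        System.out.println(wouldAssign(new MergedRegionReplaySketch(merged).resolveWithMergeCheck(a)));
    }
}
```

      In this model the first call treats the merged region as assignable and the second does not, which is the difference between the observed behaviour and the expected one.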
      

      Eventually, the RS reports its online regions to the master, and the master, which does not consider the region online, tells the RS to kill itself:

      2018-05-30 04:29:24,272 WARN  [RpcServer.default.FPBQ.Fifo.handler=26,queue=2,port=20000] assignment.AssignmentManager: Killing ctr-e138-1518143905142-336066-01-000002.hwx.site,16020,1527654546619: Not online: tabletwo_merge,,1527652130538.a7dd6606dcacc9daf085fc9fa2aecc0c.
      

              People

              • Assignee: Josh Elser (elserj)
              • Reporter: Josh Elser (elserj)
