Description
This is a follow-on from HBASE-21083, which added the 'bypass' functionality. On bypass, there is more state to be cleared if we are to allow new Procedures to be scheduled.
For example, here is a bypass:
2018-09-20 05:45:43,722 INFO org.apache.hadoop.hbase.procedure2.Procedure: pid=100449, state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true, bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16, server=ve1233.halxg.cloudera.com,22101,1537397961664 bypassed, returning null to finish it
2018-09-20 05:45:44,022 INFO org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16, server=ve1233.halxg.cloudera.com,22101,1537397961664 in 2mins, 7.618sec
... but then when I try to assign the bypassed region later, I get this:
2018-09-20 05:46:31,435 WARN org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: There is already another procedure running on this region this=pid=100450, state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 owner=pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16, server=ve1233.halxg.cloudera.com,22101,1537397961664 pid=100450, state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16; rit=OPENING, location=ve1233.halxg.cloudera.com,22101,1537397961664
2018-09-20 05:46:31,510 INFO org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Rolled back pid=100450, state=ROLLEDBACK, exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via AssignProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException: There is already another procedure running on this region this=pid=100450, state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 owner=pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16, server=ve1233.halxg.cloudera.com,22101,1537397961664; AssignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 exec-time=473msec
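... which is a long-winded way of saying the bypassed UnassignProcedure (pid=100449, already in state=SUCCESS) is still registered as the region's owner in RegionStateNodes, so the new AssignProcedure is aborted and rolled back.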
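To make the failure mode concrete, here is a minimal sketch of the pattern the logs suggest. Every class and method name in it (Proc, RegionNode, trySetOwner, clearOwner, bypass) is a hypothetical stand-in invented for illustration, not the HBase API; it only assumes that per-region bookkeeping remembers an owning procedure and that bypass marks the procedure finished without touching that bookkeeping.

```java
// Hypothetical sketch: these classes are illustrative stand-ins, not HBase APIs.
class BypassOwnerSketch {
    enum State { RUNNABLE, SUCCESS }

    /** Minimal stand-in for a procedure: just a pid and a state. */
    static final class Proc {
        final long pid;
        State state = State.RUNNABLE;
        Proc(long pid) { this.pid = pid; }
    }

    /** Stand-in for per-region bookkeeping that remembers which procedure owns the region. */
    static final class RegionNode {
        private Proc owner;

        /** Registers p as owner; fails if a different procedure is still registered. */
        boolean trySetOwner(Proc p) {
            if (owner != null && owner != p) {
                return false;
            }
            owner = p;
            return true;
        }

        /** Drops the owner reference, but only if p is the registered owner. */
        void clearOwner(Proc p) {
            if (owner == p) {
                owner = null;
            }
        }
    }

    /** Marks p finished; clears the region bookkeeping only when clearRegionState is true. */
    static void bypass(Proc p, RegionNode node, boolean clearRegionState) {
        p.state = State.SUCCESS;
        if (clearRegionState) {
            node.clearOwner(p);
        }
    }

    /** Runs unassign, bypass, then assign, and reports whether the assign could take the region. */
    static boolean assignAfterBypass(boolean clearRegionState) {
        RegionNode node = new RegionNode();
        Proc unassign = new Proc(100449L);
        node.trySetOwner(unassign);
        bypass(unassign, node, clearRegionState);
        Proc assign = new Proc(100450L);
        return node.trySetOwner(assign);
    }

    public static void main(String[] args) {
        System.out.println("assign accepted without cleanup: " + assignAfterBypass(false));
        System.out.println("assign accepted with cleanup: " + assignAfterBypass(true));
    }
}
```

In this sketch the second procedure is refused whenever the finished one is left registered, which mirrors the "already another procedure running on this region" warning; clearing the owner as part of bypass is one possible shape for the fix, not necessarily the one the eventual patch takes.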
Attachments