Details
- Type: Sub-task
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
MergeTableRegionsProcedure seems incomplete. The ProcedureExecutor framework can run in a test mode in which it kills the Procedure before it can persist state, and it does this repeatedly to shake out places where Procedures may not be preserving all needed state at each step. The kill causes the Procedure to 'fail', and the framework then runs the Procedure's rollback. MergeTableRegionsProcedure is not able to roll back the last few steps of a merge; it throws an UnsupportedOperationException. (The hope was that the missing rollback steps would be filled in later, but they are hard to complete because they are themselves multi-step.)
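To make the kill-before-persist test mode concrete, here is a minimal toy model. It is not the HBase ProcedureExecutor: the class name, the step names, and the `run` method are all hypothetical, and the only "durable state" assumed is the index of the next step. It shows why a step has to tolerate running more than once when its progress was never stored.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these names are HBase classes or APIs.
class KillBeforeStoreSketch {
    // Hypothetical step names, loosely modelled on a merge.
    static final String[] STEPS = {"PREPARE", "CREATE_MERGED_REGION", "UPDATE_META"};

    /**
     * Runs every step in order. When killBeforeStore is true, each step is
     * "killed" once after it executes but before its progress is persisted,
     * so the step is executed a second time after the simulated restart.
     */
    static List<String> run(boolean killBeforeStore) {
        List<String> executed = new ArrayList<>();
        int persistedStep = 0;            // the only state that survives a kill
        boolean killedThisStep = false;
        while (persistedStep < STEPS.length) {
            executed.add(STEPS[persistedStep]);   // do the work of the step
            if (killBeforeStore && !killedThisStep) {
                killedThisStep = true;            // simulated kill: progress is NOT stored
                continue;                         // resume from the last persisted step
            }
            persistedStep++;                      // progress stored; move to the next step
            killedThisStep = false;
        }
        return executed;
    }

    public static void main(String[] args) {
        System.out.println(run(false));
        System.out.println(run(true));
    }
}
```

In this sketch every step appears twice in the second run, which is the kind of double execution a real step has to survive without losing or duplicating state.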
So....
Well, it turns out that Split has a mechanism whereby it will not fail the Procedure if it gets to a stage from which it cannot roll back. Instead, it just retries, and keeps retrying until it eventually succeeds. Merge has this facility only half-implemented, so the Merge tests are flaky. They do stuff like this:
2018-02-17 04:04:02,999 WARN [PEWorker-1] assignment.MergeTableRegionsProcedure(311): Failed rollback attempt step MERGE_TABLE_REGIONS_UPDATE_META for merging the regions [485dd0c2a5d14601d61fed791f793158, 8af34a614f064c162ab1d05eac7fca4c] in table testRollbackAndDoubleExecution
java.lang.UnsupportedOperationException: pid=44, state=FAILED:MERGE_TABLE_REGIONS_UPDATE_META, exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via MergeTableRegionsProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException: abort requested; MergeTableRegionsProcedure table=testRollbackAndDoubleExecution, regions=[485dd0c2a5d14601d61fed791f793158, 8af34a614f064c162ab1d05eac7fca4c], forcibly=false unhandled state=MERGE_TABLE_REGIONS_UPDATE_META
  at org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.rollbackState(MergeTableRegionsProcedure.java:291)
  at org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.rollbackState(MergeTableRegionsProcedure.java:78)
  at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:199)
  at org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:859)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1356)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1312)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1181)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1734)
2018-02-17 04:04:03,007 ERROR [PEWorker-1] helpers.MarkerIgnoringBase(159): CODE-BUG: Uncaught runtime exception for pid=44, state=FAILED:MERGE_TABLE_REGIONS_UPDATE_META, exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via MergeTableRegionsProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException: abort requested; MergeTableRegionsProcedure table=testRollbackAndDoubleExecution, regions=[485dd0c2a5d14601d61fed791f793158, 8af34a614f064c162ab1d05eac7fca4c], forcibly=false
java.lang.UnsupportedOperationException: pid=44, state=FAILED:MERGE_TABLE_REGIONS_UPDATE_META, exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via MergeTableRegionsProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException: abort requested; MergeTableRegionsProcedure table=testRollbackAndDoubleExecution, regions=[485dd0c2a5d14601d61fed791f793158, 8af34a614f064c162ab1d05eac7fca4c], forcibly=false unhandled state=MERGE_TABLE_REGIONS_UPDATE_META
  at org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.rollbackState(MergeTableRegionsProcedure.java:291)
  at org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.rollbackState(MergeTableRegionsProcedure.java:78)
  at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:199)
  at org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:859)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1356)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1312)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1181)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1734)
i.e. they throw up their hands, which makes for a CODE-BUG: a condition the framework cannot process. The test fails.
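The difference between the two behaviours can be sketched as follows. This is an illustration under stated assumptions, not the actual Split or Merge code: the `Step` enum, the set of steps treated as past the point of no return, and the `onFailure` helper are all hypothetical. With the check enabled, a failure at a late step leads to a retry; without it, rollback reaches a state it cannot handle and throws.

```java
import java.util.EnumSet;
import java.util.Set;

// Toy model only: the enum, class, and method names are hypothetical, not the HBase API.
class PointOfNoReturnSketch {
    enum Step { PREPARE, CLOSE_REGIONS, UPDATE_META, OPEN_MERGED_REGION }

    // Assumption for this sketch: these steps cannot be undone once started.
    static final Set<Step> NO_ROLLBACK =
        EnumSet.of(Step.UPDATE_META, Step.OPEN_MERGED_REGION);

    static boolean isRollbackSupported(Step step) {
        return !NO_ROLLBACK.contains(step);
    }

    static void rollbackStep(Step step) {
        if (!isRollbackSupported(step)) {
            // Mirrors the failure mode in the log above: no rollback path for this state.
            throw new UnsupportedOperationException("unhandled state=" + step);
        }
        // Undoing an early step would go here.
    }

    /** Decides what happens when a step fails: "ROLLBACK" or "RETRY". */
    static String onFailure(Step failedAt, boolean checkBeforeRollback) {
        if (checkBeforeRollback && !isRollbackSupported(failedAt)) {
            return "RETRY";   // past the point of no return: re-run the step instead
        }
        // Walk back from the failed step to the first one, undoing each.
        for (int i = failedAt.ordinal(); i >= 0; i--) {
            rollbackStep(Step.values()[i]);
        }
        return "ROLLBACK";
    }

    public static void main(String[] args) {
        System.out.println(onFailure(Step.CLOSE_REGIONS, true));
        System.out.println(onFailure(Step.UPDATE_META, true));
        try {
            onFailure(Step.UPDATE_META, false);
        } catch (UnsupportedOperationException e) {
            System.out.println("rollback failed: " + e.getMessage());
        }
    }
}
```

The design choice being illustrated is that the decision to retry has to be made before rollback is attempted; once rollback has started on a state with no undo path, the only outcome left is the exception.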