I think the general pattern that all transitions need to follow is:
1) HLog that the RS intends to do some operation
2) Perform the operation in a way that is still undoable (e.g. create the compacted HFile but don't yet remove the old ones)
3) HLog that the RS has finished the action
4) Clean up from step 2 (e.g. remove the pre-compaction HFiles)
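
As a rough illustration of the four steps for the compaction case, here is a minimal in-memory sketch. Everything in it is hypothetical: RegionStore, compact, the COMPACT_BEGIN/COMPACT_END event names and the fail_after parameter are made up for the example and are not HBase APIs. A list stands in for the HLog and a set for the HFiles on disk.

```python
class RegionStore:
    """Toy stand-in for a region's HLog and HFile set (not HBase's API)."""

    def __init__(self, hfiles):
        self.hfiles = set(hfiles)  # files currently "on disk"
        self.hlog = []             # append-only (event, old_files, new_file) tuples


def compact(store, old_files, new_file, fail_after=None):
    """Run the four steps; fail_after=N simulates a crash right after step N."""
    old_files = tuple(sorted(old_files))

    # 1) Log the intent before touching any data.
    store.hlog.append(("COMPACT_BEGIN", old_files, new_file))
    if fail_after == 1:
        return

    # 2) Do the work undoably: add the new file, keep the old ones.
    store.hfiles.add(new_file)
    if fail_after == 2:
        return

    # 3) Log completion; from here on the transition rolls forward.
    store.hlog.append(("COMPACT_END", old_files, new_file))
    if fail_after == 3:
        return

    # 4) Clean up; discard() is a no-op for files already gone,
    #    so this step is safe to repeat.
    for f in old_files:
        store.hfiles.discard(f)
```

Calling compact(store, ["a", "b"], "c", fail_after=2) leaves all three files in place with only the COMPACT_BEGIN record logged, which is the state the master would later have to roll back.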
- Whenever an RS has failed, the master will open its HLog for append.
- This steals the write lease and increases the generation stamp on its last block.
- Thus the next time the RS attempts to hflush(), it will receive an IOException (I think a LeaseExpiredException to be specific?)
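
The fencing idea in these three points can be modelled with a toy log that tracks a generation number. This is only a sketch of the concept under simplifying assumptions: FencedLog, LogWriter and LeaseExpired are hypothetical names, and real HDFS lease recovery and generation stamps work at the block level with more moving parts than shown here.

```python
class LeaseExpired(IOError):
    """Hypothetical stand-in for the exception a fenced-off RS would see."""


class FencedLog:
    """Toy log whose generation is bumped each time someone opens it for append."""

    def __init__(self):
        self.generation = 0
        self.entries = []

    def open_for_append(self):
        # Taking over the log invalidates every writer opened earlier.
        self.generation += 1
        return LogWriter(self, self.generation)


class LogWriter:
    def __init__(self, log, generation):
        self.log = log
        self.generation = generation
        self.pending = []

    def append(self, entry):
        # Buffered locally until hflush().
        self.pending.append(entry)

    def hflush(self):
        # A writer holding a stale generation has lost the lease.
        if self.generation != self.log.generation:
            raise LeaseExpired("writer generation is stale")
        self.log.entries.extend(self.pending)
        self.pending.clear()
```

In this model, once the master calls open_for_append(), any later hflush() by the old RS writer raises, so the failed RS cannot log a new intent or completion record behind the master's back.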
Failure cases at each step:
Fail before 1) no problem, data isn't touched
Fail after 1 but before 3) the transition is in an indeterminate state. When the master recovers, it can roll back to the pre-transition state.
Fail after 3) when the master recovers, it can complete the "cleanup" transition for the regionserver (even if the regionserver got halfway through cleanup)
This pattern relies on cleanup being idempotent, and state transitions being undoable.
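
The failure cases above can be sketched as a single recovery function on the master side. This is a hypothetical illustration only: recover and the (event, old_files, new_file) record layout are assumptions of the example, not HBase's log format.

```python
def recover(hlog, hfiles):
    """Return the repaired HFile set, given a dead RS's log records.

    hlog is a list of (event, old_files, new_file) tuples; this layout
    is an assumption of the sketch.
    """
    hfiles = set(hfiles)
    begun = {}  # new_file -> old_files for transitions not yet finished

    for event, old_files, new_file in hlog:
        if event == "COMPACT_BEGIN":
            begun[new_file] = old_files
        elif event == "COMPACT_END":
            # Failed after step 3: roll forward by redoing the cleanup.
            begun.pop(new_file, None)
            hfiles.difference_update(old_files)

    # Failed after step 1 but before step 3: roll back by dropping
    # whatever partial output exists.
    for new_file in begun:
        hfiles.discard(new_file)

    return hfiles
```

Both branches only remove files that may or may not still exist, so running recover a second time on its own output returns the same set. That is the idempotence the pattern depends on, and it also covers the case where the master itself dies partway through recovery.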
The above examples are for the compaction case, but I think the same general ideas apply elsewhere.