This is on branch-1 based code only.
Here's the sequence of events.
- A regionserver opens a new region. That region looks like it should split.
- So the regionserver starts a split transaction.
- Split transaction starts executing
- Split transaction encounters an error in stepsBeforePONR
- Split transaction starts rollback
- Split transaction notifies master that it's rolling back using HMasterRpcServices#reportRegionStateTransition
- AssignmentManager#onRegionTransition is called with SPLIT_REVERTED
- AssignmentManager#onRegionSplitReverted is called.
- That onlines the parent region and offlines the daughter regions.
However, the daughter regions were never created in meta, so the only effect is that the state for those rows gets set to OFFLINE. Now all clients trying to get the parent instead get the offline daughter.
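The effect can be sketched with a toy model. This is not HBase code: MetaSketch, its methods, and the simplified "table,startKey,regionId" row key are all hypothetical, and the lookup rule (take the closest row at or before the requested key) is an assumption about how a client resolves a row. Under those assumptions, writing OFFLINE state for daughters that never existed adds rows that sort after the parent and win the lookup.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model only: meta as a sorted map from a simplified row key to a state.
// Every name here is hypothetical and none of this is the real HBase API.
class MetaSketch {
    private final TreeMap<String, String> meta = new TreeMap<>();

    // Simplified row key; the id is zero-padded so string order matches numeric order.
    static String rowKey(String table, String startKey, long regionId) {
        return String.format("%s,%s,%013d", table, startKey, regionId);
    }

    void put(String table, String startKey, long regionId, String state) {
        meta.put(rowKey(table, startKey, regionId), state);
    }

    // Assumed client lookup: the closest row at or before "table,row,<max id>".
    Map.Entry<String, String> locate(String table, String row) {
        return meta.floorEntry(rowKey(table, row, 9999999999999L));
    }

    // Models the revert handling described above: online the parent, offline
    // both daughters, without checking that the daughters were ever written.
    void onSplitReverted(String table, String parentStart, long parentId,
                         String splitKey, long daughterId) {
        put(table, parentStart, parentId, "OPEN");
        put(table, parentStart, daughterId, "OFFLINE"); // daughter A, same start key as parent
        put(table, splitKey, daughterId, "OFFLINE");    // daughter B, starts at the split key
    }

    public static void main(String[] args) {
        MetaSketch m = new MetaSketch();
        m.put("t1", "", 1L, "OPEN"); // parent region covering the whole table

        Map.Entry<String, String> before = m.locate("t1", "k");
        System.out.println("before: " + before.getKey() + " " + before.getValue());

        // The split failed before the daughters were ever added to meta.
        m.onSplitReverted("t1", "", 1L, "m", 2L);

        Map.Entry<String, String> after = m.locate("t1", "k");
        System.out.println("after:  " + after.getKey() + " " + after.getValue());
    }
}
```

In this model the first lookup for row "k" resolves to the parent in state OPEN, while the second resolves to daughter A in state OFFLINE, even though the parent row is still present and OPEN.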