There's currently a bug in the way we handle a tablet copy that replaces an existing tombstoned tablet:
- a tablet exists in TABLET_DATA_TOMBSTONED state
- we begin copying a new replica on top of this one
- this calls TabletMetadata::ReplaceSuperBlock() using the remote superblock (importantly, this remote superblock contains remote block IDs)
- we crash mid-copy
- on restart, we see the "TABLET_DATA_COPYING" state and "roll forward" the deletion of this tablet. However, the block IDs in the superblock are the IDs from the remote machine, so we incorrectly delete any local blocks that happen to share those IDs.
This has always been an issue, but was made worse in 0.10 by the fix for KUDU-1538. After fixing KUDU-1538, the likelihood of a remote block ID matching a local one is quite high, whereas before we'd usually not see this bug.
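
The sequence above can be illustrated with a toy model. This is only a sketch, not Kudu code: the `ToyServer` class, its methods, and the integer block IDs are hypothetical, and it assumes that the roll-forward simply deletes every block ID listed in the persisted superblock.

```python
# Toy model of the failure sequence. All names are hypothetical and are
# not Kudu's actual API; only the state names are taken from the report.
TOMBSTONED = "TABLET_DATA_TOMBSTONED"
COPYING = "TABLET_DATA_COPYING"
READY = "TABLET_DATA_READY"


class ToyServer:
    def __init__(self):
        self.blocks = {}       # local block ID -> tablet that owns it
        self.superblocks = {}  # tablet ID -> {"state": ..., "block_ids": [...]}

    def begin_copy(self, tablet_id, remote_superblock):
        # The problematic step: the remote superblock is persisted as-is,
        # so it still lists block IDs from the remote machine.
        self.superblocks[tablet_id] = {
            "state": COPYING,
            "block_ids": list(remote_superblock["block_ids"]),
        }

    def restart(self):
        # Assumed roll-forward: delete every block the superblock lists,
        # then mark the tablet tombstoned again.
        for sb in self.superblocks.values():
            if sb["state"] == COPYING:
                for block_id in sb["block_ids"]:
                    self.blocks.pop(block_id, None)
                sb["block_ids"] = []
                sb["state"] = TOMBSTONED


server = ToyServer()
# tablet-B is a healthy local tablet that owns local blocks 1, 2 and 3.
server.blocks = {1: "tablet-B", 2: "tablet-B", 3: "tablet-B"}
server.superblocks["tablet-B"] = {"state": READY, "block_ids": [1, 2, 3]}
# tablet-A is tombstoned and is about to be replaced by a copy.
server.superblocks["tablet-A"] = {"state": TOMBSTONED, "block_ids": []}

# The remote replica of tablet-A uses block IDs 2, 3 and 4 on its own machine.
server.begin_copy("tablet-A", {"block_ids": [2, 3, 4]})
# ... crash mid-copy, before any remote ID is replaced by a local one ...
server.restart()

lost = [b for b in server.superblocks["tablet-B"]["block_ids"]
        if b not in server.blocks]
print(lost)  # [2, 3]
```

In this model, tablet-B loses blocks 2 and 3 even though it was never involved in the copy, purely because the remote IDs collide with local ones.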