We recently had a production issue in which HFiles that were still in use by a table were deleted. This appears to have been caused by race conditions in the order in which HFileLinks are created, combined with the fact that only files younger than hbase.master.hfilecleaner.ttl are kept alive.
This is how to reproduce:
- Clone a large snapshot into a new table. The clone operation must take more than hbase.master.hfilecleaner.ttl time to guarantee data loss.
- Ensure that no other table or snapshot is referencing the HFiles used by the new table.
- Delete the snapshot. This breaks the table.
The main cause is this:
- Cloning a snapshot creates the table in the HBASE_TEMP_DIRECTORY.
- However, it immediately creates back references to the HFileLinks that it creates for the table in the archive directory.
- HFileLinkCleaner does not check the HBASE_TEMP_DIRECTORY, so it considers all those back references deletable.
- The only thing that keeps them alive is the TimeToLiveHFileCleaner, but only for 5 minutes.
- So if cloning the snapshot takes more than 5 minutes, and the HFiles aren't referenced by anything else, data loss is guaranteed.
I have a unit test reproducing the issue and I tried to fix this, but didn't completely succeed. I will attach the patch shortly.
- Don't delete any snapshots that you cloned into a table (we used this successfully-- we actually restored the deleted snapshot from backup using ExportSnapshot after the data loss happened, which successfully reversed the data loss).
- Manually check the back references and create any missing ones after cloning a snapshot.
- Increase hbase.master.hfilecleaner.ttl. (untested)