Details
Sub-task
Status: Closed
Major
Resolution: Fixed
hbase-11339
None
Reviewed
Description
In a major compaction, the mvcc of every cell whose mvcc <= readPt is set to 0.
In some cases this causes problems, for example in the following scenario:
- We have a mob-enabled cf with a threshold of 5 bytes.
- Add a cell (r0,ts0,seqId=5,"mobValue0"), and flush it to a mob file.
- Add another cell (r0,ts0,seqId=10,"new"), and flush the memstore. This is not a mob cell, since its value is smaller than 5 bytes.
- Add a third cell (r1,ts1,seqId=15,"mobValue1"), and flush it to a mob file. Now we have two mob files.
- Run a major compaction on the hfiles. Two cells are left: (r0,ts0,seqId=0,"new") and (r1,ts1,seqId=0,"mobValue1").
- Run a mob major compaction. The two mob files are merged into one, and the updated ref cells are bulk loaded back into hbase: (r0,ts0,seqId=5,"mobValue0") and (r1,ts1,seqId=0,"mobValue1").
- Open a scanner. The value of r0 is "mobValue0", whereas the correct value is "new".
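This issue is caused by the mvcc reset in compactions: once the seqId of "new" has been reset to 0, the bulk-loaded ref cell with seqId=5 looks newer and shadows it. We should disable the reset in compactions for mob-enabled column families.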
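The scenario can be reproduced with a small toy model. This is only a sketch and does not use the HBase API: the `Cell` class, `majorCompact`, `read`, the timestamps, and the readPt value of 20 are all hypothetical names and values chosen for illustration. It assumes that, for cells with the same row and timestamp, a reader returns the one with the highest seqId, and it only tracks the ref cell for r0.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MvccResetSketch {
    // Toy cell, NOT the HBase Cell interface.
    public static final class Cell {
        final String row; final long ts; final long seqId; final String value;
        Cell(String row, long ts, long seqId, String value) {
            this.row = row; this.ts = ts; this.seqId = seqId; this.value = value;
        }
    }

    // True if a shadows b within the same row: newer timestamp first, then higher seqId.
    public static boolean newer(Cell a, Cell b) {
        return a.ts != b.ts ? a.ts > b.ts : a.seqId > b.seqId;
    }

    // Toy major compaction: keeps the winning cell per row and, if resetSeqId is
    // true, sets seqId to 0 for every kept cell whose seqId <= readPt.
    public static List<Cell> majorCompact(List<Cell> in, long readPt, boolean resetSeqId) {
        Map<String, Cell> winners = new HashMap<>();
        for (Cell c : in) {
            Cell cur = winners.get(c.row);
            if (cur == null || newer(c, cur)) winners.put(c.row, c);
        }
        List<Cell> out = new ArrayList<>();
        for (Cell c : winners.values()) {
            long seqId = (resetSeqId && c.seqId <= readPt) ? 0 : c.seqId;
            out.add(new Cell(c.row, c.ts, seqId, c.value));
        }
        return out;
    }

    // Toy scanner: returns the value of the winning cell for the given row.
    public static String read(List<Cell> cells, String row) {
        Cell best = null;
        for (Cell c : cells) {
            if (c.row.equals(row) && (best == null || newer(c, best))) best = c;
        }
        return best == null ? null : best.value;
    }

    // Replays the steps from the description and reads r0 at the end.
    public static String readR0AfterBothCompactions(boolean resetSeqId) {
        long ts0 = 100L, ts1 = 200L;  // arbitrary timestamps (assumption)
        List<Cell> hfiles = new ArrayList<>();
        hfiles.add(new Cell("r0", ts0, 5, "mobValue0"));   // ref cell to first mob file
        hfiles.add(new Cell("r0", ts0, 10, "new"));        // small value, not a mob cell
        hfiles.add(new Cell("r1", ts1, 15, "mobValue1"));  // ref cell to second mob file
        List<Cell> store = majorCompact(hfiles, 20L, resetSeqId);
        // The mob compaction bulk loads the r0 ref cell back with its original seqId.
        store.add(new Cell("r0", ts0, 5, "mobValue0"));
        return read(store, "r0");
    }

    public static void main(String[] args) {
        System.out.println(readR0AfterBothCompactions(true));   // prints mobValue0 (wrong)
        System.out.println(readR0AfterBothCompactions(false));  // prints new (correct)
    }
}
```

With the reset enabled, the stale ref cell (seqId=5) outranks "new" (seqId=0). With the reset disabled, "new" keeps seqId=10 and wins, which is the behaviour proposed for mob-enabled column families.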
Hi Jonathan Hsieh, Anoop Sam John, ramkrishna.s.vasudevan, would you mind reviewing the patch? Thanks a lot!