3 files with-
- 1 with a key, k, with timestamp 0, value 3
- 1 with a delete of k with timestamp 1
- 1 with k with timestamp 2, value 2
The column of k has a summing combiner set on it. The issue here is that depending on how the major compactions play out, differing values with result. If all 3 files compact, the correct value of 2 will result. However, if 1 & 3 compact first, they will aggregate to 5. And then the delete will fall after the combined value, resulting in the result 5 to persist.
First and foremost, this should be documented. I think to remedy this, combiners should only be used on full MajC, not not full ones. This may necessitate a special flag or a new combiner that implemented the proper semantics.