Details
-
Bug
-
Status: Closed
-
Minor
-
Resolution: Duplicate
-
None
-
None
-
None
-
None
Description
In case if timestamps for KeyValues specified differently for different column families, then TIMERANGEs of both HFiles would be wrong.
Example (in pseudo code): my reducer has a condition
if ( condition ) {
keyValue = new KeyValue(.., CF1, .., timestamp, ..);
} else {
keyValue = new KeyValue(.., CF2, .., ..); // <- no timestamp
}
context.write( keyValue );
These two keyValues would be written into two different HFiles.
But the code, which is actually write do the following:
// we now have the proper HLog writer. full steam ahead
kv.updateLatestStamp(this.now);
trt.includeTimestamp(kv);
wl.writer.append(kv);
Basically, two HFiles shares the same instance of trt (TimeRangeTracker), which leads to the same TIMERANGEs of both of them. Which is definitely incorrect, because first HFile must have TIMERANGE=timestamp...timestamp, cause we do not write any other timestamps there. And another HFile must have TIMERANGE=now...now by same meaning.