Details
Type: Sub-task
Status: Closed
Priority: Major
Resolution: Fixed
Description
Previously, the date range was used for two purposes: 1) to cut data from the source; 2) to mark the min/max data time for segment pruning. For a streaming source, however, these two concepts are separate. For example, offsets are used to cut data from Kafka, and the min/max data times of segments can overlap because of late-arriving records.
This change will add two more attributes to CubeSegment: sourceOffsetStart and sourceOffsetEnd. To stay backward compatible, when the two attributes are missing (equal to 0), dateRangeStart and dateRangeEnd will serve as the source offsets.
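
The fallback rule described above could look roughly like the following sketch. It is not the actual CubeSegment code: the class SegmentSketch and the methods effectiveSourceOffsetStart/effectiveSourceOffsetEnd are hypothetical names used only for illustration, and treating "missing" as "both offsets equal 0" is an assumption based on the wording of this description.

```java
// Hypothetical stand-in for CubeSegment; only the four attributes named
// in the description are modeled here.
final class SegmentSketch {
    private final long dateRangeStart;
    private final long dateRangeEnd;
    private final long sourceOffsetStart; // 0 when absent (older metadata)
    private final long sourceOffsetEnd;   // 0 when absent (older metadata)

    SegmentSketch(long dateRangeStart, long dateRangeEnd,
                  long sourceOffsetStart, long sourceOffsetEnd) {
        this.dateRangeStart = dateRangeStart;
        this.dateRangeEnd = dateRangeEnd;
        this.sourceOffsetStart = sourceOffsetStart;
        this.sourceOffsetEnd = sourceOffsetEnd;
    }

    // Assumption: the offsets count as missing only when both are 0.
    private boolean sourceOffsetsMissing() {
        return sourceOffsetStart == 0L && sourceOffsetEnd == 0L;
    }

    // Falls back to the date range for segments without source offsets.
    long effectiveSourceOffsetStart() {
        return sourceOffsetsMissing() ? dateRangeStart : sourceOffsetStart;
    }

    long effectiveSourceOffsetEnd() {
        return sourceOffsetsMissing() ? dateRangeEnd : sourceOffsetEnd;
    }

    public static void main(String[] args) {
        // Older segment: no offsets stored, so the date range is used.
        SegmentSketch legacy = new SegmentSketch(1000L, 2000L, 0L, 0L);
        System.out.println(legacy.effectiveSourceOffsetStart()
                + ".." + legacy.effectiveSourceOffsetEnd());

        // Streaming segment: offsets are independent of the data time range.
        SegmentSketch streaming = new SegmentSketch(1000L, 2000L, 500L, 800L);
        System.out.println(streaming.effectiveSourceOffsetStart()
                + ".." + streaming.effectiveSourceOffsetEnd());
    }
}
```

With this shape, code that cuts data from the source would read only the effective offsets, while segment pruning would keep reading dateRangeStart and dateRangeEnd, so the two concepts stay separate for streaming sources.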