  1. Kafka
  2. KAFKA-7072

Kafka Streams may drop RocksDB window segments before they expire


    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.0.0, 1.0.1, 1.1.0, 2.0.0
    • Fix Version/s: 2.1.0
    • Component/s: streams
    • Labels:


      The current implementation of Segments used by the RocksDB session and time window stores is in conflict with our current timestamp management model.

      The current segmentation approach allows configuration of a fixed number of segments (let's say 4) and a fixed retention time. We essentially divide up the retention time into the available number of segments:

         expiration date                 right now
                ------retention time-------/
                |  seg 0  |  seg 1  |  seg 2  |  seg 3  |

      Note that we keep one extra segment so that we can record new events, while some events in seg 0 are actually expired (but we only drop whole segments, so they just get to hang around).

             expiration date                 right now
                    ------retention time-------/
                |  seg 0  |  seg 1  |  seg 2  |  seg 3  |
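      As a rough sketch of the arithmetic, assuming the retention time is split evenly across all but the one extra segment (as the diagrams suggest) and that timestamps are non-negative milliseconds, the segment width and the segment a record lands in could be computed as follows. SegmentMath and its method names are hypothetical and are not Kafka's actual classes.

```java
/** Illustrative segment arithmetic. All names are hypothetical, not Kafka's API. */
final class SegmentMath {
    private SegmentMath() {}

    /**
     * Width of one segment, assuming the retention time spans
     * (numSegments - 1) segments and the last segment is the extra one.
     */
    static long segmentInterval(long retentionMs, int numSegments) {
        if (numSegments < 2) {
            throw new IllegalArgumentException("need at least 2 segments");
        }
        return retentionMs / (numSegments - 1);
    }

    /** Id of the segment a timestamp falls into (assumes timestampMs >= 0). */
    static long segmentId(long timestampMs, long segmentIntervalMs) {
        return timestampMs / segmentIntervalMs;
    }
}
```

      Under these assumptions, a retention time of 300 ms with 4 segments gives 100 ms per segment, and a record with timestamp 250 lands in segment 2.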

      When it's time to provision segment 4, we know that segment 0 is completely expired, so we drop it:

                   expiration date                 right now
                          ------retention time-------/
                          |  seg 1  |  seg 2  |  seg 3  |  seg 4  |
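      The rolling behavior above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: RollingSegments is a hypothetical class, segments are plain lists of timestamps rather than RocksDB instances, and the drop rule is the simple "keep the newest numSegments ids" reading of the diagrams.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical in-memory model of a fixed number of rolling segments. */
final class RollingSegments {
    private final int numSegments;
    private final long segmentIntervalMs;
    // Segment id -> timestamps stored in that segment, ordered by id.
    private final TreeMap<Long, List<Long>> segments = new TreeMap<>();

    RollingSegments(int numSegments, long segmentIntervalMs) {
        this.numSegments = numSegments;
        this.segmentIntervalMs = segmentIntervalMs;
    }

    /** Stores a timestamp and returns the ids of any segments dropped to make room. */
    List<Long> put(long timestampMs) {
        long id = timestampMs / segmentIntervalMs;
        long newestId = segments.isEmpty() ? id : Math.max(id, segments.lastKey());
        // Only the newest numSegments ids are kept. Note that this decision
        // looks at segment ids alone and never consults stream time.
        long minLiveId = newestId - numSegments + 1;
        List<Long> dropped = new ArrayList<>();
        if (id < minLiveId) {
            return dropped; // record belongs to an already-dropped segment
        }
        while (!segments.isEmpty() && segments.firstKey() < minLiveId) {
            dropped.add(segments.pollFirstEntry().getKey());
        }
        segments.computeIfAbsent(id, k -> new ArrayList<>()).add(timestampMs);
        return dropped;
    }

    /** Ids of the segments currently held, oldest first. */
    List<Long> liveSegmentIds() {
        return new ArrayList<>(segments.keySet());
    }
}
```

      With 4 segments of 100 ms each, storing timestamps 50, 150, 250, and 350 fills segments 0 through 3; storing 450 then provisions segment 4 and drops segment 0.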


      However, the current timestamp management model allows for records from the future. Namely, because we define stream time as the minimum buffered timestamp (nondecreasing), we can have a buffer like this: [ 5, 2, 6 ]. Our stream time will be 2, but we'll handle a record with timestamp 5 next. Referring to the example, this means we could wind up having to provision segment 4 before segment 0 expires!
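      To make the [ 5, 2, 6 ] example concrete, here is a minimal sketch of that definition of stream time. StreamTimeSketch and its methods are hypothetical names for illustration; the actual bookkeeping inside Kafka Streams is more involved.

```java
import java.util.Collections;
import java.util.Deque;

/** Hypothetical helper illustrating "stream time = minimum buffered timestamp". */
final class StreamTimeSketch {
    private StreamTimeSketch() {}

    /** Minimum buffered timestamp, never allowed to move backwards. */
    static long streamTime(long previousStreamTime, Deque<Long> buffer) {
        if (buffer.isEmpty()) {
            return previousStreamTime;
        }
        return Math.max(previousStreamTime, Collections.min(buffer));
    }

    /**
     * How far ahead of stream time the next record to be processed is.
     * Records are processed in buffer order, not timestamp order.
     * Throws NoSuchElementException if the buffer is empty.
     */
    static long leadOfNextRecord(long streamTime, Deque<Long> buffer) {
        return buffer.getFirst() - streamTime;
    }
}
```

      For a buffer holding 5, 2, 6 in that order, streamTime yields 2 while the head of the buffer is 5, so the next record processed is 3 time units ahead of stream time.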

      Let's say "f" is our future event:

                   expiration date                 right now
                          ------retention time-------/
                          |  seg 1  |  seg 2  |  seg 3  |  seg 4  |

      Should we drop segment 0 prematurely? Or should we crash and refuse to process "f"?

      Today, we do the former, and this is probably the better choice. If we refuse to process "f", then we cannot make progress ever again.

      Dropping segment 0 prematurely is a bummer, but users could also set the retention time high enough that they don't expect to get any events late enough to need segment 0. In the worst case, we can receive many future events without advancing stream time, sparse enough that each requires its own segment. These would eat deeply into the retention time, dropping many segments that should be live.
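      One way to quantify how far a future event eats into the retention time is sketched below. It assumes the same simplified model as the diagrams (retention spans all but one segment, only the newest numSegments segment ids are kept, timestamps are non-negative milliseconds). EffectiveRetention is a hypothetical name, not part of Kafka.

```java
/** Hypothetical estimate of the retention left after a future record arrives. */
final class EffectiveRetention {
    private EffectiveRetention() {}

    /**
     * Span of time behind stream time that is still retained once a record
     * with futureTimestampMs has forced its segment to be provisioned.
     */
    static long afterFutureRecord(long streamTimeMs, long retentionMs,
                                  int numSegments, long futureTimestampMs) {
        if (numSegments < 2) {
            throw new IllegalArgumentException("need at least 2 segments");
        }
        long interval = retentionMs / (numSegments - 1);
        // Lowest segment id that survives after the new segment is provisioned.
        long minLiveId = futureTimestampMs / interval - numSegments + 1;
        long oldestRetainedMs = Math.max(0L, minLiveId * interval);
        long effective = streamTimeMs - oldestRetainedMs;
        // Never report more than the configured retention, or less than zero.
        return Math.max(0L, Math.min(retentionMs, effective));
    }
}
```

      With stream time 350, a retention time of 300, and 4 segments, a record at timestamp 450 would leave 250 of retained history in this model, a record at 550 would leave 150, and a record at 950 would leave none.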





              • Assignee:
                vvcephei John Roesler
              • Votes:
                0
              • Watchers:
                4


                • Created: