Uploaded image for project: 'Kafka'
  1. Kafka
  2. KAFKA-3672

Introduce globally consistent checkpoint in Kafka Streams

    XMLWordPrintableJSON

Details

    • New Feature
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 0.10.0.0
    • None
    • streams

    Description

      This is originate from the idea of rethinking about the checkpoint file creation condition:

      Today the checkpoint file containing the checkpointed offsets is written upon stream task clean shutdown, and is read and deleted upon stream task (re-)construction. The rationale is that if upon task re-construction, the checkpoint file is missing, it indicates that the underlying persistent state store (rocksDB, for example)'s state may not be consistent with the committed offsets, and hence we'd better to wipe-out the maybe-broken state storage and rebuild from the beginning of the offset.

      However, we may able to do better than this setting if we can fully control the persistent store flushing time to be aligned with committing, and hence as long as we commit, we are always guaranteed to get a clear checkpoint.

      This may be generalized to a "global state checkpoint" mechanism in Kafka Streams, which may also subsume KAFKA-3184 for non persistent stores.

      Attachments

        Activity

          People

            Unassigned Unassigned
            guozhang Guozhang Wang
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

            Dates

              Created:
              Updated: