Kafka / KAFKA-77

Implement "group commit" for kafka logs


    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.7
    • Fix Version/s: 0.8.0
    • Component/s: None
    • Labels:


      The most expensive operation for the server is usually the fsync() call that syncs log data to disk, yet if you don't flush, your data is at greater risk of being lost in a crash. Currently we expose two knobs to tune this trade-off: log.flush.interval and log.default.flush.interval.ms (it is unclear why one name contains "default" and the other doesn't, since both are defaults). However, if you flush frequently, say on every write, performance suffers.
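For illustration, the two knobs named above might appear in a broker config file like this (the property names come from the description; the values are arbitrary examples, not recommendations):

```properties
# Flush the log to disk after this many messages accumulate (example value).
log.flush.interval=500
# Also flush after this many milliseconds have elapsed (example value).
log.default.flush.interval.ms=1000
```

Lower values tighten durability at the cost of more fsync() calls; higher values amortize the syncs but widen the window of data lost in a crash.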

      One trick that can improve this worst case of continual flushing is to let a single fsync() cover multiple writes that arrive at the same time. This is much like "group commit" in databases. It is unclear which cases this would improve, and by how much, but it seems worth a try.
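The idea above can be sketched in Java. This is a hypothetical illustration, not Kafka's actual implementation: the class name, the leader-based protocol, and the counters are all assumptions made for the example. Each writer blocks until its record is durable, but the first writer to arrive while no flush is in progress becomes the batch "leader" and issues one (simulated) fsync() on behalf of every record already in the batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical group-commit sketch (not Kafka's actual code).
class GroupCommitLog {
    private final List<byte[]> unflushed = new ArrayList<>();
    private boolean flushing = false;
    // Counters stand in for real I/O so the batching effect is observable.
    final AtomicInteger writeCount = new AtomicInteger();
    final AtomicInteger syncCount = new AtomicInteger();

    // Appends one record and returns only once it has been "fsync'd".
    void append(byte[] record) {
        synchronized (this) {
            unflushed.add(record);
            writeCount.incrementAndGet();
            while (flushing) { // a leader is syncing; wait for it to finish
                try { wait(); } catch (InterruptedException e) {
                    throw new RuntimeException(e);
                }
            }
            if (unflushed.isEmpty()) return; // our record rode along with that flush
            flushing = true;   // become leader for the current batch
            unflushed.clear(); // claim the batch we are about to make durable
        }
        syncCount.incrementAndGet(); // one simulated fsync() for the whole batch
        synchronized (this) {
            flushing = false;
            notifyAll(); // release writers whose records were in the batch
        }
    }

    public static void main(String[] args) throws InterruptedException {
        GroupCommitLog log = new GroupCommitLog();
        Thread[] writers = new Thread[4];
        for (int i = 0; i < writers.length; i++) {
            writers[i] = new Thread(() -> {
                for (int j = 0; j < 100; j++) log.append(new byte[] {42});
            });
            writers[i].start();
        }
        for (Thread t : writers) t.join();
        // Under contention, syncs can be far fewer than writes.
        System.out.println("writes=" + log.writeCount + " syncs=" + log.syncCount);
    }
}
```

With a single uncontended writer this degenerates to one sync per write, which is the point: group commit only pays off when flushes overlap, matching the ticket's caveat that it is unclear which cases it would improve.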


        Jay Kreps created issue
        Jay Kreps made changes - Attachment: kafka-group-commit.patch [ 12488315 ]
        Alan Cabrera made changes - Workflow: jira [ 12624060 ] to no-reopen-closed, patch-avail [ 12626219 ]
        Jay Kreps made changes - Status: Open [ 1 ] to Resolved [ 5 ]; Resolution: Won't Fix [ 2 ]
        Tony Stevenson made changes - Workflow: no-reopen-closed, patch-avail [ 12626219 ] to Apache Kafka Workflow [ 13052777 ]
        Tony Stevenson made changes - Workflow: Apache Kafka Workflow [ 13052777 ] to no-reopen-closed, patch-avail [ 13055405 ]


          • Assignee: Jay Kreps
          • Votes: 0
          • Watchers: 3


            • Created: