Uploaded image for project: 'Kafka'
  1. Kafka
  2. KAFKA-77

Implement "group commit" for kafka logs

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Won't Fix
    • 0.7
    • 0.8.0
    • None
    • None

    Description

      The most expensive operation for the server is usually going to be the fsync() call to sync data in a log to disk, if you don't flush your data is at greater risk of being lost in a crash. Currently we give two knobs to tune this trade--log.flush.interval and log.default.flush.interval.ms (no idea why one has default and the other doesn't since they are both defaults). However if you flush frequently, say on every write, then performance is not that great.

      One trick that can be used to improve this worst case of continual flushes is to allow a single fsync() to be used for multiple writes that occur at the same time. This is a lot like "group commit" in databases. It is unclear which cases this would improve and by how much but it might be worth a try.

      Attachments

        1. kafka-group-commit.patch
          6 kB
          Jay Kreps

        Activity

          People

            jkreps Jay Kreps
            jkreps Jay Kreps
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: