Kafka / KAFKA-132

Reduce Unnecessary Filesystem Writes for Logs Without Unflushed Events

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.6
    • Fix Version/s: 0.7
    • Component/s: core
    • Labels: None
    • Environment: Linux 2.6.38-10-server (Ubuntu 10.04.3 LTS), XFS Filesystem

    Description

      We noticed a fair amount of stress on our filesystem in an environment with a large number of topics but low message activity. After some investigation, we realized that a short log.flush.interval coupled with a large number of topics resulted in a lot of unnecessary disk activity, even without events to be written.

      This activity occurs because FileChannel.force(true) is called on the underlying FileMessageSet for each log, even when there are no messages to be written. This call forces unproductive writes to the underlying filesystem.

      The problem is most pronounced in an environment with a large number of low-activity topics for which low latency is still important. The before-and-after output of `iostat -x 2` on a system with 1044 topics and a timed flush interval of 100 ms is linked below. Note the reduction in %util and writes/second: the "before" output shows 40-80% util and ~260 writes/second, while the "after" output shows 10-15% util and ~65 writes/second, roughly a 75% drop in write rate.

      Pre-patch output: https://raw.github.com/gist/54d0f4c62753a6e2de1f/7ee1982bfa8e5c088bcf9ba953f01956443bd31e/iostat-pre-kafka-patch.txt

      Post-patch output: https://raw.github.com/gist/54d0f4c62753a6e2de1f/b939973c7fed642480856d9bdeb2e4cb0ada445b/iostat-post-kafka-patch.txt

      The proposed patch (see attached) skips calling the underlying FileMessageSet flush operation if the log's atomic counter indicates that there are no messages to be written.
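
      The guard described above can be sketched as follows. This is a minimal illustration only, not the attached patch and not Kafka's actual code: the class `GuardedFlushLog`, its methods, and the `forceCallCount()` helper are hypothetical names introduced for the example, and a single `FileChannel` stands in for the log's FileMessageSet. It assumes the counter is incremented on every append and reset after each successful flush.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a log; names are illustrative, not Kafka's classes.
final class GuardedFlushLog implements AutoCloseable {
    private final FileChannel channel;
    private final Object lock = new Object();
    // Number of messages appended since the last flush.
    private final AtomicInteger unflushed = new AtomicInteger();
    // How many times force() was really invoked (kept for demonstration only).
    private final AtomicInteger forceCalls = new AtomicInteger();

    GuardedFlushLog(Path path) throws IOException {
        channel = FileChannel.open(path, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND);
    }

    void append(byte[] message) throws IOException {
        synchronized (lock) {
            ByteBuffer buf = ByteBuffer.wrap(message);
            while (buf.hasRemaining()) {
                channel.write(buf);
            }
            unflushed.incrementAndGet();
        }
    }

    void flush() throws IOException {
        // Skip the expensive force() when nothing has been appended.
        if (unflushed.get() == 0) {
            return;
        }
        synchronized (lock) {
            channel.force(true); // flush file content and metadata to disk
            forceCalls.incrementAndGet();
            unflushed.set(0);
        }
    }

    int forceCallCount() {
        return forceCalls.get();
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("guarded-flush", ".log");
        try (GuardedFlushLog log = new GuardedFlushLog(file)) {
            for (int i = 0; i < 10; i++) {
                log.flush(); // idle timer ticks: no messages, so no force()
            }
            System.out.println("force() calls while idle: " + log.forceCallCount());
            log.append("hello".getBytes(StandardCharsets.UTF_8));
            log.flush(); // one pending message: force() runs once
            log.flush(); // nothing new: skipped again
            System.out.println("force() calls after one append: " + log.forceCallCount());
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

      In this sketch the counter is read without taking the lock, so an idle log costs one atomic read per flush tick rather than a filesystem call. An append that lands just after the check is simply picked up by the next timed flush.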

      Attachments

        1. patch.txt (0.4 kB, C. Scott Andreas)

          People

            Assignee: Unassigned
            Reporter: C. Scott Andreas (cscotta)
            Votes: 0
            Watchers: 0

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated: 1h
                Remaining: 1h
                Logged: Not Specified