Kafka / KAFKA-374

Move to java CRC32 implementation


Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Fix Version/s: 0.8.0
    • Affects Version/s: None
    • Component/s: core

    Description

      We keep a per-record CRC32. This is a fairly cheap algorithm, but the Java implementation uses JNI, which seems to be expensive for small records. I have seen this before in Kafka profiles, and I noticed it in another application I was working on. Basically, with small records the native implementation can only checksum < 100 MB/sec. Hadoop analyzed this and replaced it with a pure-Java implementation that is 2x faster for large values and 5-10x faster for small values. Details are in HADOOP-6148.

      We should do a quick read/write benchmark on log and message set iteration and see if this improves things.
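The pure-Java approach referenced above can be sketched with a table-driven CRC32 checked against `java.util.zip.CRC32` (which, at the time of this issue, was JNI-backed). This is a minimal single-table sketch for illustration; Hadoop's actual `PureJavaCrc32` from HADOOP-6148 uses a faster multi-table (slicing) variant, and the class name below is hypothetical:

```java
import java.util.zip.CRC32;

public class PureJavaCrc32Sketch {
    // Precomputed lookup table for the reflected CRC-32 polynomial 0xEDB88320
    // (the same polynomial java.util.zip.CRC32 implements).
    private static final int[] TABLE = new int[256];
    static {
        for (int i = 0; i < 256; i++) {
            int c = i;
            for (int j = 0; j < 8; j++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
            }
            TABLE[i] = c;
        }
    }

    // Table-driven CRC32: one table lookup and one XOR per input byte,
    // with no JNI crossing, which is why it wins for small records.
    static long crc32(byte[] data) {
        int crc = 0xFFFFFFFF;
        for (byte b : data) {
            crc = (crc >>> 8) ^ TABLE[(crc ^ b) & 0xFF];
        }
        return (~crc) & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        byte[] msg = "123456789".getBytes();
        // Standard CRC-32 check value for "123456789" is 0xCBF43926.
        System.out.println(Long.toHexString(crc32(msg)));
        // Cross-check against the JDK implementation.
        CRC32 jdk = new CRC32();
        jdk.update(msg, 0, msg.length);
        System.out.println(crc32(msg) == jdk.getValue());
    }
}
```

Any read/write benchmark would then time both variants over a mix of small (tens of bytes) and large message payloads, since the per-call JNI overhead dominates only for small records.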

      Attachments

        1. KAFKA-374.patch
          33 kB
          David Arthur
        2. KAFKA-374-draft.patch
          33 kB
          Jay Kreps


          People

            Assignee: Jay Kreps (jkreps)
            Reporter: Jay Kreps (jkreps)
            Votes: 0
            Watchers: 5
