  1. Flume
  2. FLUME-2801

Performance improvement on TailDir source

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7.0
    • Fix Version/s: 1.7.0
    • Component/s: Sinks+Sources
    • Labels:
      None
    • Flags:
      Patch

      Description

      This is a proposal for a performance improvement to the new tailing source (FLUME-2498).
      The Taildir source reads a file one byte at a time, so its performance is very low compared to tailing with the exec source.
      I tested many approaches to improving performance and implemented the best one.

      Changes.

      • Read the file in 8 KB blocks instead of 1 byte at a time.
      • Use byte[] for handling data instead of ByteArrayDataOutput, direct ByteBuffer, etc., for better performance.
      • Don't convert byte[] to String and vice versa.
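
      The three changes above can be sketched roughly as follows. This is a minimal illustration, not the code from the attached patches: the class name BlockLineReader, its methods, and the choice to rewind past an unterminated trailing line are all assumptions made for the example. It reads 8 KB blocks into a plain byte[], splits on '\n', and returns each line as byte[] without ever building a String.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not the class used in FLUME-2801. */
final class BlockLineReader {
    static final int BLOCK_SIZE = 8192; // 8 KB block, as proposed in the description

    /**
     * Reads every complete (newline-terminated) line from the current file
     * position and returns each one as a raw byte[] without the '\n'.
     * Bytes after the last newline are not returned; the file pointer is
     * rewound so a later call sees them again once the line is completed.
     */
    static List<byte[]> readLines(RandomAccessFile file) throws IOException {
        List<byte[]> lines = new ArrayList<>();
        byte[] block = new byte[BLOCK_SIZE];
        byte[] carry = new byte[0]; // start of a line that spans block boundaries
        int n;
        while ((n = file.read(block, 0, block.length)) != -1) {
            int start = 0;
            for (int i = 0; i < n; i++) {
                if (block[i] == '\n') {
                    lines.add(concat(carry, block, start, i - start));
                    carry = new byte[0];
                    start = i + 1;
                }
            }
            // Keep the unterminated tail of this block for the next iteration.
            carry = concat(carry, block, start, n - start);
        }
        // Rewind over the incomplete last line so it is re-read later.
        file.seek(file.getFilePointer() - carry.length);
        return lines;
    }

    /** Returns head followed by src[off .. off+len). */
    private static byte[] concat(byte[] head, byte[] src, int off, int len) {
        byte[] out = new byte[head.length + len];
        System.arraycopy(head, 0, out, 0, head.length);
        System.arraycopy(src, off, out, head.length, len);
        return out;
    }
}
```

      Calling BlockLineReader.readLines(raf) repeatedly on the same open RandomAccessFile picks up only the lines appended since the previous call, which is the behaviour a tailing source needs.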

      Simple file reading test results.

      File size: 100 MB
      Line size: 500 bytes

      Estimated time to read the file:

      Reading 1 byte at a time (using the code in Taildir): 32544 ms
      Reading 8K blocks: 431 ms
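
      A comparison of this kind could be reproduced with a small harness like the one below. It is a sketch under assumptions, not the test used for the numbers above: the class ReadTimingSketch and its method names are hypothetical, and unbuffered RandomAccessFile.read() calls are assumed to stand in for the original single-byte code path. Timings depend on hardware, OS caching, and JVM warm-up.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical timing harness for illustration; not part of the patch. */
final class ReadTimingSketch {

    /** Counts '\n' bytes by issuing one read() call per byte. */
    static long countLinesByteByByte(String path) throws IOException {
        try (RandomAccessFile f = new RandomAccessFile(path, "r")) {
            long lines = 0;
            int b;
            while ((b = f.read()) != -1) {
                if (b == '\n') {
                    lines++;
                }
            }
            return lines;
        }
    }

    /** Counts '\n' bytes by reading blockSize bytes per read() call. */
    static long countLinesByBlock(String path, int blockSize) throws IOException {
        try (RandomAccessFile f = new RandomAccessFile(path, "r")) {
            byte[] block = new byte[blockSize];
            long lines = 0;
            int n;
            while ((n = f.read(block, 0, block.length)) != -1) {
                for (int i = 0; i < n; i++) {
                    if (block[i] == '\n') {
                        lines++;
                    }
                }
            }
            return lines;
        }
    }

    public static void main(String[] args) throws IOException {
        String path = args[0]; // path to the test file, e.g. a 100 MB log

        long t0 = System.nanoTime();
        long a = countLinesByteByByte(path);
        long t1 = System.nanoTime();
        long b = countLinesByBlock(path, 8192);
        long t2 = System.nanoTime();

        System.out.println("1 byte at a time: " + a + " lines, " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("8K blocks:        " + b + " lines, " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

      Both methods must report the same line count; only the elapsed time should differ.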

      When tested in Flume, the patched Taildir source catches up with the performance of tailing with the exec source (a 30x performance boost).

        Attachments

        1. FLUME-2801.patch
          9 kB
          Jun Seok Hong
        2. FLUME-2801-1.patch
          9 kB
          Jun Seok Hong
        3. FLUME-2801-2.patch
          9 kB
          Satoshi Iijima

            People

            • Assignee:
              siefried12 Jun Seok Hong
              Reporter:
              siefried12 Jun Seok Hong
            • Votes:
              0
              Watchers:
              5