FLUME-2801

Performance improvement on TailDir source

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7.0
    • Fix Version/s: 1.7.0
    • Component/s: Sinks+Sources
    • Labels: None
    • Flags: Patch

    Description

      This is a proposal for a performance improvement to the new tailing source (FLUME-2498).
      The Taildir source reads a file one byte at a time, so its performance is very low compared to tailing with the exec source.
      I tested many approaches to improve performance and implemented the best one.

      Changes:

      • Read the file in 8 KB blocks instead of 1 byte at a time.
      • Use byte[] for handling data instead of ByteArrayDataOutput, ByteBuffer (direct), etc., for better performance.
      • Don't convert byte[] to String and vice versa.
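
      The three changes above can be sketched roughly as follows. This is not the code from the attached patches; it is a minimal illustration that assumes lines are terminated by '\n' and that the file is read through a RandomAccessFile. The class name BlockLineReader and the helpers concat and readLines are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

public class BlockLineReader {
    // Assumed block size, matching the "8k block" in the description.
    static final int BLOCK_SIZE = 8192;

    // Returns head followed by src[from, to), using plain byte[] copies only.
    static byte[] concat(byte[] head, byte[] src, int from, int to) {
        byte[] out = new byte[head.length + (to - from)];
        System.arraycopy(head, 0, out, 0, head.length);
        System.arraycopy(src, from, out, head.length, to - from);
        return out;
    }

    // Reads every complete line from the current position, one block at a time.
    // Lines are returned as byte[] (without the '\n'); nothing is converted to String.
    static List<byte[]> readLines(RandomAccessFile raf) throws IOException {
        List<byte[]> lines = new ArrayList<>();
        byte[] block = new byte[BLOCK_SIZE];
        byte[] partial = new byte[0]; // bytes of a line that spans block boundaries
        int n;
        while ((n = raf.read(block)) != -1) {
            int start = 0;
            for (int i = 0; i < n; i++) {
                if (block[i] == '\n') {
                    lines.add(concat(partial, block, start, i));
                    partial = new byte[0];
                    start = i + 1;
                }
            }
            // Keep the unterminated tail of this block for the next iteration.
            partial = concat(partial, block, start, n);
        }
        // Rewind past an unterminated final line so the next call re-reads it.
        raf.seek(raf.getFilePointer() - partial.length);
        return lines;
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("taildir-sketch", ".log");
        f.deleteOnExit();
        Files.write(f.toPath(), "one\ntwo\npartial".getBytes(StandardCharsets.UTF_8));
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            List<byte[]> lines = readLines(raf);
            System.out.println(lines.size() + " complete lines, next read starts at byte "
                    + raf.getFilePointer());
        }
    }
}
```

      Rewinding the file pointer to the start of an unterminated line is one possible way to handle a line that is still being written; the actual patch may handle this differently.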

      Simple file reading test results:

      File size: 100 MB
      Line size: 500 bytes

      Estimated time to read the file:

      Reading 1 byte at a time (using the code in Taildir): 32544 ms
      Reading in 8 KB blocks: 431 ms
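
      The reported timings correspond to roughly a 75x difference (32544 / 431). A comparison of this kind could be reproduced with a sketch like the one below. It is not the original test code: it assumes the one-byte path behaves like a RandomAccessFile.read() loop, uses a much smaller file than the 100 MB in the report so that it runs quickly, and the names ReadBenchmark, makeTestFile, countByteAtATime and countInBlocks are hypothetical.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.util.Arrays;

public class ReadBenchmark {
    // Writes `lines` lines of exactly `lineSize` bytes each (including the '\n').
    static File makeTestFile(int lines, int lineSize) throws IOException {
        File f = File.createTempFile("readbench", ".log");
        f.deleteOnExit();
        byte[] line = new byte[lineSize];
        Arrays.fill(line, (byte) 'a');
        line[lineSize - 1] = '\n';
        try (OutputStream out = new BufferedOutputStream(new FileOutputStream(f))) {
            for (int i = 0; i < lines; i++) {
                out.write(line);
            }
        }
        return f;
    }

    // Counts newlines with one read() call per byte.
    static long countByteAtATime(File f) throws IOException {
        long newlines = 0;
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            int b;
            while ((b = raf.read()) != -1) {
                if (b == '\n') newlines++;
            }
        }
        return newlines;
    }

    // Counts newlines with one read() call per block of `blockSize` bytes.
    static long countInBlocks(File f, int blockSize) throws IOException {
        long newlines = 0;
        byte[] block = new byte[blockSize];
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            int n;
            while ((n = raf.read(block)) != -1) {
                for (int i = 0; i < n; i++) {
                    if (block[i] == '\n') newlines++;
                }
            }
        }
        return newlines;
    }

    public static void main(String[] args) throws IOException {
        File f = makeTestFile(2000, 500); // about 1 MB of 500-byte lines

        long t0 = System.nanoTime();
        long a = countByteAtATime(f);
        long t1 = System.nanoTime();
        long b = countInBlocks(f, 8192);
        long t2 = System.nanoTime();

        System.out.println("1 byte at a time: " + a + " lines, " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("8 KB blocks:      " + b + " lines, " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

      Absolute numbers will vary with hardware, file system cache state and JVM warm-up, so only the ratio between the two runs is meaningful.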

      When tested in Flume, the patched Taildir source catches up with the performance of tailing with the exec source (a 30x performance boost).

      Attachments

        1. FLUME-2801-2.patch
          9 kB
          Satoshi Iijima
        2. FLUME-2801-1.patch
          9 kB
          Jun Seok Hong
        3. FLUME-2801.patch
          9 kB
          Jun Seok Hong

          People

            Assignee: siefried12 Jun Seok Hong
            Reporter: siefried12 Jun Seok Hong