Flume / FLUME-2918

TaildirSource is underperforming with huge parent directories

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.6.0
    • Fix Version/s: 1.7.0
    • Component/s: Sinks+Sources
    • Release Note: Introduced an option in the Flume configuration for the TAILDIR source to cache pattern-matched files for huge directories.
    • Patch

    Description

      The TAILDIR source causes high CPU utilization when a large number of files sit in the target directory. In the observed case the file pattern matches only a single file, but the parent directory contains about 50,000 other files.
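
      A minimal sketch of the caching idea named in the release note, under stated assumptions: the parent directory's modification time is used as a cheap change signal, and the full listing plus regex match is repeated only when that time changes. `CachedDirMatcher`, `forDirectory`, `matchingFiles` and `listingCount` are hypothetical names chosen for illustration; this is not the code from FLUME-2918-2.patch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.LongSupplier;
import java.util.function.Supplier;
import java.util.regex.Pattern;
import java.util.stream.Stream;

/** Hypothetical helper (not a Flume class): caches the file names that match a pattern. */
class CachedDirMatcher {
    private final Pattern filePattern;
    private final LongSupplier dirModifiedMillis;  // cheap: mtime of the parent directory
    private final Supplier<List<String>> lister;   // expensive: names of all directory entries
    private long lastSeenModified = -1L;
    private List<String> cached = Collections.emptyList();
    private int listings = 0;                      // number of full scans performed so far

    CachedDirMatcher(Pattern filePattern, LongSupplier dirModifiedMillis,
                     Supplier<List<String>> lister) {
        this.filePattern = filePattern;
        this.dirModifiedMillis = dirModifiedMillis;
        this.lister = lister;
    }

    /** Wires the matcher to a real directory on disk. */
    static CachedDirMatcher forDirectory(Path dir, Pattern filePattern) {
        return new CachedDirMatcher(
            filePattern,
            () -> dir.toFile().lastModified(),
            () -> {
                try (Stream<Path> entries = Files.list(dir)) {
                    List<String> names = new ArrayList<>();
                    entries.forEach(p -> names.add(p.getFileName().toString()));
                    return names;
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
    }

    /** Returns the matching names, rescanning only if the directory mtime changed. */
    List<String> matchingFiles() {
        long modified = dirModifiedMillis.getAsLong();
        if (modified != lastSeenModified) {
            List<String> matches = new ArrayList<>();
            for (String name : lister.get()) {
                if (filePattern.matcher(name).matches()) {
                    matches.add(name);
                }
            }
            Collections.sort(matches);
            cached = Collections.unmodifiableList(matches);
            lastSeenModified = modified;
            listings++;
        }
        return cached;
    }

    int listingCount() {
        return listings;
    }
}
```

      With 50,000 unrelated files in the directory, repeated polls would then cost one timestamp read instead of 50,000 regex matches, as long as no entry is added, removed or renamed. One caveat of this sketch: file system timestamps have limited resolution, so an entry created in the same tick as the last scan could be missed until the next change; a production implementation would need to guard against that.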

      Attachments

        1. profiling_before.png
          515 kB
          Attila Simon
        2. profiling_after.png
          183 kB
          Attila Simon
        3. perftest.png
          311 kB
          Attila Simon
        4. PerfHugeDir.java
          6 kB
          Attila Simon
        5. test.csv
          18 kB
          Attila Simon
        6. FLUME-2918-2.patch
          31 kB
          Attila Simon

            People

              Assignee: Attila Simon (sati)
              Reporter: Attila Simon (sati)
              Votes: 0
              Watchers: 4
