FLUME-264

exec high CPU usage

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.9.1
    • Fix Version/s: 0.9.2
    • Component/s: Sinks+Sources
    • Labels: None
    • Environment: CentOS 5, dual E5410, 8 GB RAM, Sun Java 1.6.0 and OpenJDK 1.6.0

    Description

      execStream is using ~200% CPU watching a tail -f. Strace shows lots of output similar to this:
      [pid 8722] <... read resumed> 0x2aaaf862d000, 32) = -1 EAGAIN (Resource temporarily unavailable)
      [pid 8723] <... read resumed> 0xea18000, 32) = -1 EAGAIN (Resource temporarily unavailable)
      [pid 8722] read(43, <unfinished ...>
      [pid 8723] read(49, 0xea18000, 32) = -1 EAGAIN (Resource temporarily unavailable)
      [pid 8722] <... read resumed> 0x2aaaf862d000, 32) = -1 EAGAIN (Resource temporarily unavailable)
      [pid 8723] read(49, <unfinished ...>
      [pid 8722] read(43, <unfinished ...>

      These calls are interspersed with reads and writes that carry the actual content of the Apache access log file. The source is generating ~40-50 events per second. This is running on OpenJDK 1.6.0-b09 on CentOS 5. The Flume build has the patches from FLUME-234 and FLUME-218 applied.
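
To put a number on how many of the traced reads are failing, the strace output can be tallied per thread. The helper below is only a diagnostic sketch and not part of Flume. The class name `EagainTally` and the regular expression are assumptions based on the line format quoted above.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical diagnostic helper, not part of Flume.
final class EagainTally {
    // Matches strace lines such as "[pid 8722] ... = -1 EAGAIN (...)"
    // and captures the pid. Lines without a result (e.g. "<unfinished ...>")
    // do not match.
    private static final Pattern EAGAIN_LINE =
            Pattern.compile("^\\s*\\[pid\\s+(\\d+)\\].*=\\s*-1\\s+EAGAIN\\b");

    // Counts how many calls returned EAGAIN, grouped by pid.
    static Map<Integer, Integer> countByPid(Iterable<String> lines) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            Matcher m = EAGAIN_LINE.matcher(line);
            if (m.find()) {
                counts.merge(Integer.parseInt(m.group(1)), 1, Integer::sum);
            }
        }
        return counts;
    }

    // Reads strace output from stdin and prints "pid count" pairs.
    public static void main(String[] args) {
        List<String> lines = new BufferedReader(new InputStreamReader(System.in))
                .lines()
                .collect(Collectors.toList());
        countByPid(lines).forEach((pid, n) -> System.out.println(pid + " " + n));
    }
}
```

Fed the seven strace lines quoted above, it counts two failed reads each for pids 8722 and 8723. A high ratio of EAGAIN results to successful reads would be consistent with a non-blocking read being retried in a tight loop.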

      The node config is this:
      node: execStream( "tail -f /var/log/httpd/access.log" ) |
      agentE2EChain( "collector1", "collector2", "collector3" )
      Collectors are dumping to hdfs, with minimal CPU usage on the collector.

      I have since tested on another machine with a lower log volume and the Sun JDK, and it exhibits the same behavior. Interestingly, that machine had some events left over from when it was set up using tail. While it was sending those events to the collector, CPU usage jumped to ~350% (it was ~150% when it started tailing a large existing file), so the exec source seems to be pegging two cores on its own.

      From Jonathan Hsieh on the user mailing list:
      The implementation of the exec source and how it shovels the program's
      stdout is probably inefficient and polling too often or using chunks that
      are too small.
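
For comparison, here is a minimal sketch of the opposite approach: a blocking, buffered read of a child process's stdout, so the reading thread sleeps in the kernel until data arrives instead of retrying. This is not Flume's exec source implementation. The class name `BlockingLineShovel`, the `shovel` method, and the 64K buffer size are all assumptions chosen for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

// Hypothetical sketch, not Flume's exec source.
final class BlockingLineShovel {
    // Assumed size (in chars) of the BufferedReader buffer; chosen to be
    // far larger than the 32-byte reads visible in the strace output.
    static final int BUFFER_CHARS = 64 * 1024;

    // Reads lines until end of stream and hands each one to the sink.
    // readLine() blocks while no data is available, so the loop does not spin.
    // Returns the number of lines delivered.
    static long shovel(InputStream in, Consumer<String> sink) {
        long count = 0;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8), BUFFER_CHARS)) {
            String line;
            while ((line = reader.readLine()) != null) {
                sink.accept(line);
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    // Example wiring: follow the access log from this report and print each line.
    public static void main(String[] args) throws Exception {
        Process proc = new ProcessBuilder("tail", "-f", "/var/log/httpd/access.log")
                .redirectErrorStream(true)
                .start();
        shovel(proc.getInputStream(), System.out::println);
    }
}
```

The stream returned by `Process.getInputStream()` is a blocking stream, so an idle `tail -f` should cost close to no CPU in this loop. The trade-off is that the reading thread cannot do other work while it waits.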

          People

            Assignee: Jonathan Hsieh (jmhsieh)
            Reporter: Disabled imported user (flume_danieltm)