  1. Flume
  2. FLUME-2765

ThriftSource spawns too many threads

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.6.0
    • Fix Version/s: None
    • Component/s: Sinks+Sources
    • Labels: None

    Description

      We are in the process of migrating from the old Flume to version 1.6. We are using the ThriftSource with the new KafkaSink. Here's what our config looks like:

      agent1.channels = ch1
      agent1.sources = thriftSrc
      agent1.sinks = kafka
      
      agent1.channels.ch1.type = memory
      agent1.channels.ch1.capacity = 10000
      agent1.channels.ch1.transactionCapacity = 500
      
      # THRIFT
      agent1.sources.thriftSrc.type = thrift
      agent1.sources.thriftSrc.channels = ch1
      agent1.sources.thriftSrc.bind = 0.0.0.0
      agent1.sources.thriftSrc.port = 4042
      # if we don't set this option, the source keeps creating more and more threads until all heap memory is used up and then it crashes
      agent1.sources.thriftSrc.threads = 150
      
      # KAFKA
      agent1.sinks.kafka.channel = ch1
      agent1.sinks.kafka.type = org.apache.flume.sink.kafka.KafkaSink
      agent1.sinks.kafka.batchSize = 50
      agent1.sinks.kafka.brokerList = broker.example.com:9092
      agent1.sinks.kafka.requiredAcks = 1
      agent1.sinks.kafka.topic = topic1
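
      A caveat when annotating a config like this one: assuming the agent loads the file through java.util.Properties (an assumption about Flume's loader, not something stated in this report), a # starts a comment only at the beginning of a line. A comment appended after a value becomes part of that value. The sketch below shows the difference; the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical demo class, not part of Flume.
class InlineCommentDemo {

    // Parses the given properties text and returns the raw value of the
    // ThriftSource "threads" key exactly as java.util.Properties sees it.
    static String threadsValue(String configText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(configText));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props.getProperty("agent1.sources.thriftSrc.threads");
    }

    public static void main(String[] args) {
        String inline = "agent1.sources.thriftSrc.threads = 150 # cap the pool\n";
        String separate = "# cap the pool\nagent1.sources.thriftSrc.threads = 150\n";

        System.out.println("[" + threadsValue(inline) + "]");   // prints "[150 # cap the pool]"
        System.out.println("[" + threadsValue(separate) + "]"); // prints "[150]"
    }
}
```

      A value such as "150 # cap the pool" is not a valid integer, so keeping comments on their own lines is the safer habit.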
      

      Watching the agent over a JMX connection, we have noticed some bad behavior in the Thrift source/Thrift server. If we don't restrict the number of threads, it spawns thousands of new threads, apparently one for every message it receives. These threads are all named "Flume Thrift IPC Thread [number]" and, according to the jvisualvm console, they are always idle. At some point all of the JVM memory is used up by creating new threads and Flume crashes with the following exception:

      12 Aug 2015 16:56:11,721 ERROR [Thread-1] (org.apache.thrift.server.TThreadedSelectorServer$SelectorThread.run:544)  - run() exiting due to uncaught error
      java.lang.OutOfMemoryError: unable to create new native thread
              at java.lang.Thread.start0(Native Method)
              at java.lang.Thread.start(Thread.java:714)
              at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949)
              at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1360)
              at org.apache.thrift.server.TThreadedSelectorServer.requestInvoke(TThreadedSelectorServer.java:310)
              at org.apache.thrift.server.AbstractNonblockingServer$AbstractSelectThread.handleRead(AbstractNonblockingServer.java:209)
              at org.apache.thrift.server.TThreadedSelectorServer$SelectorThread.select(TThreadedSelectorServer.java:576)
              at org.apache.thrift.server.TThreadedSelectorServer$SelectorThread.run(TThreadedSelectorServer.java:536)
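
      The trace ends in ThreadPoolExecutor.addWorker, so the worker pool itself is asking for one more thread. The following is a minimal sketch of how that can happen, assuming (not confirmed in this report) that the source falls back to an unbounded cached pool when no thread limit is configured. It is not Flume or Thrift code, and PoolGrowthDemo and its methods are hypothetical names. It submits tasks that all stay busy and compares the peak thread count of an unbounded pool with that of a capped one.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of Flume or Thrift.
class PoolGrowthDemo {

    // Submits `tasks` tasks that block until released, then returns the
    // largest number of threads the pool held at the same time.
    static int peakThreads(ThreadPoolExecutor pool, int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // simulate a handler that has not finished yet
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int peak = pool.getLargestPoolSize();
        release.countDown(); // let every task finish
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak;
    }

    // Cached pool: no upper bound, a new thread whenever no worker is free.
    static int unboundedPeak(int tasks) {
        return peakThreads((ThreadPoolExecutor) Executors.newCachedThreadPool(), tasks);
    }

    // Fixed pool: at most maxThreads workers, extra tasks wait in a queue.
    static int boundedPeak(int tasks, int maxThreads) {
        return peakThreads((ThreadPoolExecutor) Executors.newFixedThreadPool(maxThreads), tasks);
    }

    public static void main(String[] args) {
        System.out.println("unbounded peak: " + unboundedPeak(200));   // prints "unbounded peak: 200"
        System.out.println("bounded peak:   " + boundedPeak(200, 8)); // prints "bounded peak:   8"
    }
}
```

      In the unbounded case every busy task costs one native thread, which is the kind of growth that ends in "unable to create new native thread". The capped pool trades that for queued work, so a slow consumer shows up as latency rather than thread count.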
      

      When we set the option to restrict the number of threads, the server sticks to that number and runs smoothly; however, it occasionally drops messages (which may have a different cause).

      I am wondering whether this is a bug or in some way expected behavior? What are the best practices for using a ThriftSource? Are there further parameters to possibly tune (like channel.capacity)?

      Attachments

        1. thread-dump-flume-1.6.txt
          148 kB
          Tobias Heintz

          People

            Assignee: Unassigned
            Reporter: Tobias Heintz (t.heintz)
            Votes: 0
            Watchers: 5