Flume / FLUME-808

Source/sink.close() is not invoked by DirectDriver if source.next() is blocking.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.9.4
    • Fix Version/s: 0.9.5
    • Component/s: Node, Sinks+Sources
    • Labels:
      None

      Description

      I noticed that DirectDriver does not call sink.close() or source.close() at all when a Flume node is reconfigured.

      Very easy to reproduce:
      1. Start a master and a node on the same machine.
      2. Configure the node through the master with any configuration, e.g.: tail("/tmp/aaa"), null
      3. Reconfigure the node through the master: collectorSource(12345), null

      source.close() is not called at all when the node is reconfigured, or when it is rolled. Later on, the driver decides to interrupt the thread because it does not seem able to close gracefully:

      2011-10-19 22:46:03,646 [Heartbeat] WARN agent.LivenessManager: Heartbeats are backing up, currently behind by 1 heartbeats
      2011-10-19 22:46:08,648 [Heartbeat] WARN agent.LivenessManager: Heartbeats are backing up, currently behind by 2 heartbeats
      2011-10-19 22:46:13,650 [Heartbeat] WARN agent.LivenessManager: Heartbeats are backing up, currently behind by 3 heartbeats
      2011-10-19 22:46:18,652 [Heartbeat] WARN agent.LivenessManager: Heartbeats are backing up, currently behind by 4 heartbeats
      2011-10-19 22:46:23,648 [Check config] ERROR agent.LogicalNode: Forcing driver to exit uncleanly
      2011-10-19 22:46:23,648 [logicalNode c1-19] ERROR connector.DirectDriver: Closing down due to exception during append calls
      2011-10-19 22:46:23,648 [Check config] INFO agent.LogicalNode: Node config successfully set to com.cloudera.flume.conf.FlumeConfigData@39a2f02e
      2011-10-19 22:46:23,648 [logicalNode c1-19] INFO connector.DirectDriver: Connector logicalNode c1-19 exited with error: Waiting for queue element was interrupted! null
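
      The failure mode described above can be sketched in isolation. The code below is only an illustration under stated assumptions: DriverCloseSketch, BlockingSource, and closedAfterForcedStop are hypothetical names, not Flume's DirectDriver or source classes. It shows a driver loop that is stuck in a blocking next(), misses the graceful stop request, and skips close() once it is interrupted, unless close() is placed in a finally block.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative sketch only: these classes are hypothetical, not Flume's DirectDriver.
class DriverCloseSketch {

    /** Hypothetical source whose next() blocks until an event arrives. */
    static class BlockingSource {
        private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        final CountDownLatch enteredNext = new CountDownLatch(1);
        final AtomicBoolean closed = new AtomicBoolean(false);

        String next() throws InterruptedException {
            enteredNext.countDown();  // signal that the driver is committed to waiting
            return queue.take();      // blocks indefinitely while no events arrive
        }

        void close() {
            closed.set(true);
        }
    }

    /** Runs a driver loop, requests a stop, then interrupts; reports whether close() ran. */
    static boolean closedAfterForcedStop(boolean closeInFinally) {
        BlockingSource source = new BlockingSource();
        AtomicBoolean stopRequested = new AtomicBoolean(false);

        Thread driver = new Thread(() -> {
            try {
                while (!stopRequested.get()) {
                    source.next();    // the stop flag is not re-checked while this blocks
                }
                if (!closeInFinally) {
                    source.close();   // only reached on the graceful exit path
                }
            } catch (InterruptedException e) {
                // Forced exit: close() is skipped unless it lives in the finally block.
            } finally {
                if (closeInFinally) {
                    source.close();   // reached on both graceful and forced exits
                }
            }
        });
        driver.start();

        try {
            source.enteredNext.await();   // wait until the driver is inside next()
            stopRequested.set(true);      // graceful stop request goes unnoticed
            driver.join(200);             // give the driver a moment to exit on its own
            driver.interrupt();           // then force the exit
            driver.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("sketch was interrupted", e);
        }
        return source.closed.get();
    }

    public static void main(String[] args) {
        System.out.println("close() only on normal exit: closed=" + closedAfterForcedStop(false));
        System.out.println("close() in finally:          closed=" + closedAfterForcedStop(true));
    }
}
```

      In this sketch the first call reports that close() never ran and the second reports that it did. Whether the real driver should close in a finally block, or unblock next() some other way, is a design question this sketch does not settle.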
      

      It may have been introduced by FLUME-596, which was a big refactor of Driver. I did a git reset to 21b74010c34cef9a977c75ab5dec4dc747d8f5aa and could not reproduce the problem there.

      Expected result: the source should be closed via close():

      2011-10-19 23:20:45,811 [SpawningLogicalNode c1] INFO collector.CollectorSource: closed
      2011-10-19 23:20:45,812 [SpawningLogicalNode c1] INFO thrift.ThriftEventSource: Closed server on port 35853...
      2011-10-19 23:20:45,817 [SpawningLogicalNode c1] INFO thrift.ThriftEventSource: Queue still has 0 elements ...
      2011-10-19 23:20:45,852 [logicalNode c1-20] INFO collector.CollectorSource: closed
      2011-10-19 23:20:45,852 [logicalNode c1-20] INFO thrift.ThriftEventSource: Closed server on port 35853...
      2011-10-19 23:20:45,853 [logicalNode c1-20] INFO thrift.ThriftEventSource: Queue still has 0 elements ...
      2011-10-19 23:20:45,853 [logicalNode c1-20] INFO collector.CollectorSource: closed
      2011-10-19 23:20:45,853 [logicalNode c1-20] INFO thrift.ThriftEventSource: Closed server on port 35853...
      2011-10-19 23:20:45,853 [logicalNode c1-20] INFO thrift.ThriftEventSource: Queue still has 0 elements ...
      2011-10-19 23:20:45,853 [logicalNode c1-20] INFO agent.LogicalNode: c1: Connector stopped: CollectorSource | NullSink
      2011-10-19 23:20:45,853 [SpawningLogicalNode c1] INFO agent.LogicalNode: Node config successfully set to com.cloudera.flume.conf.FlumeConfigData@2682d210
      2011-10-19 23:20:45,864 [logicalNode c1-23] INFO agent.LogicalNode: Connector started: TailSource | NullSink
      
      

      It might be the root cause of FLUME-798, and it may be related to the recent discussions about interrupted exceptions on user@.

            People

            • Assignee: Unassigned
            • Reporter: Mingjie Lai (mingjielai)
            • Votes: 2
            • Watchers: 3
