  1. Apache Storm
  2. STORM-450

Netty can cause error on clean shutdown of worker

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.9.2-incubating, 0.9.0.1, 0.9.3
    • Fix Version/s: None
    • Component/s: storm-core
    • Labels: None

    Description

      We recently had an issue where a worker process was shut down cleanly on 0.9.0. Why the worker shut down cleanly is not the issue here, but the shutdown caused a cascading failure that made a connected worker shut down too. This is going to be even more problematic in newer versions of Storm, where we give the worker time to shut down cleanly instead of just shooting it with a kill -9.

      Ideally the client should keep trying to reconnect, because the worker may have exited on its own and will be re-spawned shortly. If it is rescheduled elsewhere, the sending worker will eventually detect that and reroute things accordingly. This is what already happens when the connection is just closed. There really is no reason for one side to know when the other side is shutting down.

      2014-08-11 19:00:17 b.s.util [ERROR] Async loop died!
      java.lang.RuntimeException: java.lang.RuntimeException: Client is being closed, and does not take requests any more
      	at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:130) ~[storm-core-0.9.0-wip21.jar:na]
      	at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:101) ~[storm-core-0.9.0-wip21.jar:na]
      	at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:62) ~[storm-core-0.9.0-wip21.jar:na]
      	at backtype.storm.disruptor$consume_loop_STAR_$fn__1999.invoke(disruptor.clj:74) ~[storm-core-0.9.0-wip21.jar:na]
      	at backtype.storm.util$async_loop$fn__421.invoke(util.clj:400) ~[storm-core-0.9.0-wip21.jar:na]
      	at clojure.lang.AFn.run(AFn.java:24) [clojure-1.4.0.jar:na]
      	at java.lang.Thread.run(Thread.java:722) [na:1.7.0_17]
      Caused by: java.lang.RuntimeException: Client is being closed, and does not take requests any more
      	at backtype.storm.messaging.netty.Client.send(Client.java:118) ~[storm-netty-0.9.0-wip21.jar:na]
      	at backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__4922$fn__4923.invoke(worker.clj:342) ~[storm-core-0.9.0-wip21.jar:na]
      	at backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__4922.invoke(worker.clj:331) ~[storm-core-0.9.0-wip21.jar:na]
      	at backtype.storm.disruptor$clojure_handler$reify__1986.onEvent(disruptor.clj:43) ~[storm-core-0.9.0-wip21.jar:na]
      	at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:127) ~[storm-core-0.9.0-wip21.jar:na]
      	... 6 common frames omitted
      2014-08-11 19:00:17 b.s.util [INFO] Halting process: ("Async loop died!")
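
      In the trace above, the exception thrown from Client.send propagates out of the transfer loop and halts the whole process. As a rough illustration of the behavior requested here, the sketch below buffers messages and retries delivery instead of throwing when the peer is unavailable. It is not Storm's code: Transport, ReconnectingSender, trySend, and maxBuffered are hypothetical names made up for this example, and dropping the oldest message when the buffer is full is an assumed policy, not something this issue prescribes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical stand-in for a connection to a remote worker; not Storm's API. */
interface Transport {
    /** Returns true if the message was handed off, false if the peer is unavailable. */
    boolean trySend(String message);
}

/** Hypothetical sender that buffers and retries instead of throwing when the peer is down. */
class ReconnectingSender {
    private final Transport transport;
    private final int maxBuffered;
    private final Deque<String> pending = new ArrayDeque<>();

    ReconnectingSender(Transport transport, int maxBuffered) {
        this.transport = transport;
        this.maxBuffered = maxBuffered;
    }

    /** Queues the message and attempts delivery; an unavailable peer never raises an exception here. */
    void send(String message) {
        if (pending.size() >= maxBuffered) {
            // Assumed policy: drop the oldest message rather than kill the calling thread.
            pending.pollFirst();
        }
        pending.addLast(message);
        flush();
    }

    /** Delivers buffered messages in order, stopping at the first failure; returns the number delivered. */
    int flush() {
        int delivered = 0;
        while (!pending.isEmpty() && transport.trySend(pending.peekFirst())) {
            pending.pollFirst();
            delivered++;
        }
        return delivered;
    }

    /** Number of messages still waiting for the peer to come back. */
    int pendingCount() {
        return pending.size();
    }
}
```

      A caller would invoke send for each outgoing message and call flush periodically (for example from a timer) so that buffered messages go out once the remote worker has been re-spawned. Whether to drop, block, or fail after a bounded number of retries is a design choice this sketch leaves open.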
      

            People

              Assignee: Robert Joseph Evans (revans2)
              Reporter: Robert Joseph Evans (revans2)