Uploaded image for project: 'Apache Storm'
  1. Apache Storm
  2. STORM-126

Add Lifecycle support API for worker nodes



    • New Feature
    • Status: Resolved
    • Minor
    • Resolution: Fixed
    • None
    • 1.0.0
    • storm-core
    • None



      Storm is already used in variety of environments. It is important that Storm provides some form of "lifecycle" API specified at Topology builder level to be called when worker nodes start and stop.

      It is a very crucial functional piece that is missing from Storm. Many project have to integrate, for example, with various container-like frameworks like Spring or Google Guice that need to be started at stopped in a controlled fashion before worker nodes begin or finish their work.

      I think something like a WorkerContextListener interface with two methods: onStartup(SomeContextClass) and onShutdown(SomeContextClass) could go a very long way for allowing to user's to plugin various third-party libraries easily.

      Then, the TopologyBuilder needs to be modified to accept classes that implement this interface.
      SomeContextClass does not need to be much more than a Map for now. But it is important to have it as it allows propagation ofl information between those lifecycle context listeners.

      Nathan, it would interesting to hear your opinion.


      nathanmarz: I agree, this should be added to Storm. The lifecycle methods should be parameterized with which tasks are running in this worker.

      Additionally, I think lifecycle methods should be added for bolt/spouts in the context of workers. Sometimes there's some code you want to run for a spout/bolt within a worker only one time, regardless of how many tasks for that bolt are within the worker. Then individual tasks should be able to access that "global state" within the worker for that spout/bolt.

      kyrill007: Thank you, Nathan, I think it would be relatively simple to implement and would have big impact. Now we're forced to manage container initializations through lazy static fields. You'd love to see that code.

      nathanmarz: Yup, this should be fairly easy to implement. I encourage you to submit a patch for this.

      kyrill007: Oh, well, my Clojure is unfortunately too weak for this kind of work. But I am working on it... Any pointers as to where in Storm code workers are started and stopped?

      nathanmarz: Here's the function that's called to start a worker: https://github.com/nathanmarz/storm/blob/master/src/clj/backtype/storm/daemon/worker.clj#L315

      And here's the code in the same file that shuts down a worker:

      I think the interface for the lifecycle stuff should look something like this:

      interface WorkerHook extends Serializable {
      void start(Map conf, TopologyContext context, List taskIds);
      void shutdown();

      You'll need to add a definition for worker hooks into the topology definition Thrift structure:


      I think for the first go it's ok to make this a Java-only feature by adding something like "4: list<binary> worker_hooks" to the StormTopology structure (where the "binary" refers to a Java-serialized object).

      Then TopologyBuilder can have a simple "addWorkerHook" method that will serialize the object and add it to the Thrift struct.

      danehammer: I've started working on this. I've followed Nathan's proposed design, but I keep hitting snags with calls to ThriftTopologyUtils, now that there is an optional list on StormTopology.

      I would like to add some unit tests for what I change there, would it make more sense for those to be in Java instead of Clojure? If so, are there any strong preferences on what dependencies I add and how I go about adding Java unit tests to storm-core?

      nathanmarz: No... unit tests should remain in Clojure. You can run Java code in Clojure very easily. Here's a good example of testing Java code in Clojure: https://github.com/nathanmarz/storm/blob/master/storm-core/test/clj/backtype/storm/security/auth/AuthUtils_test.clj

      danehammer: For this design:

      interface WorkerHook extends Serializable {
      void start(Map conf, TopologyContext context, List taskIds);
      void shutdown();

      When mk-worker has the worker hooks and goes to call start on them, where should it get the topology context? I started with what was returned by (worker-context worker), which returns a WorkerTopologyContext, but that doesn't appear to be correct.

      danehammer: If this helps, here's what I'm looking at right now: https://github.com/danehammer/storm/blob/issue-155/storm-core/src/clj/backtype/storm/daemon/worker.clj#L329


        Issue Links



              BaconSeason Michael Schonfeld
              xumingming James Xu
              4 Vote for this issue
              5 Start watching this issue