This appears to have been introduced in the 1.0.0 release. The issue does not seem to affect 0.10.2.
When a topology is killed and workers receive the notification to shut down, they do not shut down cleanly, so worker hooks never get invoked.
When a worker shuts down cleanly, the worker logs should contain entries such as the following:
In the 1.0.x line of releases (and presumably 1.x, though I haven't checked) this does not happen – the worker shutdown process appears to get stuck shutting down executors (https://github.com/apache/storm/blob/v1.0.2/storm-core/src/clj/org/apache/storm/daemon/worker.clj#L666), no further log messages are seen in the worker log, and worker hooks do not run.
There are two properties that affect how workers exit. The first is the configuration property supervisor.worker.shutdown.sleep.secs, which defaults to 1 second. This corresponds to how long the supervisor will wait for a worker to exit gracefully before forcibly killing it with kill -9. When this happens the supervisor will log that the worker terminated with exit code 137 (128 + 9).
The second property is a hard-coded 1 second delay (https://github.com/apache/storm/blob/v1.0.2/storm-core/src/clj/org/apache/storm/util.clj#L463) added as a shutdown hook that will call Runtime.halt() if the delay is exceeded. When this happens, the supervisor will log that the worker terminated with exit code 20 (hard-coded).
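The general pattern behind such a forced-halt hook can be sketched in Java as follows. This is a minimal illustration only, not Storm's actual Clojure code: the class name ForcedHaltHook and its methods are hypothetical, and the sketch assumes the hook simply sleeps for the delay and then halts. Only Runtime.halt() and the exit code 20 come from the description above.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of a "sleep, then halt" shutdown hook.
// Class and method names are assumptions, not part of Storm.
final class ForcedHaltHook {
    // Exit code reported by the supervisor when the forced halt fires.
    static final int FORCED_HALT_EXIT_CODE = 20;

    /** Builds, but does not register, a thread that halts the JVM after delaySecs. */
    static Thread newHaltHook(long delaySecs) {
        return new Thread(() -> {
            try {
                TimeUnit.SECONDS.sleep(delaySecs);
            } catch (InterruptedException e) {
                // Interrupted before the delay elapsed: give up without halting.
                Thread.currentThread().interrupt();
                return;
            }
            // halt() terminates immediately and does not wait for other hooks.
            Runtime.getRuntime().halt(FORCED_HALT_EXIT_CODE);
        }, "forced-halt-hook");
    }

    /** Registers the hook; it only starts running once JVM shutdown begins. */
    static Thread install(long delaySecs) {
        Thread hook = newHaltHook(delaySecs);
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

JVM shutdown hooks run concurrently, so a timer like this runs alongside any cleanup hook. Any cleanup still running when the delay elapses is cut off.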
Side Note: The hard-coded halt delay in worker.clj and the default value of supervisor.worker.shutdown.sleep.secs are both 1 second. This should probably be changed, since it creates a race between the supervisor's delay and the worker's delay.
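The interaction between the two delays can be sketched as a small model. It is a simplification under stated assumptions: the worker never finishes shutting down on its own, and both timers start at the same moment. The class WorkerExitModel and its members are hypothetical names used only for illustration.

```java
// Hypothetical model of which delay ends a stuck worker first.
// Names are assumptions for illustration; nothing here is Storm API.
final class WorkerExitModel {
    static final int SIGKILL_EXIT_CODE = 128 + 9;   // 137: supervisor ran kill -9
    static final int FORCED_HALT_EXIT_CODE = 20;    // worker hook called Runtime.halt()

    enum Outcome { HALTED_BY_WORKER_HOOK, KILLED_BY_SUPERVISOR, RACE }

    /** Predicts which side terminates a worker that is stuck in shutdown. */
    static Outcome predict(long supervisorSleepSecs, long workerHaltDelaySecs) {
        if (workerHaltDelaySecs < supervisorSleepSecs) return Outcome.HALTED_BY_WORKER_HOOK;
        if (supervisorSleepSecs < workerHaltDelaySecs) return Outcome.KILLED_BY_SUPERVISOR;
        return Outcome.RACE; // equal delays: either side may win
    }

    /** Maps an exit code logged by the supervisor to its likely cause. */
    static String describe(int exitCode) {
        if (exitCode == SIGKILL_EXIT_CODE) return "killed by supervisor (kill -9)";
        if (exitCode == FORCED_HALT_EXIT_CODE) return "halted by worker shutdown hook";
        return "other exit code " + exitCode;
    }
}
```

With the defaults (1 second on both sides) the model returns RACE. With a supervisor delay of 15 seconds and a halt delay of 1 second it returns HALTED_BY_WORKER_HOOK, which corresponds to exit code 20.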
To test this, I set supervisor.worker.shutdown.sleep.secs to 15 to allow plenty of time for the worker to exit gracefully, and deployed and killed a topology. In this case the supervisor consistently reported exit code 20 for the worker, indicating the hard-coded shutdown hook caused the worker to exit.
I thought the hard-coded 1 second shutdown hook delay might not be long enough for the worker to shut down cleanly. To test that hypothesis, I changed the hard-coded delay to 10 seconds, leaving supervisor.worker.shutdown.sleep.secs at 15 seconds. Again the supervisor reported an exit code of 20 for the worker, and there were no log messages indicating that the worker had exited cleanly or that the worker hook had run.