I added extra guard logic for a long-idle ShellSpout (STORM-1928), which resolves that issue, but I found a race condition in the new logic:
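1. lastHeartbeat has not been updated for longer than the timeout because the spout was inactive
2. querySubprocess() is called
3. waitingOnSubprocess is set to true
4. HeartbeatTimerTask.run() fires before the heartbeat is updated (that is, before a message arrives from the subprocess), so it sees waitingOnSubprocess set together with a stale lastHeartbeat and treats the subprocess as timed out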
The simplest approach is to update the heartbeat before setting waitingOnSubprocess to true. The last heartbeat time then no longer comes only from the subprocess, but that does no harm.
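
The race and the proposed fix can be sketched roughly as follows. This is a simplified, single-threaded model, not the actual ShellSpout source: the class HeartbeatGuardSketch, its method names, and the explicit nowMillis parameters are all hypothetical, and the timeout check is assumed to be "waiting on the subprocess and last heartbeat older than the timeout".

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the heartbeat guard; not the real Storm code.
class HeartbeatGuardSketch {
    private final long timeoutMillis;
    private final boolean refreshBeforeWaiting; // true = apply the proposed fix
    private final AtomicLong lastHeartbeat = new AtomicLong();
    private final AtomicBoolean waitingOnSubprocess = new AtomicBoolean(false);

    HeartbeatGuardSketch(long timeoutMillis, boolean refreshBeforeWaiting, long startMillis) {
        this.timeoutMillis = timeoutMillis;
        this.refreshBeforeWaiting = refreshBeforeWaiting;
        this.lastHeartbeat.set(startMillis);
    }

    // Called just before the subprocess is queried (steps 2 and 3 above).
    void markWaitingSubprocess(long nowMillis) {
        if (!waitingOnSubprocess.get()) {
            if (refreshBeforeWaiting) {
                // Proposed fix: refresh the heartbeat before raising the flag.
                lastHeartbeat.set(nowMillis);
            }
            waitingOnSubprocess.compareAndSet(false, true);
        }
    }

    // Assumed form of the check the timer task performs (step 4 above).
    boolean timerWouldTimeOut(long nowMillis) {
        return waitingOnSubprocess.get()
                && nowMillis - lastHeartbeat.get() > timeoutMillis;
    }

    public static void main(String[] args) {
        long timeout = 1000L;
        long idleUntil = 5000L; // spout was idle far longer than the timeout

        HeartbeatGuardSketch racy = new HeartbeatGuardSketch(timeout, false, 0L);
        racy.markWaitingSubprocess(idleUntil);
        System.out.println("without fix: " + racy.timerWouldTimeOut(idleUntil + 1));

        HeartbeatGuardSketch fixed = new HeartbeatGuardSketch(timeout, true, 0L);
        fixed.markWaitingSubprocess(idleUntil);
        System.out.println("with fix: " + fixed.timerWouldTimeOut(idleUntil + 1));
    }
}
```

In this model the timer check right after the flag is raised reports a timeout only when the heartbeat was not refreshed first. With the fix, a subprocess that really never answers is still detected once the full timeout has elapsed after the query.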