Thanks for the review Jason Lowe!
I am almost done with a unit test, and am going to update the patch to include it soon. Yes, that's the approach I took (although it took a fair amount of mocking). With the test, you can easily reproduce the bug against the existing code, and confirm the fix.
The System.exit call should be ExitUtil.terminate instead, as it allows unit tests to disable the system exit in case something goes wrong during the test or to verify exit behavior if that's desired.
Good point. I'll change it to ExitUtil.terminate.
I don't see any attempt to stop the task when CONTAINER_REMOTE_CLEANUP is received. Before it didn't make sense to do so because by the time we received it there was no task running. Now that there could be, I think we would want to track the Future for each task submitted that's in-flight and attempt to cancel the running task when the cleanup event is received.
That is quite interesting. In fact, that's the very first approach I took, but when I ran the TestUberAM unit tests, interestingly it resulted in all unit tests hanging indefinitely. I spent some time to try to figure out why it was hanging but wasn't successful.
Anyhow, the reason I arrived at the current fix is that once the state transition is completed, LocalContainerLauncher.stop() is invoked, and that shuts down the ExecutorService for the task runner. And that shut down interrupts any running task. So the end result is pretty much the same.
Let me know what you think. If needed, I can look into the TestUberAM issue when cancelling the future task one more time.