Child jvm, before exiting, should try and cleanup/kill all its sub-processes
I am not sure about this. So, this will be done like in the finally block of the Child by sending a kill -pid to itself ?
Once a jvm is spawned, its session id should be persisted to task-tracker's private folder (TaskTracker.SUBDIR/pid with 700 permission?)
What's the structure under TaskTracker.SUBDIR/pid ? I suppose the right solution is to use jvm-id.pid. Another solution could be taskAttemptId.[cleanup].pid. If we use taskAttemptId, we should take care of JVM Reuse. If we use jvm-id.pid, when a task is being killed, we can lookup the jvm-id for the task and then pick up the right pid file. Would this work ? Permissions for each of the files should be 600 owned by tasktracker.
Once the jvm exits, this pid file should be deleted
+1. Should be done by the TT.
Upon restart, the pid files in the private folder should be cleaned up (under appropriate owner permissions)
This should be done after sending the kill signal to the files in the folder - because they are all potentially running tasks - the reason of this bug.
pid files should have sufficient information to reconstruct jvm-context object which is required by LinuxTaskController to kill the process under user permission.
We should just follow the path of TaskController.killTaskJVM - that will ensure it will work for all task controllers. Setting permissions to 600 for the pid files owned by TT should be fine.