To answer Sanjay's questions in HADOOP-6484:
Benefits and target audience: Is this work targeted for managing/running hadoop for developers or for production use? Briefly describe the benefits.
- It could be used for better deployment and management of the services in small clusters, where the memory requirements of the NN and JT aren't great. Deploying the entire set of services for a worker node, or for a (single) master node, in a single process would result in a lighter system load.
- If the TT started (marked) tasks within the OSGi container (or a preloaded peer OSGi container), Map and Reduce jobs would be able to execute without all the JVM startup delays.
Besides adding the manifests to jar files will it require adding more invasive changes such as special interfaces for stopping and starting hadoop daemons?
- Adding the headers will have no impact on the existing daemons, because they don't run in an OSGi container.
- None of the Hadoop code plays games with classloaders either, and classloading is one thing that OSGi does differently.
- HADOOP-5731 shows a problem which existed when trying to run IPC under a security manager; this may be a barrier to OSGi container use. If it exists client-side, and is still there after a switch to protobuf everywhere, that is something that may need fixing anyway.
- The MRv2 service model could be re-used by some OSGi helper code that manages the lifecycle of things, because you no longer need per-service code to start/stop services.
- I'd expect there to be some new entry points needed to start the services under OSGi, but they should be wrapper layers on the existing code. If they depended on OSGi services they could be off to one side; if they needed to be in the same package as existing stuff, things might get trickier.
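
To make the wrapper-layer idea concrete, here is a minimal sketch of generic lifecycle helper code in plain Java. Everything in it is an assumption for illustration: `ManagedService`, `ServiceLifecycleManager` and `RecordingService` are hypothetical names, not the MRv2 service API and not OSGi classes. It only shows why a common start/stop contract removes the need for per-service management code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical minimal lifecycle contract; NOT Hadoop's actual service API. */
interface ManagedService {
    String name();
    void start();
    void stop();
}

/** Starts services in registration order and stops them in reverse order. */
class ServiceLifecycleManager {
    private final List<ManagedService> registered = new ArrayList<>();
    private final Deque<ManagedService> started = new ArrayDeque<>();

    void register(ManagedService service) {
        registered.add(service);
    }

    /** If any service fails to start, stop the ones already running and rethrow. */
    void startAll() {
        for (ManagedService service : registered) {
            try {
                service.start();
                started.push(service); // most recently started ends up on top
            } catch (RuntimeException e) {
                stopAll();
                throw e;
            }
        }
    }

    /** Stops started services, last started first; one failure does not block the rest. */
    void stopAll() {
        while (!started.isEmpty()) {
            ManagedService service = started.pop();
            try {
                service.stop();
            } catch (RuntimeException e) {
                System.err.println("Failed to stop " + service.name() + ": " + e);
            }
        }
    }
}

/** Stand-in for a daemon wrapper; records events so the ordering is visible. */
class RecordingService implements ManagedService {
    private final String name;
    private final List<String> events;

    RecordingService(String name, List<String> events) {
        this.name = name;
        this.events = events;
    }

    public String name() { return name; }
    public void start() { events.add("start " + name); }
    public void stop() { events.add("stop " + name); }
}

class LifecycleDemo {
    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        ServiceLifecycleManager manager = new ServiceLifecycleManager();
        // A worker node hosting both daemons in a single process.
        manager.register(new RecordingService("datanode", events));
        manager.register(new RecordingService("tasktracker", events));
        manager.startAll();
        manager.stopAll();
        // prints [start datanode, start tasktracker, stop tasktracker, stop datanode]
        System.out.println(events);
    }
}
```

Under OSGi, an entry point along these lines could presumably be a `BundleActivator` whose `start()` and `stop()` methods delegate to `startAll()` and `stopAll()`, which would keep the OSGi dependency off to one side of the existing daemon code.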
Will this be used for management after deployment has been done through some other mechanism or will this work also enable the deployment in a cluster?
- Karaf is interesting in that it is not only yet another OSGi container, but one with a built-in SSHD (one that works on Windows too, which doesn't normally ship with an sshd), so anyone can ssh in remotely, authenticate themselves and issue management commands: start/stop services, see logs, etc.:
http://felix.apache.org/site/41-console-and-commands.html
- I wonder if you can get at the logs through Karaf, including any from jobs stored on the workers. That would be useful.
- Karaf itself doesn't do remote deployment, AFAIK. One option would be to bring up a ZooKeeper client on each Karaf instance and have it wait for instructions via ZK.
Overall, I think it could be good: adding the headers is low risk and the other features could be useful, though it will take some work to see what problems arise.