Details
Task
Status: Open
Major
Resolution: Unresolved
backlog
None
Description
We might need to address Kerberization and identity management in Bigtop at some point...
- A concrete reason is that the new Hadoop versions require Kerberos in order to use the LinuxContainerExecutor (LCE). This is the alternative to the default YARN container executor, which just spins up a new JVM; the LCE actually logs in as the user submitting the job and runs with that user's permissions at the POSIX level.
- Non-HDFS FileSystems require POSIX identities, not just user name strings as in HDFS. So securely supporting HDFS alternatives in YARN jobs requires the LinuxContainerExecutor.
- Another reason is that enterprises are moving towards first-class ID management with Hadoop. We can leverage existing identity management tooling to make this a reality in Bigtop as well.
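To make the second point concrete, here is a minimal sketch of the difference between a bare user name string and a POSIX identity. It is only an illustration, not part of Bigtop or Hadoop: the helper name resolve_posix_identity and the sample user names are hypothetical. It uses Python's standard pwd module, which goes through the system's name service lookup, so on a host where an identity provider is wired into that lookup the same call should also see centrally managed users.

```python
import pwd
from typing import NamedTuple, Optional


class PosixIdentity(NamedTuple):
    """The numeric identity the operating system actually enforces."""
    name: str
    uid: int
    gid: int


def resolve_posix_identity(user_name: str) -> Optional[PosixIdentity]:
    """Map a user name string to a POSIX identity on this host.

    Returns None when the host does not know the user, which is the
    case where a job could not be run with that user's permissions.
    (Hypothetical helper, for illustration only.)
    """
    try:
        entry = pwd.getpwnam(user_name)
    except KeyError:
        return None
    return PosixIdentity(entry.pw_name, entry.pw_uid, entry.pw_gid)


if __name__ == "__main__":
    # "no-such-hadoop-user" is a made-up name that should not resolve.
    for name in ("root", "no-such-hadoop-user"):
        print(name, "->", resolve_posix_identity(name))
```

A user name that only exists as a string in a job submission would come back as None here, which is the gap that shared identity management across the cluster nodes is meant to close.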
plinnell and cos: I think FreeIPA makes it super easy to use DNS + LDAP + Kerberos together. And I think in the enterprise, we will see an increasing number of folks wanting to use it in their Hadoop workloads. We've already seen how tricky DNS can be for HBase anyway. So, I actually think a FreeIPA-enabled Bigtop distro might be a pretty valuable artifact for the community.
Now... Cos has mentioned some other intriguing ideas around YARN as well. In any case, let's hash out how identities and Kerberos should be managed, if at all, in Bigtop.