Details
- Type: Umbrella
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
A quick description:
Currently, the driver and executors communicate over an insecure channel, so anyone who can listen on the network can see the traffic. That prevents Spark from adding some features securely (e.g. SPARK-5342, SPARK-5682) without resorting to internal Hadoop APIs.
Spark 1.3.0 will add SSL support, but properly configuring SSL is not a trivial task for operators, let alone users.
In light of that, we should add a more transparent secure transport layer. I've written a short spec identifying the areas in Spark that need work to achieve this, and I'll attach the document to this issue shortly.
Note that I'm restricting things to YARN for now because, as far as I know, it's the only cluster manager that provides the security features needed to bootstrap the secure Spark transport. The design itself doesn't rely on YARN per se, just on a secure way to distribute the initial secret (which the YARN/HDFS combo provides).
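To illustrate why a securely distributed initial secret is enough to bootstrap trust, here is a minimal sketch of an HMAC challenge-response check between two peers that already share a secret. It is not taken from the spec and is not how Spark implements this. The function names (`new_challenge`, `respond`, `verify`) are hypothetical, and the sketch only covers peer authentication, not encryption of the channel.

```python
import hashlib
import hmac
import secrets


def new_challenge() -> bytes:
    """Verifier side: generate a fresh random nonce for each handshake."""
    return secrets.token_bytes(32)


def respond(secret: bytes, challenge: bytes) -> bytes:
    """Prover side: show knowledge of the secret without sending it."""
    return hmac.new(secret, challenge, hashlib.sha256).digest()


def verify(secret: bytes, challenge: bytes, response: bytes) -> bool:
    """Verifier side: recompute the MAC and compare in constant time."""
    expected = hmac.new(secret, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


# Assumption: the secret reached both peers out of band over a trusted path
# (the role the issue assigns to the YARN/HDFS combination).
shared_secret = secrets.token_bytes(32)

challenge = new_challenge()
# A peer holding the right secret passes the check.
ok = verify(shared_secret, challenge, respond(shared_secret, challenge))
# A peer holding a different secret fails it.
bad = verify(shared_secret, challenge, respond(secrets.token_bytes(32), challenge))
```

The point of the sketch is that nothing in it depends on the cluster manager: any mechanism that delivers `shared_secret` to both sides without exposing it on the network would do.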