Details
- Type: New Feature
- Status: Resolved
- Priority: Major
- Resolution: Incomplete
- Affects Version/s: 2.4.4
- Fix Version/s: None
Description
Today cluster managers are bundled with Spark, and it is hard to add new ones. The Kubernetes cluster manager was built in a fork of the code and then brought into Spark, and a lot of work is still going on with it. It could ship more often if Spark had a pluggable way to bring in cluster managers. This would also benefit enterprise companies that have their own cluster managers which are not open source and therefore cannot be part of Spark itself.
High-level ideas to be discussed (additional options welcome):
1. Make the cluster manager pluggable.
2. Ship the Spark Standalone cluster manager with Spark by default and make it the base cluster manager that others can inherit from. Other cluster managers may or may not ship with Spark at the same time.
3. Each cluster manager can ship additional jars that are placed inside the Spark distribution; a configuration file then defines which cluster manager Spark runs with.
4. The configuration file defines which classes to use for the various parts. It can reuse the classes from the Spark Standalone cluster manager or name different ones (a sketch of such a file follows this list).
5. For the classes that Spark allows to be switched out, code like the following could be used to load a different class:
// The class name is read from the configuration file rather than hard-coded.
val clazz = Class.forName("<scheduler class name from the configuration file>")
// The replacement scheduler is expected to have a constructor taking a SparkContext.
val cons = clazz.getConstructor(classOf[SparkContext])
cons.newInstance(sc).asInstanceOf[TaskSchedulerImpl]
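
To make items 3 and 4 concrete, a cluster manager's configuration file could simply map component roles to class names. This is only a sketch, not an existing Spark API; the file name, property keys, and class names are hypothetical.

# cluster-manager.properties (hypothetical)
spark.clusterManager.name = my-cluster-manager
spark.clusterManager.taskScheduler.class = com.example.scheduler.MyTaskScheduler
spark.clusterManager.schedulerBackend.class = com.example.scheduler.MySchedulerBackend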
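
Item 5 could then be generalized into a small loader that reads such a configuration file and instantiates the named classes reflectively. This sketch assumes the replacement scheduler extends TaskSchedulerImpl and has a constructor taking a SparkContext; since TaskSchedulerImpl is private[spark], code like this would have to live inside Spark itself. PluggableClusterManagerLoader and its methods are made-up names for the example.

import java.io.FileInputStream
import java.util.Properties
import org.apache.spark.SparkContext
import org.apache.spark.scheduler.TaskSchedulerImpl

object PluggableClusterManagerLoader {
  // Read the hypothetical cluster manager configuration file.
  def loadProperties(path: String): Properties = {
    val props = new Properties()
    val in = new FileInputStream(path)
    try props.load(in) finally in.close()
    props
  }

  // Instantiate whatever scheduler class the configuration file names.
  def createTaskScheduler(sc: SparkContext, props: Properties): TaskSchedulerImpl = {
    val className = props.getProperty("spark.clusterManager.taskScheduler.class")
    val clazz = Class.forName(className)
    val cons = clazz.getConstructor(classOf[SparkContext])
    cons.newInstance(sc).asInstanceOf[TaskSchedulerImpl]
  }
}

Usage would be roughly loading the properties file and then calling createTaskScheduler(sc, props); the same pattern extends to the scheduler backend and any other component that is allowed to be swapped out.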