[SPARK-30949] Driver cores in kubernetes are coupled with container resources, not spark.driver.cores - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 3.1.0
Fix Version/s: 3.1.0
Component/s: Kubernetes, Spark Core
Labels: None

Description

Drivers submitted in Kubernetes cluster mode set the parallelism of components such as 'RpcEnv', 'MemoryManager', and 'BlockManager' by inferring the number of available cores from a call to:

Runtime.getRuntime().availableProcessors()

With this call, Spark applications running on Java 8 or older incorrectly get the total number of cores on the host, ignoring the cgroup limits set by Kubernetes (https://bugs.openjdk.java.net/browse/JDK-6515172). Java 9 and newer runtimes do not have this problem.
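To check what a given runtime reports, a small probe like the one below can be run inside the driver container. It is only a diagnostic sketch: the class name `AvailableCores` and the helper `inferredParallelism` are hypothetical and are not part of Spark.

```java
// Diagnostic sketch: prints the core count the JVM reports.
// AvailableCores and inferredParallelism are illustrative names, not Spark APIs.
class AvailableCores {

    // Returns the value that components would see if they sized
    // themselves from the JVM's view of the machine.
    static int inferredParallelism() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("java.version        = " + System.getProperty("java.version"));
        System.out.println("availableProcessors = " + inferredParallelism());
    }
}
```

Comparing the printed value with the CPU limit set on the container shows whether the runtime in use honors that limit.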

Orthogonal to this, it is currently not possible to decouple the resource limits of the driver container from the parallelism of the network and memory components listed above.

My proposal is to use the 'spark.driver.cores' configuration to determine the parallelism, as we do for YARN (https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SparkContext.scala#L2762-L2767). This will let users specify 'spark.driver.cores' to set parallelism and 'spark.kubernetes.driver.request.cores' to limit the resource requests of the driver container. It will also remove the need to call 'availableProcessors()', so the same number of cores will be used for parallelism regardless of the Java runtime version.
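The idea can be sketched as a lookup that prefers the configured value over the JVM's own view. This is an illustration in Java, not the Spark implementation (which is written in Scala). The class `DriverCores`, the method `resolve`, and the fallback behavior for a missing or non-positive value are assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: DriverCores and resolve are hypothetical names.
class DriverCores {

    // Prefer the explicit 'spark.driver.cores' setting. If it is absent or
    // not positive, fall back to the supplied value (an assumption made for
    // this sketch; the caller might pass availableProcessors()).
    static int resolve(Map<String, String> conf, int fallbackCores) {
        String raw = conf.get("spark.driver.cores");
        if (raw == null) {
            return fallbackCores;
        }
        int cores = Integer.parseInt(raw.trim());
        return cores > 0 ? cores : fallbackCores;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("spark.driver.cores", "4");

        int fallback = Runtime.getRuntime().availableProcessors();
        // With the setting present, the result does not depend on the JVM version.
        System.out.println("driver parallelism = " + resolve(conf, fallback));
    }
}
```

With the setting present, the parallelism is the same on every Java runtime, and the container's resource request can be chosen separately.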

Issue Links

links to

GitHub Pull Request #27695

People

Assignee: Onur Satici

Reporter: Onur Satici

Votes: 0

Watchers: 2

Dates

Created: 25/Feb/20 16:36

Updated: 26/Sep/20 23:27

Resolved: 21/Apr/20 04:33