[SPARK-31173] Spark Kubernetes add tolerations and nodeName support - ASF JIRA

Details

Type: New Feature
Status: Open
Priority: Trivial
Resolution: Unresolved
Affects Version/s: 2.4.6, 3.1.0
Fix Version/s: None
Component/s: Kubernetes, Spark Core
Labels:
- features
Environment: Alibaba Cloud ACK with Spark operator (v1beta2-1.1.0-2.4.5) and Spark (2.4.5)
Flags: Patch


Description

When you run Spark on a serverless Kubernetes cluster (virtual-kubelet), you need to specify nodeSelectors, tolerations, and even nodeName to get better scheduling performance. Currently Spark doesn't support tolerations. To use them, you must rely on an admission controller webhook to decorate the pod, but the performance is extremely bad. Here is the benchmark.

With webhook

Batch Size: 500
Pod creation: about 7 Pods/s
All Pods running: 5 min

Without webhook

Batch Size: 500
Pod creation: more than 500 Pods/s
All Pods running: 45 s
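
To put the two runs side by side, the reported figures can be turned into rough creation times and a speedup ratio. This is only back-of-the-envelope arithmetic on the numbers above; the helper name `creation_seconds` is made up for illustration, and "more than 500 Pods/s" is treated as exactly 500, so the second creation time is an upper bound.

```python
BATCH_SIZE = 500  # pods per batch, from the benchmark above


def creation_seconds(batch_size, pods_per_second):
    """Time needed to create a batch at a steady creation rate."""
    return batch_size / pods_per_second


# With webhook: about 7 Pods/s.
with_webhook = creation_seconds(BATCH_SIZE, 7)

# Without webhook: "more than 500 Pods/s", so this is an upper bound.
without_webhook = creation_seconds(BATCH_SIZE, 500)

# Time until all pods are running: 5 min versus 45 s.
running_ratio = (5 * 60) / 45

print(f"creation with webhook:    ~{with_webhook:.0f} s")
print(f"creation without webhook: <= {without_webhook:.0f} s")
print(f"all-pods-running ratio:   ~{running_ratio:.1f}x")
```

By these figures, pod creation alone takes over a minute with the webhook in the path, and the whole batch reaches the running state roughly 6.7 times slower.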

Adding tolerations and nodeName support to Spark would help greatly when running large-scale jobs on a serverless Kubernetes cluster.
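
For readers unfamiliar with the fields involved, here is a minimal sketch of the decoration the webhook currently applies and that Spark would apply itself if this feature were added. It works on a plain dictionary shaped like a Pod manifest and is not Spark or webhook code. The node name, the taint key, the pod name, and the image are all assumptions chosen for illustration; the issue does not specify them.

```python
import copy

# Hypothetical values for illustration; the issue does not specify them.
VIRTUAL_NODE_NAME = "virtual-kubelet"
VIRTUAL_NODE_TOLERATION = {
    "key": "virtual-kubelet.io/provider",
    "operator": "Exists",
    "effect": "NoSchedule",
}


def decorate_pod(pod, node_name=VIRTUAL_NODE_NAME,
                 toleration=VIRTUAL_NODE_TOLERATION):
    """Return a copy of a Pod manifest with a toleration and nodeName set."""
    decorated = copy.deepcopy(pod)
    spec = decorated.setdefault("spec", {})
    tolerations = spec.setdefault("tolerations", [])
    # Avoid adding the same toleration twice.
    if toleration not in tolerations:
        tolerations.append(dict(toleration))
    # A pod with nodeName set is bound to that node directly,
    # without going through the scheduler.
    spec["nodeName"] = node_name
    return decorated


# Hypothetical executor pod, as it would look before decoration.
executor_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "spark-exec-1"},
    "spec": {"containers": [{"name": "executor", "image": "spark:2.4.5"}]},
}

decorated = decorate_pod(executor_pod)
print(decorated["spec"]["nodeName"])
print(decorated["spec"]["tolerations"])
```

If Spark set these fields when it builds the pod, no admission webhook would need to sit in the pod creation path.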

People

Assignee: Unassigned
Reporter: zhongwei liu
Votes: 0
Watchers: 3

Dates

Created: 17/Mar/20 14:28
Updated: 12/Dec/22 18:10

Time Tracking

Estimated: 72h
Remaining: 72h
Logged: Not Specified