Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 2.3.0
Description
We currently request only spark.[driver,executor].memory as the container's memory request from Kubernetes (e.g., here).
The limit is set to spark.[driver,executor].memory + spark.kubernetes.[driver,executor].memoryOverhead.
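For illustration, that split looks roughly like the following Fabric8 Kubernetes client builder calls (the client Spark's Kubernetes backend uses). This is a simplified sketch, not the actual Spark code; the object name, container name, and memory values are made up.

{code:scala}
import io.fabric8.kubernetes.api.model.{ContainerBuilder, Quantity, QuantityBuilder}

// Simplified sketch of the current behaviour, with illustrative values; in Spark
// these come from spark.driver.memory and spark.kubernetes.driver.memoryOverhead.
object CurrentMemoryRequestSketch {
  val driverMemoryMiB: Long = 1024L
  val memoryOverheadMiB: Long = 384L

  // Only the JVM heap is requested from the scheduler...
  val memoryRequest: Quantity = new QuantityBuilder(false)
    .withAmount(s"${driverMemoryMiB}Mi")
    .build()

  // ...but the limit allows the container to grow to heap + overhead.
  val memoryLimit: Quantity = new QuantityBuilder(false)
    .withAmount(s"${driverMemoryMiB + memoryOverheadMiB}Mi")
    .build()

  val driverContainer = new ContainerBuilder()
    .withName("spark-kubernetes-driver")
    .withNewResources()
      .addToRequests("memory", memoryRequest)
      .addToLimits("memory", memoryLimit)
      .endResources()
    .build()
}
{code}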
This appears to be an incorrect use of Kubernetes.
The Kubernetes documentation section "How Pods with resource limits are run" states:
If a Container exceeds its memory request, it is likely that its Pod will be evicted whenever the node runs out of memory.
Thus, if the Spark driver/executor uses memory + memoryOverhead memory, it can be evicted. While an executor might get restarted (which would still be very bad performance-wise), the driver would be hard to recover.
I think Spark should be able to run within the resources it requests from Kubernetes (which are therefore guaranteed), without being in danger of termination and without having to rely on optionally available resources.
Thus, we should request memory + memoryOverhead memory from Kubernetes (and this should also be the limit).
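A minimal sketch of that change, under the same illustrative assumptions as above (made-up names and values, not the actual Spark code): both the request and the limit are set to memory + memoryOverhead.

{code:scala}
import io.fabric8.kubernetes.api.model.{ContainerBuilder, Quantity, QuantityBuilder}

// Sketch of the proposed behaviour, with the same illustrative values as above.
object ProposedMemoryRequestSketch {
  val driverMemoryMiB: Long = 1024L
  val memoryOverheadMiB: Long = 384L

  // Request and limit are the same quantity, so the scheduler reserves
  // everything the container is allowed to use.
  val memoryWithOverhead: Quantity = new QuantityBuilder(false)
    .withAmount(s"${driverMemoryMiB + memoryOverheadMiB}Mi")
    .build()

  val driverContainer = new ContainerBuilder()
    .withName("spark-kubernetes-driver")
    .withNewResources()
      .addToRequests("memory", memoryWithOverhead)
      .addToLimits("memory", memoryWithOverhead)
      .endResources()
    .build()
}
{code}

With the memory request equal to the limit (and CPU configured the same way), the pod would also qualify for the Guaranteed QoS class, making it one of the last candidates for eviction under node memory pressure.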