Details
Type: Test
Status: Closed
Priority: Major
Resolution: Fixed
Description
First test: generic placeholder pre-emption.
Create a job with placeholders whose total size is small enough to fit in the queue, but larger than the free space of the queue.
This leaves the placeholders running for a long time, as they are only removed when they time out.
Ideally this produces a node that is fully loaded with gang placeholders.
Then create a daemon set that must run on that node.
The daemon set pod must be large enough that it does not fit on the node.
That should trigger the placeholder pre-emption.
Before the fix: the placeholder data for the app does not show that the placeholder was removed.
After the fix: the placeholder data for the app shows a removed placeholder.
Second test: node removal with placeholders.
Create a job with placeholders whose total size is small enough to fit in the queue, but larger than the free space of the queue.
This leaves the placeholders running for a long time, as they are only removed when they time out.
Remove a node that has at least 1 placeholder on it.
Before the fix: the placeholder data for the app does not show that the placeholder was removed.
After the fix: the placeholder data for the app shows a removed placeholder.
Third test: kill a placeholder.
Create a job with placeholders whose total size is small enough to fit in the queue, but larger than the free space of the queue.
This leaves the placeholders running for a long time, as they are only removed when they time out.
Mimic a removal of the placeholder via kubectl by creating an allocation release request and sending it to the partition with the termination type STOPPED_BY_RM.
Before the fix: the placeholder data for the app does not show that the placeholder was removed.
After the fix: the placeholder data for the app shows a removed placeholder.
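All three tests end with the same before/after check on the placeholder data. The following is only a toy model of that check, not YuniKorn code: the `PlaceholderData` record, its `removed` counter, the `App` class and the `account_removals` switch are all hypothetical names invented for illustration. Only the termination type name STOPPED_BY_RM comes from the description above.

```python
from dataclasses import dataclass

# Termination type named in the third test; used here as a plain string.
STOPPED_BY_RM = "STOPPED_BY_RM"


@dataclass
class PlaceholderData:
    """Hypothetical per-task-group record, not the real scheduler type."""
    task_group: str
    count: int = 0    # placeholders created for the task group
    removed: int = 0  # placeholders removed without being replaced


class App:
    """Hypothetical application that tracks its placeholders."""

    def __init__(self, account_removals: bool):
        # False models the behaviour before the fix, True after the fix.
        self.account_removals = account_removals
        self.placeholders = {}  # allocation key -> task group name
        self.data = {}          # task group name -> PlaceholderData
        self.release_log = []   # (allocation key, termination type)

    def add_placeholder(self, key: str, task_group: str) -> None:
        self.placeholders[key] = task_group
        record = self.data.setdefault(task_group, PlaceholderData(task_group))
        record.count += 1

    def release(self, key: str, termination_type: str) -> None:
        """Drop a placeholder and, after the fix, account for the removal."""
        task_group = self.placeholders.pop(key)
        self.release_log.append((key, termination_type))
        if self.account_removals:
            self.data[task_group].removed += 1


def removed_after_kill(account_removals: bool) -> int:
    """Create six placeholders, kill one, return the removed counter."""
    app = App(account_removals)
    for i in range(6):
        app.add_placeholder(f"ph-{i}", "tg-1")
    app.release("ph-0", STOPPED_BY_RM)
    return app.data["tg-1"].removed


before_fix = removed_after_kill(False)  # removal is not visible in the data
after_fix = removed_after_kill(True)    # removal shows up in the data
```

The same `release` call could stand in for the pre-emption and node removal cases; only the trigger differs between the three tests.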
Working config:
queue quota max size: 16 GB / 16 CPU
nodes: 2 * 8 GB / 8 CPU
create an application with an allocation of 4 GB / 4 CPU
create a gang application requesting 7 * 2 GB / 2 CPU
create a daemon set pod for one of the nodes asking for 1 GB / 1 CPU
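The arithmetic behind this config can be checked with a short sketch. It only uses the numbers listed above; the `Resource` class and the helper names are made up for illustration, and the sketch assumes the scheduler places as many placeholders as the queue's free space allows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    memory_gb: int
    cpu: int

    def __add__(self, other: "Resource") -> "Resource":
        return Resource(self.memory_gb + other.memory_gb, self.cpu + other.cpu)

    def __sub__(self, other: "Resource") -> "Resource":
        return Resource(self.memory_gb - other.memory_gb, self.cpu - other.cpu)

    def scale(self, n: int) -> "Resource":
        return Resource(self.memory_gb * n, self.cpu * n)

    def fits_in(self, other: "Resource") -> bool:
        return self.memory_gb <= other.memory_gb and self.cpu <= other.cpu


def max_units(unit: Resource, free: Resource) -> int:
    """How many copies of unit fit into free, limited by the scarcest resource."""
    return min(free.memory_gb // unit.memory_gb, free.cpu // unit.cpu)


queue_max = Resource(16, 16)
cluster = Resource(8, 8).scale(2)  # two nodes of 8 GB / 8 CPU
app = Resource(4, 4)
placeholder = Resource(2, 2)
gang_count = 7
daemon_pod = Resource(1, 1)

gang_total = placeholder.scale(gang_count)  # 14 GB / 14 CPU
queue_free = queue_max - app                # 12 GB / 12 CPU

fits_queue = gang_total.fits_in(queue_max)       # True: small enough for the queue
fits_free_space = gang_total.fits_in(queue_free) # False: larger than the free space

placed = min(gang_count, max_units(placeholder, queue_free))  # 6 placeholders
pending = gang_count - placed                                 # 1 placeholder waits

used = app + placeholder.scale(placed)       # 16 GB / 16 CPU
cluster_free = cluster - used                # 0 GB / 0 CPU
daemon_fits = daemon_pod.fits_in(cluster_free)  # False: something must be pre-empted
```

With six of the seven placeholders placed, the gang is never complete, so the placeholders stay until they time out, and the cluster has no room left for the 1 GB / 1 CPU daemon set pod.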
Attachments
Issue Links
- split from YUNIKORN-1395 Account for preempted placeholder in the placeholder data (Closed)