Details
-
Improvement
-
Status: Closed
-
Major
-
Resolution: Fixed
-
None
Description
The TestTryAllocatePreemptNode fails sometimes with a different allocation being preempted than the expected alloc1:
2023-06-07T12:40:35.035+1000 INFO objects/queue.go:139 configured queue added to scheduler {"queueName": "root"} 2023-06-07T12:40:35.035+1000 INFO objects/queue.go:1190 updating root queue max resources {"current max": "nil resource", "new max": "map[first:40]"} 2023-06-07T12:40:35.035+1000 DEBUG objects/queue.go:850 new parent queue inheriting template from parent queue {"child queue": "root.parent", "parent queue": "root"} 2023-06-07T12:40:35.035+1000 INFO objects/queue.go:139 configured queue added to scheduler {"queueName": "root.parent"} 2023-06-07T12:40:35.036+1000 DEBUG objects/queue.go:343 max resources setting ignored: cannot set zero max resources 2023-06-07T12:40:35.036+1000 DEBUG objects/queue.go:350 guaranteed resources setting ignored: cannot set zero max resources 2023-06-07T12:40:35.036+1000 INFO objects/queue.go:139 configured queue added to scheduler {"queueName": "root.unlimited"} 2023-06-07T12:40:35.036+1000 DEBUG objects/queue.go:343 max resources setting ignored: cannot set zero max resources2023-06-07T12:40:35.036+1000 INFO objects/queue.go:139 configured queue added to scheduler {"queueName": "root.parent.child1"} 2023-06-07T12:40:35.036+1000 DEBUG objects/queue.go:343 max resources setting ignored: cannot set zero max resources 2023-06-07T12:40:35.036+1000 INFO objects/queue.go:139 configured queue added to scheduler {"queueName": "root.parent.child2"} 2023-06-07T12:40:35.036+1000 INFO objects/application.go:184 Unknown gang scheduling style, using soft style as default {"gang scheduling style": ""} 2023-06-07T12:40:35.036+1000 DEBUG objects/queue.go:1780 Unknown application {"applicationID": "app-0"} 2023-06-07T12:40:35.037+1000 INFO objects/application_state.go:133 Application state transition {"appID": "app-0", "source": "New", "destination": "Accepted", "event": "runApplication"} 2023-06-07T12:40:35.037+1000 INFO objects/application.go:668 ask added successfully to application {"appID": "app-0", "ask": "alloc0-0", "placeholder": false, "pendingDelta": "map[first:11]"} 2023-06-07T12:40:35.037+1000 INFO objects/application.go:668 ask added successfully to application {"appID": "app-0", "ask": "alloc0-1", "placeholder": false, "pendingDelta": "map[first:11]"} 2023-06-07T12:40:35.037+1000 INFO objects/application.go:184 Unknown gang scheduling style, using soft style as default {"gang scheduling style": ""} 2023-06-07T12:40:35.037+1000 DEBUG objects/queue.go:1780 Unknown application {"applicationID": "app-1"} 2023-06-07T12:40:35.037+1000 INFO objects/application_state.go:133 Application state transition {"appID": "app-1", "source": "New", "destination": "Accepted", "event": "runApplication"} 2023-06-07T12:40:35.037+1000 INFO objects/application.go:668 ask added successfully to application {"appID": "app-1", "ask": "alloc1", "placeholder": false, "pendingDelta": "map[first:5]"} 2023-06-07T12:40:35.037+1000 INFO objects/application.go:668 ask added successfully to application {"appID": "app-1", "ask": "alloc2", "placeholder": false, "pendingDelta": "map[first:5]"} 2023-06-07T12:40:35.038+1000 INFO objects/application.go:184 Unknown gang scheduling style, using soft style as default {"gang scheduling style": ""} 2023-06-07T12:40:35.038+1000 DEBUG objects/queue.go:1780 Unknown application {"applicationID": "app-2"} 2023-06-07T12:40:35.038+1000 INFO objects/application_state.go:133 Application state transition {"appID": "app-2", "source": "New", "destination": "Accepted", "event": "runApplication"} 2023-06-07T12:40:35.038+1000 INFO objects/application.go:668 ask added successfully to application {"appID": "app-2", "ask": "alloc3", "placeholder": false, "pendingDelta": "map[first:5]"} 2023-06-07T12:40:35.038+1000 DEBUG objects/application.go:339 Application state timer initiated {"appID": "app-0", "state": "Starting", "timeout": "5m0s"} 2023-06-07T12:40:35.038+1000 INFO objects/application_state.go:133 Application state transition {"appID": "app-0", "source": "Accepted", "destination": "Starting", "event": "runApplication"} 2023-06-07T12:40:35.038+1000 DEBUG ugm/manager.go:63 Increasing resource usage {"user": "testuser", "queue path": "root.unlimited", "application": "app-0", "resource": "map[first:11]"} 2023-06-07T12:40:35.038+1000 DEBUG ugm/queue_tracker.go:57 Increasing resource usage {"queue path": "root.unlimited", "application": "app-0", "resource": "map[first:11]"} 2023-06-07T12:40:35.038+1000 DEBUG ugm/queue_tracker.go:45 Creating queue tracker object for queue {"queue": "unlimited"} 2023-06-07T12:40:35.038+1000 DEBUG ugm/queue_tracker.go:57 Increasing resource usage {"queue path": "unlimited", "application": "app-0", "resource": "map[first:11]"} 2023-06-07T12:40:35.038+1000 DEBUG ugm/manager.go:265 Group tracker already exists and linking (reusing) the same with application {"application": "app-0", "queue path": "root.unlimited", "user": "testuser", "group": "testgroup"} 2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:57 Increasing resource usage {"queue path": "root.unlimited", "application": "app-0", "resource": "map[first:11]"} 2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:45 Creating queue tracker object for queue {"queue": "unlimited"} 2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:57 Increasing resource usage {"queue path": "unlimited", "application": "app-0", "resource": "map[first:11]"} 2023-06-07T12:40:35.039+1000 DEBUG objects/application.go:388 Application state timer cleared {"appID": "app-0", "state": "Starting"} 2023-06-07T12:40:35.039+1000 INFO objects/application_state.go:133 Application state transition {"appID": "app-0", "source": "Starting", "destination": "Running", "event": "runApplication"} 2023-06-07T12:40:35.039+1000 DEBUG ugm/manager.go:63 Increasing resource usage {"user": "testuser", "queue path": "root.unlimited", "application": "app-0", "resource": "map[first:11]"} 2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:57 Increasing resource usage {"queue path": "root.unlimited", "application": "app-0", "resource": "map[first:11]"} 2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:57 Increasing resource usage {"queue path": "unlimited", "application": "app-0", "resource": "map[first:11]"} 2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:57 Increasing resource usage {"queue path": "root.unlimited", "application": "app-0", "resource": "map[first:11]"} 2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:57 Increasing resource usage {"queue path": "unlimited", "application": "app-0", "resource": "map[first:11]"} 2023-06-07T12:40:35.040+1000 DEBUG objects/application.go:339 Application state timer initiated {"appID": "app-1", "state": "Starting", "timeout": "5m0s"} 2023-06-07T12:40:35.040+1000 INFO objects/application_state.go:133 Application state transition {"appID": "app-1", "source": "Accepted", "destination": "Starting", "event": "runApplication"} 2023-06-07T12:40:35.040+1000 DEBUG ugm/manager.go:63 Increasing resource usage {"user": "testuser", "queue path": "root.parent.child1", "application": "app-1", "resource": "map[first:5]"} 2023-06-07T12:40:35.040+1000 DEBUG ugm/queue_tracker.go:57 Increasing resource usage {"queue path": "root.parent.child1", "application": "app-1", "resource": "map[first:5]"} 2023-06-07T12:40:35.040+1000 DEBUG ugm/queue_tracker.go:57 Increasing resource usage {"queue path": "parent.child1", "application": "app-1", "resource": "map[first:5]"} 2023-06-07T12:40:35.040+1000 DEBUG ugm/queue_tracker.go:57 Increasing resource usage {"queue path": "child1", "application": "app-1", "resource": "map[first:5]"} 2023-06-07T12:40:35.040+1000 DEBUG ugm/queue_tracker.go:57 Increasing resource usage {"queue path": "root.parent.child1", "application": "app-1", "resource": "map[first:5]"} 2023-06-07T12:40:35.040+1000 DEBUG ugm/queue_tracker.go:57 Increasing resource usage {"queue path": "parent.child1", "application": "app-1", "resource": "map[first:5]"} 2023-06-07T12:40:35.040+1000 DEBUG ugm/queue_tracker.go:57 Increasing resource usage {"queue path": "child1", "application": "app-1", "resource": "map[first:5]"} 2023-06-07T12:40:35.041+1000 DEBUG objects/application.go:388 Application state timer cleared {"appID": "app-1", "state": "Starting"} 2023-06-07T12:40:35.041+1000 INFO objects/application_state.go:133 Application state transition {"appID": "app-1", "source": "Starting", "destination": "Running", "event": "runApplication"} 2023-06-07T12:40:35.041+1000 DEBUG ugm/manager.go:63 Increasing resource usage {"user": "testuser", "queue path": "root.parent.child1", "application": "app-1", "resource": "map[first:5]"} 2023-06-07T12:40:35.041+1000 DEBUG ugm/queue_tracker.go:57 Increasing resource usage {"queue path": "root.parent.child1", "application": "app-1", "resource": "map[first:5]"} 2023-06-07T12:40:35.041+1000 DEBUG ugm/queue_tracker.go:57 Increasing resource usage {"queue path": "parent.child1", "application": "app-1", "resource": "map[first:5]"} 2023-06-07T12:40:35.041+1000 DEBUG ugm/queue_tracker.go:57 Increasing resource usage {"queue path": "child1", "application": "app-1", "resource": "map[first:5]"} 2023-06-07T12:40:35.041+1000 DEBUG ugm/queue_tracker.go:57 Increasing resource usage {"queue path": "root.parent.child1", "application": "app-1", "resource": "map[first:5]"} 2023-06-07T12:40:35.041+1000 DEBUG ugm/queue_tracker.go:57 Increasing resource usage {"queue path": "parent.child1", "application": "app-1", "resource": "map[first:5]"} 2023-06-07T12:40:35.041+1000 DEBUG ugm/queue_tracker.go:57 Increasing resource usage {"queue path": "child1", "application": "app-1", "resource": "map[first:5]"} 2023-06-07T12:40:35.041+1000 DEBUG objects/application.go:1454 app reservation check {"allocationKey": "alloc3", "createTime": "2023-06-07T12:40:25.038+1000", "askAge": "10.003814459s", "reservationDelay": "2s"} 2023-06-07T12:40:35.042+1000 DEBUG objects/application.go:1454 app reservation check {"allocationKey": "alloc3", "createTime": "2023-06-07T12:40:25.038+1000", "askAge": "10.003914s", "reservationDelay": "2s"} 2023-06-07T12:40:35.042+1000 DEBUG objects/application.go:1470 found candidate node for app reservation {"appID": "app-2", "nodeID": "node1", "allocationKey": "alloc3", "reservations": 0, "pendingRepeats": 1} 2023-06-07T12:40:35.042+1000 DEBUG objects/application.go:1454 app reservation check {"allocationKey": "alloc3", "createTime": "2023-06-07T12:39:55.038+1000", "askAge": "40.004148584s", "reservationDelay": "2s"} 2023-06-07T12:40:35.042+1000 DEBUG objects/application.go:1454 app reservation check {"allocationKey": "alloc3", "createTime": "2023-06-07T12:39:55.038+1000", "askAge": "40.004227875s", "reservationDelay": "2s"} 2023-06-07T12:40:35.042+1000 DEBUG objects/preemption.go:352 No RM callback plugin registered, using first selected node for preemption {"NodeID": "node2", "AllocationKey": "alloc3"} 2023-06-07T12:40:35.043+1000 ERROR objects/preemption.go:517 BUG: Didn't find instance type in the nodeInstanceTypeMap {"nodeId": "node2"} github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Preemptor).tryNodes /Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/preemption.go:517 github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Preemptor).TryPreemption /Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/preemption.go:537 github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).tryPreemption /Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/application.go:1327 github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).tryAllocate /Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/application.go:1050 github.com/apache/yunikorn-core/pkg/scheduler/objects.TestTryAllocatePreemptNode /Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/application_test.go:1874 testing.tRunner /usr/local/go/src/testing/testing.go:1576 2023-06-07T12:40:35.043+1000 INFO objects/preemption.go:559 Preempting task {"applicationID": "app-1", "allocationKey": "alloc2", "nodeID": "node2", "resources": "map[first:5]"} 2023-06-07T12:40:35.043+1000 INFO objects/preemption.go:580 Reserving node for ask after preemption {"allocationKey": "alloc3", "nodeID": "node2", "victimCount": 1} --- FAIL: TestTryAllocatePreemptNode (0.06s) application_test.go:1876: assertion failed: expression is false: alloc1.IsPreempted(): alloc1 should have been preempted
Attachments
Issue Links
- links to