Uploaded image for project: 'Apache YuniKorn'
  1. Apache YuniKorn
  2. YUNIKORN-1790

TestTryAllocatePreemptNode: flaky test

    XMLWordPrintableJSON

Details

    Description

      The TestTryAllocatePreemptNode fails sometimes with a different allocation being preempted than the expected alloc1:

      2023-06-07T12:40:35.035+1000	INFO	objects/queue.go:139	configured queue added to scheduler	{"queueName": "root"}
      2023-06-07T12:40:35.035+1000	INFO	objects/queue.go:1190	updating root queue max resources	{"current max": "nil resource", "new max": "map[first:40]"}
      2023-06-07T12:40:35.035+1000	DEBUG	objects/queue.go:850	new parent queue inheriting template from parent queue	{"child queue": "root.parent", "parent queue": "root"}
      2023-06-07T12:40:35.035+1000	INFO	objects/queue.go:139	configured queue added to scheduler	{"queueName": "root.parent"}
      2023-06-07T12:40:35.036+1000	DEBUG	objects/queue.go:343	max resources setting ignored: cannot set zero max resources
      2023-06-07T12:40:35.036+1000	DEBUG	objects/queue.go:350	guaranteed resources setting ignored: cannot set zero max resources
      2023-06-07T12:40:35.036+1000	INFO	objects/queue.go:139	configured queue added to scheduler	{"queueName": "root.unlimited"}
      2023-06-07T12:40:35.036+1000	DEBUG	objects/queue.go:343	max resources setting ignored: cannot set zero max resources2023-06-07T12:40:35.036+1000	INFO	objects/queue.go:139	configured queue added to scheduler	{"queueName": "root.parent.child1"}
      2023-06-07T12:40:35.036+1000	DEBUG	objects/queue.go:343	max resources setting ignored: cannot set zero max resources
      2023-06-07T12:40:35.036+1000	INFO	objects/queue.go:139	configured queue added to scheduler	{"queueName": "root.parent.child2"}
      2023-06-07T12:40:35.036+1000	INFO	objects/application.go:184	Unknown gang scheduling style, using soft style as default	{"gang scheduling style": ""}
      2023-06-07T12:40:35.036+1000	DEBUG	objects/queue.go:1780	Unknown application	{"applicationID": "app-0"}
      2023-06-07T12:40:35.037+1000	INFO	objects/application_state.go:133	Application state transition	{"appID": "app-0", "source": "New", "destination": "Accepted", "event": "runApplication"}
      2023-06-07T12:40:35.037+1000	INFO	objects/application.go:668	ask added successfully to application	{"appID": "app-0", "ask": "alloc0-0", "placeholder": false, "pendingDelta": "map[first:11]"}
      2023-06-07T12:40:35.037+1000	INFO	objects/application.go:668	ask added successfully to application	{"appID": "app-0", "ask": "alloc0-1", "placeholder": false, "pendingDelta": "map[first:11]"}
      2023-06-07T12:40:35.037+1000	INFO	objects/application.go:184	Unknown gang scheduling style, using soft style as default	{"gang scheduling style": ""}
      2023-06-07T12:40:35.037+1000	DEBUG	objects/queue.go:1780	Unknown application	{"applicationID": "app-1"}
      2023-06-07T12:40:35.037+1000	INFO	objects/application_state.go:133	Application state transition	{"appID": "app-1", "source": "New", "destination": "Accepted", "event": "runApplication"}
      2023-06-07T12:40:35.037+1000	INFO	objects/application.go:668	ask added successfully to application	{"appID": "app-1", "ask": "alloc1", "placeholder": false, "pendingDelta": "map[first:5]"}
      2023-06-07T12:40:35.037+1000	INFO	objects/application.go:668	ask added successfully to application	{"appID": "app-1", "ask": "alloc2", "placeholder": false, "pendingDelta": "map[first:5]"}
      2023-06-07T12:40:35.038+1000	INFO	objects/application.go:184	Unknown gang scheduling style, using soft style as default	{"gang scheduling style": ""}
      2023-06-07T12:40:35.038+1000	DEBUG	objects/queue.go:1780	Unknown application	{"applicationID": "app-2"}
      2023-06-07T12:40:35.038+1000	INFO	objects/application_state.go:133	Application state transition	{"appID": "app-2", "source": "New", "destination": "Accepted", "event": "runApplication"}
      2023-06-07T12:40:35.038+1000	INFO	objects/application.go:668	ask added successfully to application	{"appID": "app-2", "ask": "alloc3", "placeholder": false, "pendingDelta": "map[first:5]"}
      2023-06-07T12:40:35.038+1000	DEBUG	objects/application.go:339	Application state timer initiated	{"appID": "app-0", "state": "Starting", "timeout": "5m0s"}
      2023-06-07T12:40:35.038+1000	INFO	objects/application_state.go:133	Application state transition	{"appID": "app-0", "source": "Accepted", "destination": "Starting", "event": "runApplication"}
      2023-06-07T12:40:35.038+1000	DEBUG	ugm/manager.go:63	Increasing resource usage	{"user": "testuser", "queue path": "root.unlimited", "application": "app-0", "resource": "map[first:11]"}
      2023-06-07T12:40:35.038+1000	DEBUG	ugm/queue_tracker.go:57	Increasing resource usage	{"queue path": "root.unlimited", "application": "app-0", "resource": "map[first:11]"}
      2023-06-07T12:40:35.038+1000	DEBUG	ugm/queue_tracker.go:45	Creating queue tracker object for queue	{"queue": "unlimited"}
      2023-06-07T12:40:35.038+1000	DEBUG	ugm/queue_tracker.go:57	Increasing resource usage	{"queue path": "unlimited", "application": "app-0", "resource": "map[first:11]"}
      2023-06-07T12:40:35.038+1000	DEBUG	ugm/manager.go:265	Group tracker already exists and linking (reusing) the same with application	{"application": "app-0", "queue path": "root.unlimited", "user": "testuser", "group": "testgroup"}
      2023-06-07T12:40:35.039+1000	DEBUG	ugm/queue_tracker.go:57	Increasing resource usage	{"queue path": "root.unlimited", "application": "app-0", "resource": "map[first:11]"}
      2023-06-07T12:40:35.039+1000	DEBUG	ugm/queue_tracker.go:45	Creating queue tracker object for queue	{"queue": "unlimited"}
      2023-06-07T12:40:35.039+1000	DEBUG	ugm/queue_tracker.go:57	Increasing resource usage	{"queue path": "unlimited", "application": "app-0", "resource": "map[first:11]"}
      2023-06-07T12:40:35.039+1000	DEBUG	objects/application.go:388	Application state timer cleared	{"appID": "app-0", "state": "Starting"}
      2023-06-07T12:40:35.039+1000	INFO	objects/application_state.go:133	Application state transition	{"appID": "app-0", "source": "Starting", "destination": "Running", "event": "runApplication"}
      2023-06-07T12:40:35.039+1000	DEBUG	ugm/manager.go:63	Increasing resource usage	{"user": "testuser", "queue path": "root.unlimited", "application": "app-0", "resource": "map[first:11]"}
      2023-06-07T12:40:35.039+1000	DEBUG	ugm/queue_tracker.go:57	Increasing resource usage	{"queue path": "root.unlimited", "application": "app-0", "resource": "map[first:11]"}
      2023-06-07T12:40:35.039+1000	DEBUG	ugm/queue_tracker.go:57	Increasing resource usage	{"queue path": "unlimited", "application": "app-0", "resource": "map[first:11]"}
      2023-06-07T12:40:35.039+1000	DEBUG	ugm/queue_tracker.go:57	Increasing resource usage	{"queue path": "root.unlimited", "application": "app-0", "resource": "map[first:11]"}
      2023-06-07T12:40:35.039+1000	DEBUG	ugm/queue_tracker.go:57	Increasing resource usage	{"queue path": "unlimited", "application": "app-0", "resource": "map[first:11]"}
      2023-06-07T12:40:35.040+1000	DEBUG	objects/application.go:339	Application state timer initiated	{"appID": "app-1", "state": "Starting", "timeout": "5m0s"}
      2023-06-07T12:40:35.040+1000	INFO	objects/application_state.go:133	Application state transition	{"appID": "app-1", "source": "Accepted", "destination": "Starting", "event": "runApplication"}
      2023-06-07T12:40:35.040+1000	DEBUG	ugm/manager.go:63	Increasing resource usage	{"user": "testuser", "queue path": "root.parent.child1", "application": "app-1", "resource": "map[first:5]"}
      2023-06-07T12:40:35.040+1000	DEBUG	ugm/queue_tracker.go:57	Increasing resource usage	{"queue path": "root.parent.child1", "application": "app-1", "resource": "map[first:5]"}
      2023-06-07T12:40:35.040+1000	DEBUG	ugm/queue_tracker.go:57	Increasing resource usage	{"queue path": "parent.child1", "application": "app-1", "resource": "map[first:5]"}
      2023-06-07T12:40:35.040+1000	DEBUG	ugm/queue_tracker.go:57	Increasing resource usage	{"queue path": "child1", "application": "app-1", "resource": "map[first:5]"}
      2023-06-07T12:40:35.040+1000	DEBUG	ugm/queue_tracker.go:57	Increasing resource usage	{"queue path": "root.parent.child1", "application": "app-1", "resource": "map[first:5]"}
      2023-06-07T12:40:35.040+1000	DEBUG	ugm/queue_tracker.go:57	Increasing resource usage	{"queue path": "parent.child1", "application": "app-1", "resource": "map[first:5]"}
      2023-06-07T12:40:35.040+1000	DEBUG	ugm/queue_tracker.go:57	Increasing resource usage	{"queue path": "child1", "application": "app-1", "resource": "map[first:5]"}
      2023-06-07T12:40:35.041+1000	DEBUG	objects/application.go:388	Application state timer cleared	{"appID": "app-1", "state": "Starting"}
      2023-06-07T12:40:35.041+1000	INFO	objects/application_state.go:133	Application state transition	{"appID": "app-1", "source": "Starting", "destination": "Running", "event": "runApplication"}
      2023-06-07T12:40:35.041+1000	DEBUG	ugm/manager.go:63	Increasing resource usage	{"user": "testuser", "queue path": "root.parent.child1", "application": "app-1", "resource": "map[first:5]"}
      2023-06-07T12:40:35.041+1000	DEBUG	ugm/queue_tracker.go:57	Increasing resource usage	{"queue path": "root.parent.child1", "application": "app-1", "resource": "map[first:5]"}
      2023-06-07T12:40:35.041+1000	DEBUG	ugm/queue_tracker.go:57	Increasing resource usage	{"queue path": "parent.child1", "application": "app-1", "resource": "map[first:5]"}
      2023-06-07T12:40:35.041+1000	DEBUG	ugm/queue_tracker.go:57	Increasing resource usage	{"queue path": "child1", "application": "app-1", "resource": "map[first:5]"}
      2023-06-07T12:40:35.041+1000	DEBUG	ugm/queue_tracker.go:57	Increasing resource usage	{"queue path": "root.parent.child1", "application": "app-1", "resource": "map[first:5]"}
      2023-06-07T12:40:35.041+1000	DEBUG	ugm/queue_tracker.go:57	Increasing resource usage	{"queue path": "parent.child1", "application": "app-1", "resource": "map[first:5]"}
      2023-06-07T12:40:35.041+1000	DEBUG	ugm/queue_tracker.go:57	Increasing resource usage	{"queue path": "child1", "application": "app-1", "resource": "map[first:5]"}
      2023-06-07T12:40:35.041+1000	DEBUG	objects/application.go:1454	app reservation check	{"allocationKey": "alloc3", "createTime": "2023-06-07T12:40:25.038+1000", "askAge": "10.003814459s", "reservationDelay": "2s"}
      2023-06-07T12:40:35.042+1000	DEBUG	objects/application.go:1454	app reservation check	{"allocationKey": "alloc3", "createTime": "2023-06-07T12:40:25.038+1000", "askAge": "10.003914s", "reservationDelay": "2s"}
      2023-06-07T12:40:35.042+1000	DEBUG	objects/application.go:1470	found candidate node for app reservation	{"appID": "app-2", "nodeID": "node1", "allocationKey": "alloc3", "reservations": 0, "pendingRepeats": 1}
      2023-06-07T12:40:35.042+1000	DEBUG	objects/application.go:1454	app reservation check	{"allocationKey": "alloc3", "createTime": "2023-06-07T12:39:55.038+1000", "askAge": "40.004148584s", "reservationDelay": "2s"}
      2023-06-07T12:40:35.042+1000	DEBUG	objects/application.go:1454	app reservation check	{"allocationKey": "alloc3", "createTime": "2023-06-07T12:39:55.038+1000", "askAge": "40.004227875s", "reservationDelay": "2s"}
      2023-06-07T12:40:35.042+1000	DEBUG	objects/preemption.go:352	No RM callback plugin registered, using first selected node for preemption	{"NodeID": "node2", "AllocationKey": "alloc3"}
      2023-06-07T12:40:35.043+1000	ERROR	objects/preemption.go:517	BUG: Didn't find instance type in the nodeInstanceTypeMap	{"nodeId": "node2"}
      github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Preemptor).tryNodes
      	/Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/preemption.go:517
      github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Preemptor).TryPreemption
      	/Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/preemption.go:537
      github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).tryPreemption
      	/Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/application.go:1327
      github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).tryAllocate
      	/Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/application.go:1050
      github.com/apache/yunikorn-core/pkg/scheduler/objects.TestTryAllocatePreemptNode
      	/Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/application_test.go:1874
      testing.tRunner	/usr/local/go/src/testing/testing.go:1576
      2023-06-07T12:40:35.043+1000	INFO	objects/preemption.go:559	Preempting task	{"applicationID": "app-1", "allocationKey": "alloc2", "nodeID": "node2", "resources": "map[first:5]"}
      2023-06-07T12:40:35.043+1000	INFO	objects/preemption.go:580	Reserving node for ask after preemption	{"allocationKey": "alloc3", "nodeID": "node2", "victimCount": 1}
      --- FAIL: TestTryAllocatePreemptNode (0.06s)
          application_test.go:1876: assertion failed: expression is false: alloc1.IsPreempted(): alloc1 should have been preempted 

      Attachments

        Issue Links

          Activity

            People

              zhuqi Qi Zhu
              wilfreds Wilfred Spiegelenburg
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: