Details

    Description

      Since YUNIKORN-2233, the scheduler core can be stopped. Stopping it from the MockScheduler logs an error during shutdown:

      2023-12-21T17:58:49.203+0100	INFO	core.scheduler.ugm	ugm/manager.go:136	Removing user from manager	{"user": "testuser"}
      ...
      2023-12-21T17:58:59.209+0100	INFO	core.entrypoint	entrypoint/service_context.go:40	ServiceContext stop all services
      ...
      2023-12-21T17:58:59.211+0100	INFO	core.scheduler.partition	scheduler/partition_manager.go:144	marking all queues for removal	{"partitionName": "[rm:123]default"}
      2023-12-21T17:58:59.211+0100	INFO	core.scheduler.queue	objects/queue.go:952	marking managed queue for deletion	{"queue": "root"}
      2023-12-21T17:58:59.212+0100	INFO	core.scheduler.fsm	objects/object_state.go:81	object transition	{"object": "root", "source": "Active", "destination": "Draining", "event": "Remove"}
      2023-12-21T17:58:59.212+0100	INFO	core.scheduler.queue	objects/queue.go:952	marking managed queue for deletion	{"queue": "root.singleleaf"}
      2023-12-21T17:58:59.212+0100	INFO	core.scheduler.fsm	objects/object_state.go:81	object transition	{"object": "root.singleleaf", "source": "Active", "destination": "Draining", "event": "Remove"}
      2023-12-21T17:58:59.212+0100	INFO	core.scheduler.partition	scheduler/partition_manager.go:150	removing all applications from partition	{"numOfApps": 1, "partitionName": "[rm:123]default"}
      2023-12-21T17:58:59.212+0100	INFO	core.scheduler.application	objects/application.go:608	ask removed successfully from application	{"appID": "app-1", "ask": "", "pendingDelta": "map[memory:0 vcore:0]"}
      2023-12-21T17:58:59.212+0100	INFO	core.scheduler.queue	objects/queue.go:837	Application completed and removed from queue	{"queueName": "root.singleleaf", "applicationID": "app-1"}
      2023-12-21T17:59:32.848+0100	ERROR	core.scheduler.ugm	ugm/manager.go:118	user tracker must be available in userTrackers map	{"user": "testuser"}
      github.com/apache/yunikorn-core/pkg/scheduler/ugm.(*Manager).DecreaseTrackedResource
      	/home/bacskop/repos/yunikorn-core/pkg/scheduler/ugm/manager.go:118
      github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).decUserResourceUsage
      	/home/bacskop/repos/yunikorn-core/pkg/scheduler/objects/application.go:1654
      github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).RemoveAllAllocations
      	/home/bacskop/repos/yunikorn-core/pkg/scheduler/objects/application.go:1843
      github.com/apache/yunikorn-core/pkg/scheduler.(*PartitionContext).removeApplication
      	/home/bacskop/repos/yunikorn-core/pkg/scheduler/partition.go:388
      github.com/apache/yunikorn-core/pkg/scheduler.(*partitionManager).remove
      	/home/bacskop/repos/yunikorn-core/pkg/scheduler/partition_manager.go:156
      github.com/apache/yunikorn-core/pkg/scheduler.(*partitionManager).Stop
      	/home/bacskop/repos/yunikorn-core/pkg/scheduler/partition_manager.go:97
      github.com/apache/yunikorn-core/pkg/scheduler.(*ClusterContext).Stop
      	/home/bacskop/repos/yunikorn-core/pkg/scheduler/context.go:991
      github.com/apache/yunikorn-core/pkg/scheduler.(*Scheduler).Stop
      	/home/bacskop/repos/yunikorn-core/pkg/scheduler/scheduler.go:217
      github.com/apache/yunikorn-core/pkg/entrypoint.(*ServiceContext).StopAll
      	/home/bacskop/repos/yunikorn-core/pkg/entrypoint/service_context.go:50
      github.com/apache/yunikorn-core/pkg/scheduler/tests.(*mockScheduler).Stop
      	/home/bacskop/repos/yunikorn-core/pkg/scheduler/tests/mockscheduler_test.go:91
      github.com/apache/yunikorn-core/pkg/scheduler/tests.TestApplicationHistoryTracking
      	/home/bacskop/repos/yunikorn-core/pkg/scheduler/tests/application_tracking_test.go:172
      

      The problem is that the user tracker for "testuser" has already been removed from the manager (see the first log line), so it no longer exists when PartitionContext.removeApplication() is called. At this point the app is also in the Completed state, so there is no need to decrement any resources.
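
      One possible guard is sketched below: skip the user-tracking decrement when the application is already Completed. This is a minimal sketch, not the change made for this issue. The types and names (userManager, application, removeAllAllocations, appState) are hypothetical, simplified stand-ins and are not the YuniKorn types or API.

```go
package main

import "fmt"

// appState is a hypothetical, reduced application state (assumption for this sketch).
type appState int

const (
	stateRunning appState = iota
	stateCompleted
)

// userManager is a toy stand-in for the user/group manager:
// it maps a user name to that user's tracked resource usage.
type userManager struct {
	trackers map[string]int
}

// decrease mirrors the failing path from the log: it returns an error
// when no tracker exists for the user.
func (m *userManager) decrease(user string, amount int) error {
	used, ok := m.trackers[user]
	if !ok {
		return fmt.Errorf("user tracker must be available in userTrackers map: %s", user)
	}
	m.trackers[user] = used - amount
	return nil
}

// application is a hypothetical, reduced application object.
type application struct {
	user      string
	state     appState
	allocated int
}

// removeAllAllocations clears the application's allocations. For a
// Completed application it does not touch the user tracker, because the
// tracker may already have been removed and there is nothing to decrement.
func (a *application) removeAllAllocations(m *userManager) error {
	released := a.allocated
	a.allocated = 0
	if a.state == stateCompleted {
		return nil
	}
	return m.decrease(a.user, released)
}
```

      With this guard, removing a Completed application whose tracker is gone returns no error, while a running application with a missing tracker still reports one.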

          People

            yangpoan PoAn Yang
            pbacsko Peter Bacsko