Description
The cause is that the RMContainerImpl instance created for a reserved container loses its node label expression. When the scheduler reserves a container for a request with a non-default node label, that instance is wrongly added to LeafQueue#ignorePartitionExclusivityRMContainers and is never removed.
To reproduce this memory leak:
(1) Create a reserved container
RegularContainerAllocator#doAllocation: creates RMContainerImpl instanceA (nodeLabelExpression="")
LeafQueue#allocateResource: puts RMContainerImpl instanceA into LeafQueue#ignorePartitionExclusivityRMContainers
(2) Allocate from the reserved container
RegularContainerAllocator#doAllocation: creates RMContainerImpl instanceB (nodeLabelExpression="test-label")
(3) From now on, RMContainerImpl instanceA stays in memory (it is still referenced by LeafQueue#ignorePartitionExclusivityRMContainers) until the RM is restarted
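
The steps above can be illustrated with a small standalone model. This is only a sketch and not the actual Hadoop code: the class ReservedContainerLeakSketch, its nested Container and Queue types, and the add/remove conditions are simplified assumptions that mimic how a queue might track containers whose node label expression is empty while they run on a labeled partition. In particular, the release condition is assumed to mirror the allocate condition.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of the leak; none of these types are Hadoop classes.
public class ReservedContainerLeakSketch {
    static final String NO_LABEL = "";

    // Stand-in for RMContainerImpl; equality is by object identity.
    static final class Container {
        final String id;
        final String nodeLabelExpression;

        Container(String id, String nodeLabelExpression) {
            this.id = id;
            this.nodeLabelExpression = nodeLabelExpression;
        }
    }

    // Stand-in for LeafQueue and its ignorePartitionExclusivityRMContainers map.
    static final class Queue {
        final Map<String, Set<Container>> ignorePartitionExclusivity = new HashMap<>();

        // Assumed rule: track containers with no label that land on a labeled partition.
        void allocateResource(String nodePartition, Container c) {
            if (!nodePartition.equals(NO_LABEL) && c.nodeLabelExpression.equals(NO_LABEL)) {
                ignorePartitionExclusivity
                    .computeIfAbsent(nodePartition, k -> new HashSet<>())
                    .add(c);
            }
        }

        // Assumed rule: removal uses the same condition, evaluated on the released instance.
        void releaseResource(String nodePartition, Container c) {
            if (!nodePartition.equals(NO_LABEL) && c.nodeLabelExpression.equals(NO_LABEL)) {
                Set<Container> tracked = ignorePartitionExclusivity.get(nodePartition);
                if (tracked != null) {
                    tracked.remove(c);
                }
            }
        }

        int trackedCount() {
            int count = 0;
            for (Set<Container> tracked : ignorePartitionExclusivity.values()) {
                count += tracked.size();
            }
            return count;
        }
    }

    // Replays steps (1) to (3) and returns how many entries remain tracked.
    static int simulateLeak() {
        Queue queue = new Queue();

        // (1) Reservation: instanceA is created without its label and gets tracked.
        Container instanceA = new Container("container_1", NO_LABEL);
        queue.allocateResource("test-label", instanceA);

        // (2) Allocation from the reservation: a new instanceB carries the label.
        Container instanceB = new Container("container_1", "test-label");
        queue.allocateResource("test-label", instanceB);

        // Completion releases instanceB; its label is set, so nothing is removed.
        queue.releaseResource("test-label", instanceB);

        // (3) instanceA is still referenced by the map.
        return queue.trackedCount();
    }

    public static void main(String[] args) {
        System.out.println("entries still tracked: " + simulateLeak());
    }
}
```

In this model simulateLeak() returns 1: only instanceB is ever released, and because it is a different object from instanceA, the entry added during reservation has no code path that removes it.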