Without this patch, the test case times out (the test case timeout is 200s), and you should see the following in the logs:
2017-08-03 11:58:04,094 INFO container.ContainerImpl (ContainerImpl.java:transition(1382)) - Relaunching Container [container_1501786677410_0001_01_000002] for re-initialization !!
2017-08-03 11:58:04,094 INFO container.ContainerImpl (ContainerImpl.java:handle(1691)) - Container container_1501786677410_0001_01_000002 transitioned from REINITIALIZING to SCHEDULED
2017-08-03 11:58:04,094 WARN scheduler.ContainerScheduler (ContainerScheduler.java:pickOpportunisticContainersToKill(384)) - There are no sufficient resources to start guaranteed [container_1501786677410_0001_01_000002]at the moment. Opportunistic containers are in the process ofbeing killed to make room.
With the patch, if the test still fails for you, it might be due to some other assertion failure, not a timeout, and you should not see the above call to pickOpportunisticContainersToKill() in the logs.
During container re-initialization, the container process is killed and re-launched. This transfers control back to the ContainerScheduler, which, after YARN-6706, always checks whether resources are available to launch the container, irrespective of whether queuing is turned on or off. Unfortunately, when the container was killed for re-initialization, we neglected to subtract (reclaim) the container's resources from the utilization tracker, so the aforementioned check fails on re-launch. This patch makes sure the resources are reclaimed.
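To make the accounting problem concrete, here is a minimal toy model of it in Java. It is only a sketch: the class ReinitAccountingSketch and its methods are hypothetical and do not correspond to the actual NodeManager classes or to the code in the patch. It assumes a single memory dimension and a tracker that refuses any launch that would exceed the node's capacity.

```java
// Toy model of the resource accounting described above.
// All names here are hypothetical; this is not the NodeManager code.
final class ReinitAccountingSketch {
    private final int capacityMb;   // total memory the node can hand out
    private int allocatedMb = 0;    // memory currently charged to containers

    ReinitAccountingSketch(int capacityMb) {
        this.capacityMb = capacityMb;
    }

    // Mirrors the "are resources available?" check: charge the container
    // only if it fits, otherwise refuse (the container would be queued).
    boolean tryLaunch(int requestMb) {
        if (allocatedMb + requestMb > capacityMb) {
            return false;
        }
        allocatedMb += requestMb;
        return true;
    }

    // Give a killed container's resources back to the tracker.
    void release(int requestMb) {
        allocatedMb = Math.max(0, allocatedMb - requestMb);
    }

    int allocatedMb() {
        return allocatedMb;
    }

    // Kill and re-launch a running container. With reclaimOnKill == false
    // the old charge is left in place, so the container competes with
    // its own stale allocation on re-launch.
    static boolean relaunch(ReinitAccountingSketch tracker, int requestMb,
                            boolean reclaimOnKill) {
        if (reclaimOnKill) {
            tracker.release(requestMb);
        }
        return tracker.tryLaunch(requestMb);
    }
}
```

In this model, a container that occupies the whole node can never be re-launched when reclaimOnKill is false, because its previous allocation is still counted against it. With reclaimOnKill set to true, the re-launch succeeds and the allocation ends up unchanged.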