Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- 2.7.6
- None
Description
java.lang.RuntimeException: START Host request submission failed: java.lang.RuntimeException: Update Host request submission failed: java.util.ConcurrentModificationException
    at org.apache.ambari.server.topology.AmbariContext.startHost(AmbariContext.java:497)
    at org.apache.ambari.server.topology.ClusterTopologyImpl.startHost(ClusterTopologyImpl.java:268)
    at org.apache.ambari.server.topology.tasks.StartHostTask.runTask(StartHostTask.java:51)
    at org.apache.ambari.server.topology.tasks.TopologyHostTask.run(TopologyHostTask.java:55)
    at org.apache.ambari.server.topology.HostOfferResponse$1.run(HostOfferResponse.java:85)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Update Host request submission failed: java.util.ConcurrentModificationException
    at org.apache.ambari.server.controller.internal.HostComponentResourceProvider$4.invoke(HostComponentResourceProvider.java:865)
    at org.apache.ambari.server.controller.internal.HostComponentResourceProvider$4.invoke(HostComponentResourceProvider.java:852)
    at org.apache.ambari.server.controller.internal.AbstractResourceProvider.invokeWithRetry(AbstractResourceProvider.java:465)
    at org.apache.ambari.server.controller.internal.AbstractResourceProvider.modifyResources(AbstractResourceProvider.java:346)
    at org.apache.ambari.server.controller.internal.HostComponentResourceProvider.doUpdateResources(HostComponentResourceProvider.java:852)
    at org.apache.ambari.server.controller.internal.HostComponentResourceProvider.start(HostComponentResourceProvider.java:492)
    at org.apache.ambari.server.topology.AmbariContext.startHost(AmbariContext.java:494)
    at org.apache.ambari.server.topology.ClusterTopologyImpl.startHost(ClusterTopologyImpl.java:268)
    at org.apache.ambari.server.topology.tasks.StartHostTask.runTask(StartHostTask.java:51)
    at org.apache.ambari.server.topology.tasks.TopologyHostTask.run(TopologyHostTask.java:55)
    at org.apache.ambari.server.topology.HostOfferResponse$1.run(HostOfferResponse.java:85)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.ConcurrentModificationException: NA
    at java.util.HashMap$HashIterator.nextNode(HashMap.java:1445)
    at java.util.HashMap$EntryIterator.next(HashMap.java:1479)
    at java.util.HashMap$EntryIterator.next(HashMap.java:1477)
    at java.util.HashMap.putMapEntries(HashMap.java:512)
    at java.util.HashMap.<init>(HashMap.java:490)
    at org.apache.ambari.server.topology.HostRequest.getPhysicalTaskMapping(HostRequest.java:458)
    at org.apache.ambari.server.topology.LogicalRequest.getStageSummaries(LogicalRequest.java:286)
    at org.apache.ambari.server.topology.TopologyManager.getPendingHostComponents(TopologyManager.java:823)
    at org.apache.ambari.server.utils.StageUtils.getClusterHostInfo(StageUtils.java:306)
    at org.apache.ambari.server.controller.AmbariManagementControllerImpl.doStageCreation(AmbariManagementControllerImpl.java:2788)
    at org.apache.ambari.server.controller.AmbariManagementControllerImpl.addStages(AmbariManagementControllerImpl.java:3513)
    at org.apache.ambari.server.controller.internal.HostComponentResourceProvider.updateHostComponents(HostComponentResourceProvider.java:707)
    at org.apache.ambari.server.controller.internal.HostComponentResourceProvider$4.invoke(HostComponentResourceProvider.java:857)
    at org.apache.ambari.server.controller.internal.HostComponentResourceProvider$4.invoke(HostComponentResourceProvider.java:852)
    at org.apache.ambari.server.controller.internal.AbstractResourceProvider.invokeWithRetry(AbstractResourceProvider.java:465)
    at org.apache.ambari.server.controller.internal.AbstractResourceProvider.modifyResources(AbstractResourceProvider.java:346)
    at org.apache.ambari.server.controller.internal.HostComponentResourceProvider.doUpdateResources(HostComponentResourceProvider.java:852)
    at org.apache.ambari.server.controller.internal.HostComponentResourceProvider.start(HostComponentResourceProvider.java:492)
    at org.apache.ambari.server.topology.AmbariContext.startHost(AmbariContext.java:494)
    at org.apache.ambari.server.topology.ClusterTopologyImpl.startHost(ClusterTopologyImpl.java:268)
    at org.apache.ambari.server.topology.tasks.StartHostTask.runTask(StartHostTask.java:51)
    at org.apache.ambari.server.topology.tasks.TopologyHostTask.run(TopologyHostTask.java:55)
    at org.apache.ambari.server.topology.HostOfferResponse$1.run(HostOfferResponse.java:85)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
My teammate ramkrishna analyzed this one by adding logs and latches. The finding: although installation and registration run in parallel, each thread tries to get the entire cluster's view of the current physical tasks. So while one thread is handling a registration, another thread can call getPhysicalTaskMapping(), which copies the task map into a new HashMap (HostRequest.java:458 in the trace above). If the map changes during that copy, the iterator throws a ConcurrentModificationException (CME).
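The race described above can be sketched in isolation. The following is a minimal illustration, not the actual Ambari fix: the class TaskMappingSketch, the method registerPhysicalTask, and the helper runConcurrently are hypothetical names, and only getPhysicalTaskMapping() mirrors a method named in the trace. It assumes the shared map can be swapped for a ConcurrentHashMap, whose iterators are weakly consistent and do not throw ConcurrentModificationException, so the snapshot copy stays safe while another thread writes.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a host request that tracks logical -> physical task ids.
class TaskMappingSketch {
    // Assumption: a ConcurrentHashMap replaces the plain HashMap that was being
    // copied while another thread modified it.
    private final Map<Long, Long> physicalTasks = new ConcurrentHashMap<>();

    // Writer side (hypothetical): invoked while a host is being registered.
    void registerPhysicalTask(long logicalId, long physicalId) {
        physicalTasks.put(logicalId, physicalId);
    }

    // Reader side: returns a snapshot copy. Copying a ConcurrentHashMap does not
    // throw ConcurrentModificationException even if writers are active.
    Map<Long, Long> getPhysicalTaskMapping() {
        return new HashMap<>(physicalTasks);
    }

    // Runs one writer and one reader in parallel. Returns true if no
    // ConcurrentModificationException was seen and every write is visible at the end.
    static boolean runConcurrently(int writes) {
        TaskMappingSketch request = new TaskMappingSketch();
        AtomicBoolean sawCme = new AtomicBoolean(false);

        Thread writer = new Thread(() -> {
            for (long i = 0; i < writes; i++) {
                request.registerPhysicalTask(i, i + 1000);
            }
        });
        Thread reader = new Thread(() -> {
            try {
                // Keep taking snapshots for as long as the writer is running.
                while (writer.isAlive()) {
                    request.getPhysicalTaskMapping();
                }
            } catch (ConcurrentModificationException e) {
                sawCme.set(true);
            }
        });

        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !sawCme.get() && request.getPhysicalTaskMapping().size() == writes;
    }

    public static void main(String[] args) {
        System.out.println("completed without CME: " + runConcurrently(20000));
    }
}
```

An alternative under the same assumptions would be to guard both the writes and the copy with a common lock. That gives a fully consistent snapshot at the cost of blocking writers during each copy, whereas the ConcurrentHashMap variant only guarantees that the copy does not fail.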