Description
Configuration.overlay() is not thread-safe and can throw a ConcurrentModificationException, because it iterates over a Properties object that another thread may modify at the same time.
private void overlay(Properties to, Properties from) {
  for (Entry<Object, Object> entry: from.entrySet()) {
    to.put(entry.getKey(), entry.getValue());
  }
}
The Properties class is thread-safe, but its iterators are not. We should manually synchronize on the returned set of entries that we iterate over.
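A minimal sketch of what a synchronized version of the method could look like. It is an illustration only, not the actual patch. The class name OverlaySketch and the main method are hypothetical. The sketch locks the source Properties object itself rather than the entry set, because the mutating methods of Properties (put, setProperty) are synchronized on the Properties instance, so holding that monitor keeps writers out while the iteration runs.

```java
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-alone class; in Hadoop the method lives in Configuration.
public class OverlaySketch {

    // Copies every entry of `from` into `to`. The monitor of `from` is held
    // for the whole iteration, so synchronized writers on `from` must wait.
    static void overlay(Properties to, Properties from) {
        synchronized (from) {
            for (Map.Entry<Object, Object> entry : from.entrySet()) {
                to.put(entry.getKey(), entry.getValue());
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Properties from = new Properties();
        for (int i = 0; i < 1000; i++) {
            from.setProperty("key" + i, "value" + i);
        }
        Properties to = new Properties();

        // A second thread keeps modifying `from` while the main thread overlays it.
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 10000; i++) {
                from.setProperty("extra" + i, "value" + i);
            }
        });
        writer.start();
        for (int i = 0; i < 100; i++) {
            overlay(to, from);
        }
        writer.join();

        // One last pass after the writer has finished picks up any late entries.
        overlay(to, from);
        System.out.println("copied " + to.size() + " of " + from.size() + " entries");
    }
}
```

The cost of this approach is that writers to `from` block for the duration of the copy. Any code path that iterates over the same Properties object without taking the same lock would still be exposed to the exception.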
We encountered a ResourceManager failure during recovery caused by a ConcurrentModificationException:
2018-10-12 08:00:56,968 INFO org.apache.hadoop.service.AbstractService: Service ResourceManager failed in state STARTED; cause: java.util.ConcurrentModificationException
java.util.ConcurrentModificationException
    at java.util.Hashtable$Enumerator.next(Hashtable.java:1383)
    at org.apache.hadoop.conf.Configuration.overlay(Configuration.java:2801)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2696)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2632)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2528)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:1062)
    at org.apache.hadoop.conf.Configuration.getStringCollection(Configuration.java:1914)
    at org.apache.hadoop.security.alias.CredentialProviderFactory.getProviders(CredentialProviderFactory.java:53)
    at org.apache.hadoop.conf.Configuration.getPasswordFromCredentialProviders(Configuration.java:2043)
    at org.apache.hadoop.conf.Configuration.getPassword(Configuration.java:2023)
    at org.apache.hadoop.yarn.webapp.util.WebAppUtils.getPassword(WebAppUtils.java:452)
    at org.apache.hadoop.yarn.webapp.util.WebAppUtils.loadSslConfiguration(WebAppUtils.java:428)
    at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:293)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startWepApp(ResourceManager.java:1017)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1117)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1251)
2018-10-12 08:00:56,968 INFO org.apache.hadoop.yarn.server.resourcemanager.security.RMDelegationTokenSecretManager: removing RMDelegation token with sequence number: 3489914
2018-10-12 08:00:56,968 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Removing RMDelegationToken and SequenceNumber
2018-10-12 08:00:56,968 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore: Removing RMDelegationToken_3489914
2018-10-12 08:00:56,969 INFO org.apache.hadoop.ipc.Server: Stopping server on 8032