Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.3.0
    • Fix Version/s: 3.3.0
    • Component/s: resourcemanager
    • Labels: None

    Description

      The RM fails to start with the exception below when FileSystemBasedConfigurationProvider is used.

      Exception:

      2019-08-16 12:05:33,802 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting ResourceManager
      org.apache.hadoop.service.ServiceStateException: java.io.IOException: java.io.IOException: Filesystem closed
              at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
              at org.apache.hadoop.service.AbstractService.init(AbstractService.java:173)
              at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:109)
              at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:868)
              at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
              at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1281)
              at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.reinitialize(ResourceManager.java:1312)
              at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1335)
              at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1328)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:422)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
              at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1328)
              at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1379)
              at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
              at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1567)
      Caused by: java.io.IOException: java.io.IOException: Filesystem closed
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.FileBasedCSConfigurationProvider.loadConfiguration(FileBasedCSConfigurationProvider.java:64)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:346)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:445)
              at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
              ... 14 more
      Caused by: java.io.IOException: Filesystem closed
              at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:475)
              at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1682)
              at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1586)
              at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1583)
              at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
              at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1598)
              at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1701)
              at org.apache.hadoop.yarn.FileSystemBasedConfigurationProvider.getConfigurationInputStream(FileSystemBasedConfigurationProvider.java:62)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.FileBasedCSConfigurationProvider.loadConfiguration(FileBasedCSConfigurationProvider.java:56)
      

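      FileSystemBasedConfigurationProvider uses the cached FileSystem instance, which causes the issue: the cached instance is shared, so once it has been closed, every later call on it (here FileSystem.exists) fails with "Filesystem closed".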
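
      The failure mode can be reproduced in miniature without Hadoop. The sketch below is a toy model only, not the RM code and not the attached patch: ToyFileSystem, get, newInstance and survivesForeignClose are hypothetical names chosen for illustration. In Hadoop itself, the rough counterparts are FileSystem.get (returns a cached, shared instance), FileSystem.newInstance (returns a private one) and the fs.<scheme>.impl.disable.cache setting.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Toy model of a process-wide file system cache; this is not Hadoop code. */
public class FsCacheDemo {

  /** Hypothetical stand-in for a file system client that can be closed. */
  static final class ToyFileSystem {
    private boolean open = true;

    boolean exists(String path) throws IOException {
      if (!open) {
        // Mirrors the message seen in the stack trace.
        throw new IOException("Filesystem closed");
      }
      // Pretend every path under /yarn/conf/ exists.
      return path.startsWith("/yarn/conf/");
    }

    void close() {
      open = false;
    }
  }

  /** Cache keyed by scheme: every caller of get() shares one instance. */
  private static final Map<String, ToyFileSystem> CACHE = new HashMap<>();

  static synchronized ToyFileSystem get(String scheme) {
    return CACHE.computeIfAbsent(scheme, s -> new ToyFileSystem());
  }

  /** Bypasses the cache and returns an instance owned by the caller. */
  static ToyFileSystem newInstance() {
    return new ToyFileSystem();
  }

  /**
   * Simulates another component closing its cached handle, then checks
   * whether the provider's handle can still be used.
   */
  static boolean survivesForeignClose(ToyFileSystem providerFs) {
    ToyFileSystem other = get("hdfs"); // some unrelated component
    other.close();
    try {
      return providerFs.exists("/yarn/conf/capacity-scheduler.xml");
    } catch (IOException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("cached handle survives:  " + survivesForeignClose(get("hdfs")));
    System.out.println("private handle survives: " + survivesForeignClose(newInstance()));
  }
}
```

      In this model the provider holding the cached handle fails as soon as anyone else closes it, while a provider holding a private handle is unaffected. Under that assumption, giving the provider its own uncached instance is one way to avoid the problem.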

      Configs:

      <property><name>yarn.resourcemanager.configuration.provider-class</name><value>org.apache.hadoop.yarn.FileSystemBasedConfigurationProvider</value></property>
      <property><name>yarn.resourcemanager.configuration.file-system-based-store</name><value>/yarn/conf</value></property>
      
      [yarn@yarndocker-1 yarn]$ hadoop fs -ls /yarn/conf
      -rw-r--r--   3 yarn supergroup       4138 2019-08-16 13:09 /yarn/conf/capacity-scheduler.xml
      -rw-r--r--   3 yarn supergroup        494 2019-08-16 11:41 /yarn/conf/core-site.xml
      -rw-r--r--   3 yarn supergroup      11392 2019-08-16 11:52 /yarn/conf/hadoop-policy.xml
      -rw-r--r--   3 yarn supergroup      11492 2019-08-16 11:41 /yarn/conf/yarn-site.xml
      
      

      Attachments

        1. YARN-9755-001.patch
          2 kB
          Prabhu Joseph
        2. YARN-9755-002.patch
          5 kB
          Prabhu Joseph
        3. YARN-9755-003.patch
          8 kB
          Prabhu Joseph
        4. YARN-9755-004.patch
          8 kB
          Prabhu Joseph

          People

            Assignee: prabhujoseph Prabhu Joseph
            Reporter: prabhujoseph Prabhu Joseph
            Votes: 0
            Watchers: 4