Hadoop YARN / YARN-3962

If the NodeManager identity is changed to run as a virtual account, the resource localization service fails to start because of an incorrect-permissions check

Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.6.0
    • Fix Version/s: None
    • Component/s: nodemanager

    Description

      For Azure HDInsight, we need to run the NodeManager as a virtual account instead of a user account. Otherwise, after an Azure reimage, it won't be able to access the map output data of jobs running on that node. But when we changed the NodeManager to run as a virtual account, we got this error:
      2015-06-02 06:11:45,281 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Writing credentials to the nmPrivate file c:/apps1/temp/hdfs/nm-local-dir/nmPrivate/container_1433128260970_0007_01_000001.tokens. Credentials list:
      2015-06-02 06:11:45,313 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Permissions incorrectly set for dir c:/apps1/temp/hdfs/nm-local-dir/usercache, should be rwxr-xr-x, actual value = rwxrwxr-x
      2015-06-02 06:11:45,313 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Attempting to initialize c:/apps1/temp/hdfs/nm-local-dir
      2015-06-02 06:11:45,375 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Permissions incorrectly set for dir c:/apps1/temp/hdfs/nm-local-dir/usercache, should be rwxr-xr-x, actual value = rwxrwxr-x
      2015-06-02 06:11:45,375 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Failed to setup local dir c:/apps1/temp/hdfs/nm-local-dir, which was marked as good.
      org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Permissions incorrectly set for dir c:/apps1/temp/hdfs/nm-local-dir/usercache, should be rwxr-xr-x, actual value = rwxrwxr-x
      at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.checkLocalDir(ResourceLocalizationService.java:1400)
      at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.getInitializedLocalDirs(ResourceLocalizationService.java:1367)
      at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.access$900(ResourceLocalizationService.java:137)
      at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1085)
      2015-06-02 06:11:45,375 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Localizer failed
      org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to setup local dir c:/apps1/temp/hdfs/nm-local-dir, which was marked as good.
      at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.getInitializedLocalDirs(ResourceLocalizationService.java:1372)
      at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.access$900(ResourceLocalizationService.java:137)
      at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1085)
      Caused by: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Permissions incorrectly set for dir c:/apps1/temp/hdfs/nm-local-dir/usercache, should be rwxr-xr-x, actual value = rwxrwxr-x
      at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.checkLocalDir(ResourceLocalizationService.java:1400)
      at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.getInitializedLocalDirs(ResourceLocalizationService.java:1367)
      Fix - When the NodeManager runs as a virtual account, the resource localization service fails to come up. It checks that the permissions of usercache and filecache are 755 and that those of nmPrivate are 700. But on Windows, for a virtual account, the owner and group are the same, so this permission check fails. The patch therefore adds a check: if the user is equal to the group, the umask validation does not apply.
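
The relaxed check described in the fix can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the attached patches: the class name `LocalDirPermissionCheck`, the method `permissionsAcceptable`, and the use of plain permission strings (instead of Hadoop's own permission types) are all assumptions made for the example.

```java
/**
 * Hypothetical sketch of the relaxed local-dir permission check.
 * Names and types here are illustrative; the real patch may differ.
 */
class LocalDirPermissionCheck {

    /**
     * Returns true if the directory should pass validation.
     *
     * @param expected permission string the service expects, e.g. "rwxr-xr-x"
     * @param actual   permission string reported for the directory
     * @param owner    owner of the directory
     * @param group    group of the directory
     */
    static boolean permissionsAcceptable(String expected, String actual,
                                         String owner, String group) {
        // Normal case: the permissions match exactly.
        if (expected.equals(actual)) {
            return true;
        }
        // Relaxation described in the report: when owner and group are the
        // same identity (as with a Windows virtual account), the mismatch
        // is not treated as an error.
        return owner != null && owner.equals(group);
    }

    public static void main(String[] args) {
        // Values taken from the log above; the account names are made up.
        System.out.println(permissionsAcceptable(
                "rwxr-xr-x", "rwxrwxr-x", "nm-account", "nm-account"));
        System.out.println(permissionsAcceptable(
                "rwxr-xr-x", "rwxrwxr-x", "alice", "users"));
    }
}
```

With the values from the log (`rwxr-xr-x` expected, `rwxrwxr-x` actual), the first call passes because owner and group are identical, while the second still fails as before. Skipping the comparison entirely is only one reading of the description; a stricter variant could ignore just the group bits.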

      Attachments

        1. YARN-3962.003.patch
          4 kB
          Íñigo Goiri
        2. YARN-3962-002.patch
          3 kB
          Brahma Reddy Battula
        3. Yarn-3962.001.patch
          8 kB
          madhumita chakraborty

          People

            Assignee: Unassigned
            Reporter: madhumita chakraborty (madhuch-ms)
            Votes: 0
            Watchers: 11
