Hadoop YARN / YARN-8775

TestDiskFailures.testLocalDirsFailures can fail intermittently on concurrent file modifications

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.3.0
    • Component/s: test, yarn
    • Hadoop Flags: Reviewed

    Description

      The test can fail intermittently when its file operations run concurrently with the disk health check performed by the monitoring thread in LocalDirsHandlerService.

      java.lang.AssertionError: NodeManager could not identify disk failure.
      	at org.junit.Assert.fail(Assert.java:88)
      	at org.junit.Assert.assertTrue(Assert.java:41)
      	at org.apache.hadoop.yarn.server.TestDiskFailures.verifyDisksHealth(TestDiskFailures.java:239)
      	at org.apache.hadoop.yarn.server.TestDiskFailures.testDirsFailures(TestDiskFailures.java:202)
      	at org.apache.hadoop.yarn.server.TestDiskFailures.testLocalDirsFailures(TestDiskFailures.java:99)
      
      Stderr
      
      
      2018-09-13 08:21:49,822 INFO [main] server.TestDiskFailures (TestDiskFailures.java:prepareDirToFail(277)) - Prepared /tmp/dist-test-taskjUrf0_/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/target/org.apache.hadoop.yarn.server.TestDiskFailures/org.apache.hadoop.yarn.server.TestDiskFailures-logDir-nm-0_1 to fail.
      2018-09-13 08:21:49,823 INFO [main] server.TestDiskFailures (TestDiskFailures.java:prepareDirToFail(277)) - Prepared /tmp/dist-test-taskjUrf0_/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/target/org.apache.hadoop.yarn.server.TestDiskFailures/org.apache.hadoop.yarn.server.TestDiskFailures-logDir-nm-0_3 to fail.
      2018-09-13 08:21:49,823 WARN [DiskHealthMonitor-Timer] nodemanager.DirectoryCollection (DirectoryCollection.java:checkDirs(283)) - Directory /tmp/dist-test-taskjUrf0_/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/target/org.apache.hadoop.yarn.server.TestDiskFailures/org.apache.hadoop.yarn.server.TestDiskFailures-logDir-nm-0_1 error, Not a directory: /tmp/dist-test-taskjUrf0_/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/target/org.apache.hadoop.yarn.server.TestDiskFailures/org.apache.hadoop.yarn.server.TestDiskFailures-logDir-nm-0_1, removing from list of valid directories
      2018-09-13 08:21:49,824 WARN [DiskHealthMonitor-Timer] localizer.ResourceLocalizationService (ResourceLocalizationService.java:initializeLogDir(1329)) - Could not initialize log dir /tmp/dist-test-taskjUrf0_/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/target/org.apache.hadoop.yarn.server.TestDiskFailures/org.apache.hadoop.yarn.server.TestDiskFailures-logDir-nm-0_3
      java.io.FileNotFoundException: Destination exists and is not a directory: /tmp/dist-test-taskjUrf0_/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/target/org.apache.hadoop.yarn.server.TestDiskFailures/org.apache.hadoop.yarn.server.TestDiskFailures-logDir-nm-0_3
      	at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:515)
      	at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:496)
      	at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:1081)
      	at org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:178)
      	at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:205)
      	at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:747)
      	at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:743)
      	at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
      	at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:743)
      	at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.initializeLogDir(ResourceLocalizationService.java:1324)
      	at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.initializeLogDirs(ResourceLocalizationService.java:1318)
      	at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.access$000(ResourceLocalizationService.java:141)
      	at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$2.onDirsChanged(ResourceLocalizationService.java:269)
      	at org.apache.hadoop.yarn.server.nodemanager.DirectoryCollection.checkDirs(DirectoryCollection.java:317)
      	at org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.checkDirs(LocalDirsHandlerService.java:452)
      	at org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.access$500(LocalDirsHandlerService.java:52)
      	at org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService$MonitoringTimerTask.run(LocalDirsHandlerService.java:166)
      	at java.util.TimerThread.mainLoop(Timer.java:555)
      	at java.util.TimerThread.run(Timer.java:505)
      2018-09-13 08:21:59,824 INFO [main] server.TestDiskFailures (TestDiskFailures.java:verifyDisksHealth(237)) - ExpectedDirs=
      2018-09-13 08:21:59,825 INFO [main] server.TestDiskFailures (TestDiskFailures.java:verifyDisksHealth(238)) - SeenDirs=/tmp/dist-test-taskjUrf0_/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/target/org.apache.hadoop.yarn.server.TestDiskFailures/org.apache.hadoop.yarn.server.TestDiskFailures-logDir-nm-0_3
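
One possible reading of the log above is that a health check ran after the first log directory had been replaced but before the second one had, so the second directory was still counted as healthy. The sketch below replays that interleaving deterministically in a single thread. It is not NodeManager or test code: `DiskCheckRaceSketch` and `healthyDirs` are hypothetical names, and the body of `prepareDirToFail` is an assumption (replacing the directory with a regular file, as the "Not a directory" messages suggest) that only borrows the method name seen in the log.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DiskCheckRaceSketch {

    /** Hypothetical stand-in for a health check: keeps only paths that are directories. */
    static List<Path> healthyDirs(List<Path> dirs) {
        List<Path> healthy = new ArrayList<>();
        for (Path dir : dirs) {
            if (Files.isDirectory(dir)) {
                healthy.add(dir);
            }
        }
        return healthy;
    }

    /** Assumed failure injection: replace an (empty) directory with a regular file. */
    static void prepareDirToFail(Path dir) throws IOException {
        Files.deleteIfExists(dir);
        Files.createFile(dir);
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("disk-check-sketch");
        Path dir1 = Files.createDirectory(base.resolve("logDir-nm-0_1"));
        Path dir3 = Files.createDirectory(base.resolve("logDir-nm-0_3"));
        List<Path> dirs = Arrays.asList(dir1, dir3);

        prepareDirToFail(dir1);
        // A check landing here sees dir1 as failed but dir3 as still healthy.
        List<Path> seenMidway = healthyDirs(dirs);

        prepareDirToFail(dir3);
        // A check that runs only after both modifications sees no healthy dirs.
        List<Path> seenAfter = healthyDirs(dirs);

        System.out.println("healthy dirs seen midway: " + seenMidway.size());
        System.out.println("healthy dirs seen after:  " + seenAfter.size());
    }
}
```

In this sketch the midway snapshot still contains the second directory, which matches the `SeenDirs` line in the log, while a check started after both modifications returns the empty list the test expects.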
      

      Attachments

        1. YARN-8775.001.patch
          7 kB
          Antal Bálint Steinbach
        2. YARN-8775.002.patch
          5 kB
          Antal Bálint Steinbach
        3. YARN-8775.003.patch
          5 kB
          Antal Bálint Steinbach
        4. YARN-8775.004.patch
          5 kB
          Antal Bálint Steinbach

          People

            Assignee: Antal Bálint Steinbach (bsteinbach)
            Reporter: Antal Bálint Steinbach (bsteinbach)
            Votes: 0
            Watchers: 5