  1. Hadoop YARN
  2. YARN-5859

TestResourceLocalizationService#testParallelDownloadAttemptsForPublicResource sometimes fails

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 3.0.0-alpha2
    • Component/s: test
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Saw the following test failure:

      Running org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService
      Tests run: 14, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 12.011 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService
      testParallelDownloadAttemptsForPublicResource(org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService)  Time elapsed: 0.586 sec  <<< FAILURE!
      java.lang.AssertionError: null
      	at org.junit.Assert.fail(Assert.java:86)
      	at org.junit.Assert.assertTrue(Assert.java:41)
      	at org.junit.Assert.assertTrue(Assert.java:52)
      	at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService.testParallelDownloadAttemptsForPublicResource(TestResourceLocalizationService.java:2108)
      

      The assert occurred at this place in the code:

            // Waiting for download to start.
            Assert.assertTrue(waitForPublicDownloadToStart(spyService, 1, 200));
      
      1. YARN-5859.001.patch
        3 kB
        Eric Badger
      2. YARN-5859.002.patch
        5 kB
        Eric Badger

        Activity

        jlowe Jason Lowe added a comment -

        The test output:

        2016-11-07 20:00:01,393 INFO  [Thread-275] event.AsyncDispatcher (AsyncDispatcher.java:register(213)) - Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationEventType for class org.apache.hadoop.yarn.event.EventHandler$$EnhancerByMockitoWithCGLIB$$f197772f
        2016-11-07 20:00:01,394 INFO  [Thread-275] event.AsyncDispatcher (AsyncDispatcher.java:register(213)) - Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerEventType for class org.apache.hadoop.yarn.event.EventHandler$$EnhancerByMockitoWithCGLIB$$f197772f
        2016-11-07 20:00:01,403 INFO  [Thread-275] nodemanager.DirectoryCollection (DirectoryCollection.java:<init>(185)) - Disk Validator: yarn.nodemanager.disk-validator is loaded.
        2016-11-07 20:00:01,411 INFO  [Thread-275] nodemanager.DirectoryCollection (DirectoryCollection.java:<init>(185)) - Disk Validator: yarn.nodemanager.disk-validator is loaded.
        2016-11-07 20:00:01,554 INFO  [Thread-275] event.AsyncDispatcher (AsyncDispatcher.java:register(213)) - Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.event.LocalizationEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$$EnhancerByMockitoWithCGLIB$$9a46a6a4
        2016-11-07 20:00:01,555 INFO  [Thread-275] localizer.ResourceLocalizationService (ResourceLocalizationService.java:validateConf(232)) - per directory file limit = 8192
        2016-11-07 20:00:01,596 INFO  [Thread-275] localizer.ResourceLocalizationService (ResourceLocalizationService.java:serviceInit(260)) - Disk Validator: yarn.nodemanager.disk-validator is loaded.
        2016-11-07 20:00:01,598 INFO  [Thread-275] event.AsyncDispatcher (AsyncDispatcher.java:register(213)) - Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.event.LocalizerEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker
        2016-11-07 20:00:01,612 INFO  [AsyncDispatcher event handler] localizer.ResourceLocalizationService (ResourceLocalizationService.java:addResource(845)) - Downloading public rsrc:{ /tmp, 123, FILE,  }
        

        I'm guessing the 200 millisecond timeout is too short sometimes if the unit test is running in a slow VM or there are other performance hiccups (GC, etc.).
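
A minimal sketch of why a larger timeout is cheap for a wait that is expected to succeed, assuming the test helper polls until a condition holds or a deadline passes. `PollingWait`, `waitFor`, and the 300 ms delay are hypothetical names and values for illustration, not code from `TestResourceLocalizationService`:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

public class PollingWait {
  /**
   * Polls the condition until it returns true or timeoutMs elapses.
   * Returns true as soon as the condition holds, so a generous timeout
   * only costs time when the condition never becomes true.
   */
  public static boolean waitFor(BooleanSupplier condition, long timeoutMs,
      long pollIntervalMs) throws InterruptedException {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      Thread.sleep(pollIntervalMs);
    }
  }

  public static void main(String[] args) throws Exception {
    AtomicBoolean started = new AtomicBoolean(false);
    // Simulate a slow machine: the "download" starts after 300 ms,
    // which misses a 200 ms budget but is well inside a 5 s budget.
    Thread worker = new Thread(() -> {
      try {
        Thread.sleep(300);
      } catch (InterruptedException e) {
        return;
      }
      started.set(true);
    });
    worker.start();

    long begin = System.nanoTime();
    boolean ok = waitFor(started::get, 5000, 10);
    long elapsedMs = (System.nanoTime() - begin) / 1_000_000L;
    worker.join();
    System.out.println("started=" + ok + " after ~" + elapsedMs + " ms");
  }
}
```

With a helper shaped like this, the wait returns shortly after the flag flips rather than after the full 5 seconds, so the longer budget should only be spent on runs that would have failed anyway.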

        ebadger Eric Badger added a comment -

        Changing all 200ms timeouts to 5s.

        hadoopqa Hadoop QA added a comment -
        +1 overall



        Vote  Subsystem   Runtime  Comment
        0     reexec      0m 14s   Docker mode activated.
        +1    @author     0m 0s    The patch does not contain any @author tags.
        +1    test4tests  0m 0s    The patch appears to include 1 new or modified test files.
        +1    mvninstall  6m 46s   trunk passed
        +1    compile     0m 26s   trunk passed
        +1    checkstyle  0m 17s   trunk passed
        +1    mvnsite     0m 28s   trunk passed
        +1    mvneclipse  0m 13s   trunk passed
        +1    findbugs    0m 40s   trunk passed
        +1    javadoc     0m 17s   trunk passed
        +1    mvninstall  0m 23s   the patch passed
        +1    compile     0m 24s   the patch passed
        +1    javac       0m 24s   the patch passed
        -0    checkstyle  0m 14s   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 2 new + 93 unchanged - 1 fixed = 95 total (was 94)
        +1    mvnsite     0m 24s   the patch passed
        +1    mvneclipse  0m 10s   the patch passed
        +1    whitespace  0m 0s    The patch has no whitespace issues.
        +1    findbugs    0m 45s   the patch passed
        +1    javadoc     0m 15s   the patch passed
        +1    unit        12m 38s  hadoop-yarn-server-nodemanager in the patch passed.
        +1    asflicense  0m 16s   The patch does not generate ASF License warnings.
                          26m 10s



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:a9ad5d6
        JIRA Issue YARN-5859
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12839377/YARN-5859.001.patch
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux 651549df8df4 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / b2d4b7b
        Default Java 1.8.0_111
        findbugs v3.0.0
        checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13956/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13956/testReport/
        modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/13956/console
        Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        jlowe Jason Lowe added a comment -

        Thanks for the patch! I noticed other rather low timeouts in these tests that weren't updated, such as the ones below, and I assume they could also fail if the test machine hiccups.

              // Resource Localization should fail and state is modified accordingly.
              // Also Local should be release on the LocalizedResource.
              Assert
                .assertTrue(waitForResourceState(lr, rls, req,
                  LocalResourceVisibility.PRIVATE, user, appId, ResourceState.FAILED,
                  200));
        
        [...]
        
              // Now waiting for resource download to start. Here actual will not start
              // Only the resources will be populated into pending list.
              Assert
                .assertTrue(waitForPrivateDownloadToStart(rls, localizerId1, 2, 500));
        
        [...]
        
              // Waiting for download to start. This should return false as new download
              // will not start
              Assert.assertFalse(waitForPublicDownloadToStart(spyService, 2, 100));
        
        [...]
        
              // Waiting for download to start. This should return false as new download
              // will not start
              Assert.assertFalse(waitForPublicDownloadToStart(spyService, 1, 100));
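
One point that may matter for the `assertFalse` cases above: if the wait helpers poll until the condition holds or the timeout expires (an assumption, since their bodies are not shown in this thread), then a wait that is expected to return false always runs for the full timeout. A rough sketch with a hypothetical `waitFor` helper and illustrative 500 ms budgets:

```java
import java.util.function.BooleanSupplier;

public class NegativeWait {
  /** Hypothetical polling helper: true once the condition holds, false on timeout. */
  public static boolean waitFor(BooleanSupplier condition, long timeoutMs,
      long pollIntervalMs) throws InterruptedException {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      Thread.sleep(pollIntervalMs);
    }
  }

  /** Returns how long, in milliseconds, a single waitFor call took. */
  public static long timedWaitMs(BooleanSupplier condition, long timeoutMs)
      throws InterruptedException {
    long begin = System.nanoTime();
    waitFor(condition, timeoutMs, 10);
    return (System.nanoTime() - begin) / 1_000_000L;
  }

  public static void main(String[] args) throws Exception {
    // Expected-true wait: returns on the first poll, regardless of the budget.
    long positiveMs = timedWaitMs(() -> true, 500);
    // Expected-false wait: nothing ends it early, so it spends the whole budget.
    long negativeMs = timedWaitMs(() -> false, 500);
    System.out.println("expected-true wait took ~" + positiveMs + " ms");
    System.out.println("expected-false wait took ~" + negativeMs + " ms");
  }
}
```

Under that assumption, raising the timeout on an expected-false wait makes a passing run slower by the same amount, while making it less likely that a late event slips past the check.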
        
        ebadger Eric Badger added a comment -

        Changed all timeouts to be 5 seconds within the test.

        hadoopqa Hadoop QA added a comment -
        +1 overall



        Vote  Subsystem   Runtime  Comment
        0     reexec      0m 21s   Docker mode activated.
        +1    @author     0m 0s    The patch does not contain any @author tags.
        +1    test4tests  0m 0s    The patch appears to include 1 new or modified test files.
        +1    mvninstall  9m 24s   trunk passed
        +1    compile     0m 34s   trunk passed
        +1    checkstyle  0m 20s   trunk passed
        +1    mvnsite     0m 32s   trunk passed
        +1    mvneclipse  0m 16s   trunk passed
        +1    findbugs    0m 49s   trunk passed
        +1    javadoc     0m 24s   trunk passed
        +1    mvninstall  0m 31s   the patch passed
        +1    compile     0m 34s   the patch passed
        +1    javac       0m 34s   the patch passed
        -0    checkstyle  0m 20s   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 3 new + 92 unchanged - 2 fixed = 95 total (was 94)
        +1    mvnsite     0m 36s   the patch passed
        +1    mvneclipse  0m 19s   the patch passed
        +1    whitespace  0m 0s    The patch has no whitespace issues.
        +1    findbugs    1m 1s    the patch passed
        +1    javadoc     0m 19s   the patch passed
        +1    unit        14m 36s  hadoop-yarn-server-nodemanager in the patch passed.
        +1    asflicense  0m 19s   The patch does not generate ASF License warnings.
                          32m 40s



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:a9ad5d6
        JIRA Issue YARN-5859
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12839563/YARN-5859.002.patch
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux 8cc40835e5bd 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / f6ffa11
        Default Java 1.8.0_111
        findbugs v3.0.0
        checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13975/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13975/testReport/
        modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/13975/console
        Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        jlowe Jason Lowe added a comment -

        +1 lgtm. Committing this.

        jlowe Jason Lowe added a comment -

        Thanks, Eric! I committed this to trunk, branch-2, and branch-2.8.

        hudson Hudson added a comment -

        SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10869 (See https://builds.apache.org/job/Hadoop-trunk-Commit/10869/)
        YARN-5859. (jlowe: rev 009452bb6dbe5dffb0b304d67a2f360fe0eee1e2)

        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java

          People

          • Assignee:
            ebadger Eric Badger
            Reporter:
            jlowe Jason Lowe
          • Votes:
            0
            Watchers:
            3
