  1. Hadoop HDFS
  2. HDFS-10197

TestFsDatasetCache failing intermittently due to timeout


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: test
    • Labels: None

    Description

      The unit tests in TestFsDatasetCache fail intermittently. I collected the failure reasons from recent Jenkins reports; they are all timeout errors.

      Tests in error: 
        TestFsDatasetCache.testFilesExceedMaxLockedMemory:378 ? Timeout Timed out wait...
        TestFsDatasetCache.tearDown:149 ? Timeout Timed out waiting for condition. Thr...
      
      Tests in error: 
        TestFsDatasetCache.testPageRounder:474 ?  test timed out after 60000 milliseco...
        TestBalancer.testUnknownDatanodeSimple:1040->testUnknownDatanode:1098 ?  test ...
      

      However, these failures differ slightly in cause.

      • The first occurs because the total blocking time exceeded waitForMillis (60s here), so a timeout exception was thrown and the thread diagnostic string was printed in the method DFSTestUtil#verifyExpectedCacheUsage.
            long st = Time.now();
            do {
              boolean result = check.get();
              if (result) {
                return;
              }
              
              Thread.sleep(checkEveryMillis);
            } while (Time.now() - st < waitForMillis);
            
            throw new TimeoutException("Timed out waiting for condition. " +
                "Thread diagnostics:\n" +
                TimedOutTestsListener.buildThreadDiagnosticString());
        
      • The second occurs because the test's elapsed time exceeded its configured timeout, as in TestFsDatasetCache#testPageRounder.
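
      The two failure modes above can be reproduced in isolation. The following is only a minimal sketch using the Java standard library, not the Hadoop test code: the class name TimeoutModes and the helpers waitFor, finishesWithin and sleepQuietly are hypothetical, System.nanoTime() stands in for Hadoop's Time.now(), and the outer limit is modelled with Future.get(timeout) as an assumed stand-in for a test-harness timeout.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical illustration class; not part of Hadoop.
final class TimeoutModes {

  // Mode 1: poll a condition and throw once waitForMillis has elapsed.
  public static void waitFor(Supplier<Boolean> check, long checkEveryMillis,
      long waitForMillis) throws TimeoutException, InterruptedException {
    long start = System.nanoTime();
    do {
      if (check.get()) {
        return;
      }
      Thread.sleep(checkEveryMillis);
    } while (TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start)
        < waitForMillis);
    throw new TimeoutException("Timed out waiting for condition.");
  }

  // Mode 2: an outer limit on the whole test body, independent of any
  // waits inside it. Returns false if the body overran the limit.
  public static boolean finishesWithin(Runnable body, long limitMillis)
      throws InterruptedException, ExecutionException {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    try {
      Future<?> future = executor.submit(body);
      try {
        future.get(limitMillis, TimeUnit.MILLISECONDS);
        return true;
      } catch (TimeoutException e) {
        future.cancel(true); // interrupt the overrunning body
        return false;
      }
    } finally {
      executor.shutdownNow();
    }
  }

  // Sleep helper that restores the interrupt flag instead of throwing.
  public static void sleepQuietly(long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) throws Exception {
    try {
      waitFor(() -> false, 10, 50); // condition never becomes true
    } catch (TimeoutException e) {
      System.out.println("mode 1: " + e.getMessage());
    }
    boolean ok = finishesWithin(() -> sleepQuietly(500), 50);
    System.out.println("mode 2: finished within limit = " + ok);
  }
}
```

      In this sketch the two limits are independent: if the inner wait budget is as large as the outer limit, as with the 60s and 60000 ms values reported above, either one can fire first on a slow machine.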

      We should adjust the timeouts for the unit tests that fail intermittently for this reason.

      Attachments

        1. HDFS-10197.002.patch
          2 kB
          Yiqun Lin
        2. HDFS-10197.001.patch
          3 kB
          Yiqun Lin


          People

            Assignee: Yiqun Lin (linyiqun)
            Reporter: Yiqun Lin (linyiqun)
            Votes: 0
            Watchers: 3
