Hadoop HDFS / HDFS-2966

TestNameNodeMetrics tests can fail under load


Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.0.0-alpha
    • Fix Version/s: 2.0.2-alpha
    • Component/s: test
    • Labels: None
    • Environment: OS/X running IntelliJ IDEA, Firefox, and WinXP in a VirtualBox.

    Description

      I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of running the HDFS tests on a desktop without enough memory for all the programs trying to run. Things got swapped out and the tests failed because the DN heartbeats didn't come in on time.

      Both tests rely on waitForDeletion() to block until the delete operation has completed, but all it does is sleep for the same number of seconds as there are datanodes. This is too brittle: it may work on a lightly loaded system, but not on a system under heavy load, where replication takes longer than expected.

      Immediate fix: double or triple the sleep time?
      Better fix: have the thread block until all the DN heartbeats have finished.
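
      As a rough illustration of the "better fix" direction, a fixed sleep can be replaced by a bounded polling wait that returns as soon as a condition holds and gives up after a timeout. The sketch below is not taken from the attached patch: the class PollingWait, the method waitFor, and its parameters are hypothetical names introduced here, and the condition a real test would poll (for example, a metric showing that no deletions are pending) is left to the caller.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper for illustration; not part of the Hadoop test code. */
final class PollingWait {
    private PollingWait() {}

    /**
     * Polls the condition until it holds or the timeout expires.
     *
     * @return true if the condition was met, false on timeout or interrupt
     */
    static boolean waitFor(BooleanSupplier condition, long pollMillis, long timeoutMillis) {
        // nanoTime is monotonic, so the deadline is unaffected by wall-clock changes.
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out; the caller decides how to fail the test
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                return false;
            }
        }
    }
}
```

      A test could then assert on the result, for example assertTrue(PollingWait.waitFor(() -> pendingDeletions() == 0, 100, 30000)), where pendingDeletions() stands in for whatever value the test actually reads. On a lightly loaded machine this returns almost immediately, while on a heavily loaded one it tolerates delays up to the timeout.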

      Attachments

        1. HDFS-2966.patch
          4 kB
          Steve Loughran
        2. HDFS-2966.patch
          4 kB
          Steve Loughran
        3. HDFS-2966.patch
          4 kB
          Suresh Srinivas
        4. HDFS-2966.patch
          4 kB
          Steve Loughran


          People

            Assignee: Steve Loughran (stevel@apache.org)
            Reporter: Steve Loughran (stevel@apache.org)
            Votes: 0
            Watchers: 6
