  1. Hadoop HDFS
  2. HDFS-1623 High Availability Framework for HDFS NN
  3. HDFS-2915

HA: TestFailureOfSharedDir.testFailureOfSharedDir() has race condition

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: HA branch (HDFS-1623)
    • Fix Version/s: HA branch (HDFS-1623)
    • Component/s: namenode
    • Labels: None

    Description

      The test deletes the shared edits dir to simulate a failure, then calls rollEditLogs() so that the NN tries to use the deleted dir and fails with an IOException. However, deleting the shared dir can put the NN into safe mode due to lack of space. When that happens first, rollEditLogs() throws a SafeModeException instead. The test catches it as an IOException, but the associated assert in the catch block fails.

      This always happens in the debugger, because the delay from stepping through the code lets the NN enter safe mode before rollEditLogs() gets called.
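
The race described above can be illustrated with a minimal, self-contained sketch. It does not use the real Hadoop classes: `FakeNameNode`, the local `SafeModeException`, the exception messages, and the `runTest` helper are all hypothetical stand-ins, and the actual test and patch may look quite different. The only assumptions carried over from the description are that SafeModeException is caught as an IOException and that the catch block asserts on the expected failure.

```java
import java.io.IOException;

public class SharedDirRaceSketch {
    // Hypothetical stand-in for HDFS's SafeModeException (an IOException subclass).
    static class SafeModeException extends IOException {
        SafeModeException(String msg) { super(msg); }
    }

    // Hypothetical stand-in for the NameNode after the shared edits dir is deleted.
    static class FakeNameNode {
        boolean inSafeMode = false;

        void rollEditLogs() throws IOException {
            if (inSafeMode) {
                // Safe mode was entered first: the roll is rejected up front.
                throw new SafeModeException("Name node is in safe mode"); // assumed message
            }
            // Otherwise the roll reaches the deleted shared dir and fails there.
            throw new IOException("Unable to start log segment"); // assumed message
        }
    }

    // Mimics the test body: the catch block only accepts the shared-dir failure.
    public static String runTest(boolean safeModeWinsRace) {
        FakeNameNode nn = new FakeNameNode();
        nn.inSafeMode = safeModeWinsRace;
        try {
            nn.rollEditLogs();
            return "no exception";
        } catch (IOException e) {
            // SafeModeException lands here too, because it is an IOException.
            boolean expected = e.getMessage().contains("Unable to start log segment");
            return expected ? "pass" : "assert fails: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(runTest(false)); // roll happens before safe mode
        System.out.println(runTest(true));  // safe mode happens before the roll
    }
}
```

In this sketch the outcome depends only on whether safe mode is entered before the roll, which is why a slow, stepped-through run fails consistently.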

      Attachments

        1. HDFS-2915.HDFS-1623.patch
          2 kB
          Bikas Saha
        2. HDFS-2915.HDFS-1623.patch
          2 kB
          Bikas Saha

          People

            Assignee: Bikas Saha (bikassaha)
            Reporter: Bikas Saha (bikassaha)