Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.0.2
    • Component/s: namenode
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      When a storage directory becomes inaccessible, the namenode moves it from the list of valid storage directories to a removedStorageDirs list. Those storage directories are not restored when they become healthy again.

      The proposed solution is to restore previously failed directories at the beginning of a checkpoint (for example, in rollEdits) by copying the necessary metadata files from a healthy directory to the unhealthy ones. This way, once the administrator has recovered a failed storage directory, he/she can immediately force a checkpoint to restore it.
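The restore step described above can be sketched roughly as follows. This is a minimal illustration and not the namenode's actual code: the class StorageDirRestorer, its method names, and the flat file copy are all assumptions made for the example (a real storage directory has more structure, and the real logic lives in the namenode's image/edits handling).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

/** Hypothetical illustration only; names do not come from Hadoop. */
class StorageDirRestorer {
    // Directories currently considered healthy.
    private final List<Path> activeDirs = new ArrayList<>();
    // Directories that failed earlier (the "removedStorageDirs" idea).
    private final List<Path> removedDirs = new ArrayList<>();

    StorageDirRestorer(List<Path> active, List<Path> removed) {
        activeDirs.addAll(active);
        removedDirs.addAll(removed);
    }

    List<Path> getActiveDirs() { return activeDirs; }

    List<Path> getRemovedDirs() { return removedDirs; }

    /**
     * Intended to run at the start of a checkpoint. Tries to bring back
     * every previously removed directory that is accessible again.
     * Returns the number of directories restored.
     */
    int attemptRestoreRemovedDirs() {
        if (activeDirs.isEmpty()) {
            return 0; // no healthy directory to copy from
        }
        Path healthy = activeDirs.get(0);
        int restored = 0;
        Iterator<Path> it = removedDirs.iterator();
        while (it.hasNext()) {
            Path dir = it.next();
            if (!Files.isDirectory(dir) || !Files.isWritable(dir)) {
                continue; // still inaccessible; try again at the next checkpoint
            }
            try {
                copyMetadata(healthy, dir);
                it.remove();
                activeDirs.add(dir);
                restored++;
            } catch (IOException e) {
                // Copy failed: the directory stays in the removed list.
            }
        }
        return restored;
    }

    /** Copies the regular files directly under 'from' into 'to' (flat copy, an assumption). */
    private static void copyMetadata(Path from, Path to) throws IOException {
        try (Stream<Path> files = Files.list(from)) {
            for (Path src : (Iterable<Path>) files::iterator) {
                if (Files.isRegularFile(src)) {
                    Files.copy(src, to.resolve(src.getFileName()),
                               StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
    }
}
```

In this sketch a directory that is still unreadable simply stays in the removed list, so the attempt can be repeated at every checkpoint without side effects.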

      See also HADOOP-4885.

          Activity

          Uma Maheswara Rao G added a comment -

          Hi Brandon,
          It seems to me that you are describing the same issue (HADOOP-4885), which has already been addressed, right?
          Also, we have a property to enable or disable that feature: "dfs.namenode.name.dir.restore".
          Are you talking about some other issue/improvement here?

          Eli Collins added a comment -

          This is a dupe of HDFS-2781. Brandon, feel free to post a patch there.

          Tsz Wo Nicholas Sze added a comment -

          @Uma, you are right that HADOOP-4885 has already fixed this, so this one is a backport. I will revise the title.

          @Eli, this is not a dupe of HDFS-2781.

          Tsz Wo Nicholas Sze added a comment -

          Brandon has already posted a patch on HADOOP-4885. He has also run all the unit tests.

          Jitendra and I have reviewed the patch.

          Tsz Wo Nicholas Sze added a comment -

          I have committed this (the patch was posted on HADOOP-4885). Thanks, Brandon!

          Eli Collins added a comment -

          Sorry, posted to the wrong jira!


            People

            • Assignee:
              Brandon Li
              Reporter:
              Brandon Li
            • Votes:
              0
              Watchers:
              4

              Dates

              • Created:
                Updated:
                Resolved: