Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.0.2
    • Component/s: namenode
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      When a storage directory is inaccessible, namenode removes it from the valid storage dir list to a removedStorageDirs list. Those storage directories will not be restored when they become healthy again.

      The proposed solution is to restore the previous failed directories at the beginning of checkpointing, say, rollEdits, by copying necessary metadata files from healthy directory to unhealthy ones. In this way, whenever a failed storage directory is recovered by the administrator, he/she can immediately force a checkpointing to restored a failed directory.

      See also HADOOP-4885.

        Issue Links

          Activity

          Matt Foley made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Brandon Li made changes -
          Link This issue is related to HDFS-3127 [ HDFS-3127 ]
          Matt Foley made changes -
          Fix Version/s 1.1.0 [ 12317959 ]
          Suresh Srinivas made changes -
          Fix Version/s 1.0.2 [ 12320051 ]
          Affects Version/s 1.0.0 [ 12318243 ]
          Target Version/s 1.0.2 [ 12320051 ]
          Tsz Wo Nicholas Sze made changes -
          Affects Version/s 0.24.0 [ 12317653 ]
          Affects Version/s 1.1.0 [ 12317959 ]
          Tsz Wo Nicholas Sze made changes -
          Status Reopened [ 4 ] Resolved [ 5 ]
          Hadoop Flags Reviewed [ 10343 ]
          Fix Version/s 1.1.0 [ 12317959 ]
          Resolution Fixed [ 1 ]
          Tsz Wo Nicholas Sze made changes -
          Link This issue relates to HADOOP-4885 [ HADOOP-4885 ]
          Tsz Wo Nicholas Sze made changes -
          Summary Add mechanism to restore the removed storage directories Backport HADOOP-4885 to branch-1
          Description When a storage directory is inaccessible, namenode removes it from the valid storage dir list to a removedStorageDirs list. Those storage directories will not be restored when they become healthy again.

          The proposed solution is to restore the previous failed directories at the beginning of checkpointing, say, rollEdits, by copying necessary metadata files from healthy directory to unhealthy ones. In this way, whenever a failed storage directory is recovered by the administrator, he/she can immediately force a checkpointing to restored a failed directory.
          When a storage directory is inaccessible, namenode removes it from the valid storage dir list to a removedStorageDirs list. Those storage directories will not be restored when they become healthy again.

          The proposed solution is to restore the previous failed directories at the beginning of checkpointing, say, rollEdits, by copying necessary metadata files from healthy directory to unhealthy ones. In this way, whenever a failed storage directory is recovered by the administrator, he/she can immediately force a checkpointing to restored a failed directory.

          See also HADOOP-4885.
          Tsz Wo Nicholas Sze made changes -
          Resolution Duplicate [ 3 ]
          Status Resolved [ 5 ] Reopened [ 4 ]
          Eli Collins made changes -
          Field Original Value New Value
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Duplicate [ 3 ]
          Brandon Li created issue -

            People

            • Assignee:
              Brandon Li
              Reporter:
              Brandon Li
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development