Hadoop HDFS / HDFS-457

better handling of volume failure in Data Node storage

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.20.203.0, 0.21.0
    • Component/s: datanode
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note:
      Datanode can continue if a volume for replica storage fails. Previously a datanode resigned if any volume failed.

      Description

      The current implementation shuts the DataNode down completely when any one of the configured storage volumes fails.
      This is wasteful: it decreases utilization (good storage becomes unavailable) and imposes extra load on the system (the blocks on the good volumes must be re-replicated). These problems will become even more prominent when we move to mixed (heterogeneous) clusters with many more volumes per DataNode.
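
The requested behavior can be illustrated with a minimal Java sketch. It is not taken from any of the attached patches and does not use real Hadoop classes: `VolumeFailureSketch`, `VolumeSet`, `checkVolumes`, `shouldKeepRunning`, and the injected health probe are all hypothetical names chosen for illustration. The idea is only that a failed volume is dropped from the active set, and the node stops only when no volume is left.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

/** Illustrative only: not the HDFS-457 patch or any real Hadoop class. */
public class VolumeFailureSketch {

    /** Tracks the volumes a data node still considers usable. */
    static class VolumeSet {
        private final List<String> active;
        // Hypothetical health probe; a real node would test the disk itself.
        private final Predicate<String> isHealthy;

        VolumeSet(List<String> volumes, Predicate<String> isHealthy) {
            this.active = new ArrayList<>(volumes);
            this.isHealthy = isHealthy;
        }

        /** Drops every volume that fails the probe and returns the dropped ones. */
        List<String> checkVolumes() {
            List<String> failed = new ArrayList<>();
            for (Iterator<String> it = active.iterator(); it.hasNext(); ) {
                String volume = it.next();
                if (!isHealthy.test(volume)) {
                    it.remove();
                    failed.add(volume);
                }
            }
            return failed;
        }

        /** The node keeps serving while at least one volume remains. */
        boolean shouldKeepRunning() {
            return !active.isEmpty();
        }

        /** Returns a copy of the volumes that are still in use. */
        List<String> activeVolumes() {
            return new ArrayList<>(active);
        }
    }

    public static void main(String[] args) {
        // Pretend that /data/2 has failed; the paths are made up.
        VolumeSet volumes = new VolumeSet(
                Arrays.asList("/data/1", "/data/2", "/data/3"),
                v -> !v.equals("/data/2"));

        List<String> failed = volumes.checkVolumes();
        System.out.println("failed: " + failed);
        System.out.println("still serving from: " + volumes.activeVolumes());
        System.out.println("keep running: " + volumes.shouldKeepRunning());
    }
}
```

In this sketch only the blocks stored on the failed volume would need to be re-replicated; the replicas on the remaining volumes stay available, which is the utilization and load argument made in the description.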

      1. TestFsck.zip
        689 kB
        Tsz Wo Nicholas Sze
      2. jira.HDFS-457.branch-0.20-internal.patch
        16 kB
        Erik Steffl
      3. HDFS-457-y20.patch
        15 kB
        Konstantin Shvachko
      4. HDFS-457-3.patch
        29 kB
        Boris Shkolnik
      5. HDFS-457-2.patch
        28 kB
        Boris Shkolnik
      6. HDFS-457-2.patch
        29 kB
        Boris Shkolnik
      7. HDFS-457-2.patch
        29 kB
        Boris Shkolnik
      8. HDFS-457-1.patch
        29 kB
        Boris Shkolnik
      9. HDFS-457.patch
        29 kB
        Boris Shkolnik
      10. HDFS-457_20-append.patch
        27 kB
        Nicolas Spiegelberg
      11. HDFS_457.patch
        2 kB
        Jeff Zhang

          Activity

          Boris Shkolnik created issue -
          Boris Shkolnik made changes -
          Field Original Value New Value
          Assignee Boris Shkolnik [ boryas ]
          Boris Shkolnik made changes -
          Attachment HDFS-457.patch [ 12413868 ]
          Boris Shkolnik made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Boris Shkolnik made changes -
          Attachment HDFS-457-1.patch [ 12415899 ]
          Boris Shkolnik made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Boris Shkolnik made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Boris Shkolnik made changes -
          Attachment HDFS-457-2.patch [ 12416492 ]
          Boris Shkolnik made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Boris Shkolnik made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Boris Shkolnik made changes -
          Attachment HDFS-457-2.patch [ 12416515 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment TestFsck.zip [ 12416614 ]
          Boris Shkolnik made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Boris Shkolnik made changes -
          Attachment HDFS-457-2.patch [ 12416633 ]
          Boris Shkolnik made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Boris Shkolnik made changes -
          Attachment HDFS-457-3.patch [ 12416807 ]
          Tsz Wo Nicholas Sze made changes -
          Hadoop Flags [Reviewed]
          Issue Type Bug [ 1 ] Improvement [ 4 ]
          Fix Version/s 0.21.0 [ 12314046 ]
          Tsz Wo Nicholas Sze made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Release Note Do not shutdown datanode if some, but not all, volumes fail.
          Resolution Fixed [ 1 ]
          Tsz Wo Nicholas Sze made changes -
          Link This issue relates to HDFS-612 [ HDFS-612 ]
          Robert Chansler made changes -
          Release Note Do not shutdown datanode if some, but not all, volumes fail. Datanode can continue if a volume for replica storage fails. Previously a datanode resigned if any volume failed.
          Robert Chansler made changes -
          Link This issue is related to HDFS-138 [ HDFS-138 ]
          Erik Steffl made changes -
          Ravi Phulari made changes -
          Link This issue relates to HDFS-811 [ HDFS-811 ]
          Nicolas Spiegelberg made changes -
          Affects Version/s 0.20-append [ 12315103 ]
          Nicolas Spiegelberg made changes -
          Attachment HDFS-457_20-append.patch [ 12446817 ]
          Todd Lipcon made changes -
          Affects Version/s 0.20-append [ 12315103 ]
          Jeff Zhang made changes -
          Attachment HDFS_457.patch [ 12448289 ]
          Jeff Hammerbacher made changes -
          Link This issue relates to HDFS-1273 [ HDFS-1273 ]
          Konstantin Shvachko made changes -
          Link This issue relates to HDFS-1158 [ HDFS-1158 ]
          Konstantin Shvachko made changes -
          Attachment HDFS-457-y20.patch [ 12450007 ]
          Tom White made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Todd Lipcon made changes -
          Fix Version/s 0.20.203.0 [ 12316150 ]

            People

            • Assignee:
              Boris Shkolnik
              Reporter:
              Boris Shkolnik
            • Votes:
              0
              Watchers:
              15
