  1. Hadoop HDFS
  2. HDFS-1849

Respect failed.volumes.tolerated on startup

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • None
    • 0.23.0
    • Component/s: datanode
    • None

    Description

      The current failed.volumes.tolerated behavior is not user friendly. Datanodes can be configured to tolerate N volume failures and still offer service, but if the cluster is restarted, none of the datanodes with failed volumes will start unless the failed volumes have first been removed from the HDFS configuration files on the respective hosts.

      The failed.volumes.tolerated configuration option should be respected on startup. The datanode should refuse to start up only if more than failed.volumes.tolerated (HDFS-1161) volumes have failed, or if a configured critical volume (HDFS-1848) has failed. The latter is probably not an issue in practice, since datanode startup probably fails anyway in that case, e.g. if the root volume has gone read-only.
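      The proposed startup rule can be sketched as follows. This is only an illustration of the behavior requested above, not the actual DataNode code: the class VolumeStartupCheck, the nested Volume type, and the shouldStart method are hypothetical names, and the notion of a "critical" volume is assumed to be a simple per-volume flag.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the proposed startup check; not Hadoop's implementation. */
public final class VolumeStartupCheck {

    /** One configured data directory and the result of its health check (assumed inputs). */
    public static final class Volume {
        final String path;
        final boolean healthy;
        final boolean critical;

        public Volume(String path, boolean healthy, boolean critical) {
            this.path = path;
            this.healthy = healthy;
            this.critical = critical;
        }
    }

    private VolumeStartupCheck() {}

    /**
     * Returns true if the datanode should start: no critical volume has failed
     * and the number of failed volumes does not exceed the tolerated count.
     */
    public static boolean shouldStart(List<Volume> volumes, int failedVolumesTolerated) {
        int failed = 0;
        for (Volume v : volumes) {
            if (!v.healthy) {
                if (v.critical) {
                    return false; // a failed critical volume always blocks startup
                }
                failed++;
            }
        }
        return failed <= failedVolumesTolerated;
    }

    public static void main(String[] args) {
        List<Volume> volumes = Arrays.asList(
                new Volume("/data/1", true, false),
                new Volume("/data/2", false, false),
                new Volume("/data/3", true, false));
        // One non-critical volume has failed and one failure is tolerated.
        System.out.println(shouldStart(volumes, 1));
    }
}
```

      With this rule, a datanode that was already serving with one failed volume and a tolerance of one would come back up after a cluster restart without any change to its configured volume list.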

            People

              Assignee: Unassigned
              Reporter: Eli Collins
              Votes: 0
              Watchers: 3
