Hadoop HDFS / HDFS-1849

Respect failed.volumes.tolerated on startup


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • None
    • 0.23.0
    • Component/s: datanode
    • None

    Description

      The current failed.volumes.tolerated behavior is not user friendly. Datanodes can be configured to tolerate N volume failures and still offer service, but if the cluster is restarted, none of the datanodes with failed volumes will start unless the failed volumes have first been removed from the HDFS configuration files on the respective hosts.

      The failed.volumes.tolerated configuration option should be respected on startup. The datanode should refuse to start only if more than failed.volumes.tolerated volumes (HDFS-1161) have failed, or if a configured critical volume (HDFS-1848) has failed. The latter is probably not an issue in practice, since datanode startup probably fails anyway if, e.g., the root volume has gone read-only.
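
      The proposed startup rule can be sketched roughly as follows. This is an illustration only, not actual DataNode code: the class VolumeStartupCheck, the method shouldStart, and the notion of passing the critical volumes in as a set are all hypothetical names chosen for this example, and the directory paths are made up.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of the startup decision proposed in this issue. */
final class VolumeStartupCheck {

    /**
     * Decides whether a datanode should start.
     *
     * @param configured             every data directory listed in the configuration
     * @param failed                 directories that failed the startup health check
     * @param critical               directories that must be healthy (the HDFS-1848 idea)
     * @param failedVolumesTolerated the failed.volumes.tolerated setting
     */
    static boolean shouldStart(List<String> configured,
                               Set<String> failed,
                               Set<String> critical,
                               int failedVolumesTolerated) {
        // A failed critical volume blocks startup regardless of the tolerance.
        for (String dir : failed) {
            if (critical.contains(dir)) {
                return false;
            }
        }
        // Count only failures among the configured data directories.
        long failedCount = configured.stream().filter(failed::contains).count();
        // Refuse to start only when more volumes failed than are tolerated.
        return failedCount <= failedVolumesTolerated;
    }

    public static void main(String[] args) {
        List<String> dirs = List.of("/data/1", "/data/2", "/data/3");
        // One failed volume with a tolerance of 1: the datanode should start.
        System.out.println(shouldStart(dirs, Set.of("/data/2"), Set.of(), 1));
        // Same failure with a tolerance of 0: the datanode should refuse.
        System.out.println(shouldStart(dirs, Set.of("/data/2"), Set.of(), 0));
    }
}
```

      Under this rule a restart behaves the same way as a running datanode that loses a volume: the failed directory does not have to be removed from the configuration as long as the failure count stays within the tolerance.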


          People

            Assignee: Unassigned
            Reporter: Eli Collins (eli)
            Votes: 0
            Watchers: 3