Hadoop HDFS / HDFS-11340

DataNode reconfigure for disks doesn't remove the failed volumes


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-alpha1
    • Fix Version/s: 2.9.0, 3.0.0-alpha4
    • Component/s: None
    • Labels: None

    Description

      Say a DataNode (uuid: xyz) has disks D1 and D2. When D1 turns bad, a JMX query on the FSDatasetState-xyz bean rightly shows the "NumFailedVolumes" attribute as 1, and the "FailedStorageLocations" attribute lists the failed storage location "D1".
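
      For reference, these attributes can be read off the DataNode's JMX servlet; a minimal sketch, where the host and HTTP port are placeholders:

        # Query the FSDatasetState bean for this DataNode over HTTP.
        curl 'http://<dn-host>:<dn-http-port>/jmx?qry=Hadoop:service=DataNode,name=FSDatasetState*'

        # Abridged response while D1 is failed:
        #   "NumFailedVolumes" : 1,
        #   "FailedStorageLocations" : [ "D1" ]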

      Disks can be added to or removed from this DataNode by running the reconfigure command. Suppose the failed disk D1 is removed from the configuration, so that the new configuration has only the good disk D2. After running the reconfigure command on this DataNode with the new disk configuration, the expectation is that the DataNode would no longer report any "NumFailedVolumes" or "FailedStorageLocations". But even after the failed disk is removed from the configuration and the reconfigure completes successfully, the DataNode continues to show "NumFailedVolumes" as 1 and "FailedStorageLocations" as "D1", and the values are never reset (see the sketch below).
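
      A minimal reproduction sketch of the steps above, assuming the standard hot-swap reconfigure command; the DataNode address is a placeholder:

        # 1. Drop the failed disk D1 from dfs.datanode.data.dir in hdfs-site.xml,
        #    leaving only the good disk D2.

        # 2. Ask the DataNode to reconfigure, and poll until the task finishes.
        hdfs dfsadmin -reconfig datanode <dn-host>:<dn-ipc-port> start
        hdfs dfsadmin -reconfig datanode <dn-host>:<dn-ipc-port> status

        # 3. Re-run the JMX query from above: "NumFailedVolumes" still reports 1
        #    and "FailedStorageLocations" still lists "D1", even though D1 is no
        #    longer in the configuration.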

      Attachments

        1. HDFS-11340-branch-2.01.patch
          20 kB
          Manoj Govindassamy
        2. HDFS-11340.05.patch
          17 kB
          Manoj Govindassamy
        3. HDFS-11340.04.patch
          17 kB
          Manoj Govindassamy
        4. HDFS-11340.03.patch
          17 kB
          Manoj Govindassamy
        5. HDFS-11340.02.patch
          18 kB
          Manoj Govindassamy
        6. HDFS-11340.01.patch
          18 kB
          Manoj Govindassamy


          People

            Assignee: manojg Manoj Govindassamy
            Reporter: manojg Manoj Govindassamy
            Votes: 0
            Watchers: 5

            Dates

              Created:
              Updated:
              Resolved: