  1. Ambari
  2. AMBARI-12267

Ambari to improve tracking of data dirs becoming unmounted


Details

    • Type: Story
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.0.0
    • Fix Version/s: trunk
    • Component/s: ambari-agent
    • Labels: None

    Description

      Ambari maintains a file, /etc/hadoop/conf/dfs_data_dir_mount.hist,
      that maps each HDFS data dir to its last known mount point.
      Ambari uses this mapping to detect when a data dir becomes unmounted, so that HDFS does not write to the root partition.
      Consider a DataNode configured with these volumes:
      /dev/sda -> /
      /dev/sdb -> /grid/0
      /dev/sdc -> /grid/1
      /dev/sdd -> /grid/2
      Typically, each /grid/#/ directory contains a data folder.
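
To make the example above concrete, here is a minimal sketch of how the mount point of a data dir could be resolved by longest-prefix match against the host's mount points. The function name `mount_point_for` and the `/grid/1/data` path are illustrative assumptions, not Ambari's actual code.

```python
import posixpath


def mount_point_for(path, mount_points):
    """Return the longest mount point that contains `path` (hypothetical helper)."""
    path = posixpath.normpath(path)
    best = "/"
    for mp in mount_points:
        mp = posixpath.normpath(mp)
        # Match on whole path components so /grid/1 does not match /grid/10.
        if path == mp or path.startswith(mp.rstrip("/") + "/"):
            if len(mp) > len(best):
                best = mp
    return best


# Mount points from the example, before and after /dev/sdc is unmounted.
mounted = ["/", "/grid/0", "/grid/1", "/grid/2"]
after_unmount = ["/", "/grid/0", "/grid/2"]

before = mount_point_for("/grid/1/data", mounted)
after = mount_point_for("/grid/1/data", after_unmount)
print(before, after)
```

Once /grid/1 is unmounted, the same data dir resolves to the root partition, which is the situation the history file is meant to catch.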

      When a volume fails, the outcome depends on dfs.datanode.failed.volumes.tolerated in hdfs-site: if its value is greater than 0, the DataNode tolerates the failure; otherwise, the DataNode dies.
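
The tolerance rule can be stated as a one-line check. This is a simplified model of the behavior described above, not HDFS code; the function name `datanode_stays_up` is hypothetical.

```python
def datanode_stays_up(failed_volumes, tolerated):
    """Simplified model: the DataNode keeps running only while the number of
    failed volumes does not exceed dfs.datanode.failed.volumes.tolerated."""
    return failed_volumes <= tolerated


# With the default-style value of 0, a single failed volume is fatal.
print(datanode_stays_up(failed_volumes=1, tolerated=0))
# With a value of 1, the same failure is tolerated.
print(datanode_stays_up(failed_volumes=1, tolerated=1))
```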

      In AMBARI-12252, I fixed a bug so that Ambari prevents HDFS from writing to the root partition when a drive becomes unmounted.
      However, that approach relies on the /etc/hadoop/conf/dfs_data_dir_mount.hist file existing and on the original configuration being correct.
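
The two weaknesses can be illustrated with a small sketch. It assumes the history has already been loaded into a dict of data dir to last known mount point (or `None` if the file is missing); the function `classify` and its return values are hypothetical and do not reflect how Ambari actually represents this.

```python
ROOT = "/"


def classify(data_dir, current_mount, last_known):
    """Compare a data dir's current mount point with its recorded one.

    last_known: dict of data dir -> last known mount point,
                or None when the history file does not exist.
    """
    if last_known is None or data_dir not in last_known:
        # No history: there is nothing to compare against.
        return "UNKNOWN"
    if last_known[data_dir] != ROOT and current_mount == ROOT:
        # Previously on its own mount, now falling through to root.
        return "UNMOUNTED"
    return "OK"


history = {"/grid/1/data": "/grid/1"}
print(classify("/grid/1/data", "/", history))   # drive went away
print(classify("/grid/1/data", "/", None))      # history file deleted
# An install that was wrong from the start records "/" and is never flagged.
print(classify("/grid/1/data", "/", {"/grid/1/data": "/"}))
```

The last two calls show the gap: without the file, or with an incorrect initial recording, a data dir on the root partition goes undetected.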

      The ideal fix is to:

      • Track which data dirs the admin wants mounted on a non-root partition.

        If the admin wishes all data dirs to be on non-root mounts, but the initial install is incorrect, then this should be reported as a problem.

      • Keep the history of the mount points in the database.

        Today, if the cache file is deleted or the host reimaged, then this information is lost.

      • Introduce a new state between FAILED and COMPLETED, such as COMPLETED_WITH_ERRORS.

        This would let tasks appear differently in the UI, so the user can clearly see when a critical but non-fatal error occurred.

      • Plug into the Alert Framework.
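
As a rough sketch of the proposed intermediate state, the following shows how a task result could be mapped to one of three statuses. The `TaskStatus` enum and `final_status` function are hypothetical illustrations of the idea, not Ambari's existing status model.

```python
from enum import Enum


class TaskStatus(Enum):
    FAILED = "FAILED"
    COMPLETED_WITH_ERRORS = "COMPLETED_WITH_ERRORS"  # proposed new state
    COMPLETED = "COMPLETED"


def final_status(fatal_errors, non_fatal_errors):
    """Pick a status from the errors a task collected (both are lists)."""
    if fatal_errors:
        return TaskStatus.FAILED
    if non_fatal_errors:
        # Critical but non-fatal, e.g. a data dir skipped because it is unmounted.
        return TaskStatus.COMPLETED_WITH_ERRORS
    return TaskStatus.COMPLETED


status = final_status([], ["/grid/1/data is no longer on its own mount"])
print(status.value)
```

A UI could then render COMPLETED_WITH_ERRORS distinctly from both FAILED and COMPLETED.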

            People

              Assignee: afernandez Alejandro Fernandez
              Reporter: afernandez Alejandro Fernandez