Hadoop HDFS / HDFS-14201

Ability to disallow safemode NN to become active

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.1, 2.9.2
    • Fix Version/s: 3.3.0
    • Component/s: auto-failover
    • Hadoop Flags: Reviewed

    Description

      Currently, with HA enabled, a Namenode in safemode can be selected as active. For availability of both reads and writes, however, Namenodes that are not in safemode are better candidates to become active.

      It can take tens of minutes for a cold-started Namenode to leave safemode, especially when HDFS holds a large number of files and blocks. If a Namenode in safemode becomes active, the cluster will not be fully functional for quite a while, even though it could be whenever some other Namenode is not in safemode.

      The proposal is to add an option that lets a Namenode report itself as UNHEALTHY to ZKFC while it is in safemode, so that only a fully functioning Namenode can become active. This improves the overall availability of the cluster.
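
      The proposed behaviour can be sketched roughly as follows. This is a minimal illustration, not code from the attached patches: the class SafemodeHealthSketch, the flag disallowActiveInSafemode, and the methods monitorHealth and mayBecomeActive are hypothetical names chosen for this example, and the sketch leaves out every other health check a real Namenode performs.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch; none of these names are taken from the HDFS code base. */
final class SafemodeHealthSketch {

    /** Health states a Namenode could report to its failover controller. */
    enum Health { HEALTHY, UNHEALTHY }

    // Stands in for the proposed option (assumed name, off by default in this sketch).
    private final boolean disallowActiveInSafemode;
    // Stands in for the Namenode's current safemode state.
    private final BooleanSupplier inSafemode;

    SafemodeHealthSketch(boolean disallowActiveInSafemode, BooleanSupplier inSafemode) {
        this.disallowActiveInSafemode = disallowActiveInSafemode;
        this.inSafemode = inSafemode;
    }

    /** Reports UNHEALTHY only when the option is on and the node is in safemode. */
    Health monitorHealth() {
        if (disallowActiveInSafemode && inSafemode.getAsBoolean()) {
            return Health.UNHEALTHY;
        }
        return Health.HEALTHY;
    }

    /** In this sketch, only a node that reports HEALTHY may be elected active. */
    boolean mayBecomeActive() {
        return monitorHealth() == Health.HEALTHY;
    }

    public static void main(String[] args) {
        // Option on, node still in safemode: excluded from becoming active.
        SafemodeHealthSketch coldStarted = new SafemodeHealthSketch(true, () -> true);
        // Option off: the existing behaviour, safemode does not affect health.
        SafemodeHealthSketch optionOff = new SafemodeHealthSketch(false, () -> true);

        System.out.println(coldStarted.mayBecomeActive()); // prints "false"
        System.out.println(optionOff.mayBecomeActive());   // prints "true"
    }
}
```

      Keeping the check behind an option means that clusters which do not enable it see no change in failover behaviour.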

      Attachments

        1. HDFS-14201.009.patch (13 kB, Xiaoqiao He)
        2. HDFS-14201.008.patch (13 kB, Xiaoqiao He)
        3. HDFS-14201.007.patch (11 kB, Xiaoqiao He)
        4. HDFS-14201.006.patch (11 kB, Xiaoqiao He)
        5. HDFS-14201.005.patch (11 kB, Xiaoqiao He)
        6. HDFS-14201.004.patch (10 kB, Xiao Liang)
        7. HDFS-14201.003.patch (7 kB, Xiao Liang)
        8. HDFS-14201.002.patch (5 kB, Xiaoqiao He)
        9. HDFS-14201.001.patch (3 kB, Xiaoqiao He)

            People

              Assignee: Xiao Liang (surmountian)
              Reporter: Xiao Liang (surmountian)
              Votes: 0
              Watchers: 10
