  1. Hadoop HDFS
  2. HDFS-16508

When nn1 fails at the very beginning, the admin command that waits to exit safe mode fails


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.3.1
    • Fix Version/s: None
    • Component/s: tools
    • Labels: None

    Description

      HA is enabled, and we have two NameNodes: nn1 and nn2.

      When starting the cluster, nn1 fails at the very beginning, and nn2 transitions to the active state. The cluster can still provide services normally.

      However, when we try to get the safe mode status or wait for safe mode to exit, the dfsadmin command fails with an IOException: it cannot connect to nn1.

      The root cause seems to be located here:

      // DFSAdmin.class
      public void setSafeMode(String[] argv, int idx) throws IOException {
        …
        if (isHaEnabled) {
          String nsId = dfsUri.getHost();
          List<ProxyAndInfo<ClientProtocol>> proxies =
              HAUtil.getProxiesForAllNameNodesInNameservice(
                  dfsConf, nsId, ClientProtocol.class);
          for (ProxyAndInfo<ClientProtocol> proxy : proxies) {
            ClientProtocol haNn = proxy.getProxy();
            // The loop always queries the first NameNode (nn1) first, so the
            // whole command returns with an IOException when nn1 is down.
            boolean inSafeMode = haNn.setSafeMode(action, false);
            if (waitExitSafe) {
              inSafeMode = waitExitSafeMode(haNn, inSafeMode);
            }
            System.out.println("Safe mode is " + (inSafeMode ? "ON" : "OFF")
                + " in " + proxy.getAddress());
          }
        }
        …
      }
      
      

      Actually, I'm curious: do we need to query (or wait on) every NameNode here when HA is enabled?
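
To make the failure mode concrete, here is a minimal, self-contained sketch of one possible direction: catch the IOException per NameNode so that an unreachable nn1 does not prevent nn2 from being reported. This is not the Hadoop implementation and not a proposed patch. SafeModeClient, Result, queryAll, and the nn1.example/nn2.example addresses are all hypothetical stand-ins introduced for illustration; the real loop goes through ClientProtocol and ProxyAndInfo as shown above.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SafeModeSketch {

    /** Hypothetical stand-in for the single safe-mode query the loop performs. */
    interface SafeModeClient {
        boolean isInSafeMode() throws IOException;
    }

    /** Outcome for one NameNode: either a safe-mode state or the error hit. */
    static final class Result {
        final String address;
        final Boolean inSafeMode;  // null when the NameNode could not be reached
        final IOException error;   // null when the query succeeded

        Result(String address, Boolean inSafeMode, IOException error) {
            this.address = address;
            this.inSafeMode = inSafeMode;
            this.error = error;
        }

        String describe() {
            if (error != null) {
                return "Could not reach " + address + ": " + error.getMessage();
            }
            return "Safe mode is " + (inSafeMode ? "ON" : "OFF") + " in " + address;
        }
    }

    /** Queries every NameNode; a failure on one does not abort the others. */
    static List<Result> queryAll(Map<String, SafeModeClient> nameNodes) {
        List<Result> results = new ArrayList<>();
        for (Map.Entry<String, SafeModeClient> entry : nameNodes.entrySet()) {
            try {
                boolean state = entry.getValue().isInSafeMode();
                results.add(new Result(entry.getKey(), state, null));
            } catch (IOException e) {
                // Record the failure and keep going instead of rethrowing.
                results.add(new Result(entry.getKey(), null, e));
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // Hypothetical addresses: nn1 is down, nn2 is active and out of safe mode.
        Map<String, SafeModeClient> nameNodes = new LinkedHashMap<>();
        nameNodes.put("nn1.example:8020", () -> {
            throw new IOException("Connection refused");
        });
        nameNodes.put("nn2.example:8020", () -> false);

        for (Result result : queryAll(nameNodes)) {
            System.out.println(result.describe());
        }
    }
}
```

With the two fake NameNodes in main, the failure on nn1 is reported as one line and the state of nn2 is still printed afterwards. Whether the real command should report, skip, or retry an unreachable NameNode (and what "wait" should mean for a standby) is exactly the open question in this issue.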

          People

            Assignee: Unassigned
            Reporter: willtoshare May