Hadoop Common / HADOOP-8191

SshFenceByTcpPort uses netcat incorrectly

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.23.3
    • Fix Version/s: 2.0.0-alpha
    • Component/s: ha
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      SshFenceByTcpPort currently assumes that the NN is listening on localhost. In typical setups the namenode listens only on its own hostname, so the "nc -z" check against localhost does not detect it.

      Here's an example in which the NN is running, listening on 8020, but doesn't respond to "localhost 8020".

      [root@xxx ~]# lsof -P -p 5286 | grep -i listen
      java    5286 root  110u  IPv4            1772357              TCP xxx:8020 (LISTEN)
      java    5286 root  121u  IPv4            1772397              TCP xxx:50070 (LISTEN)
      [root@xxx ~]# nc -z localhost 8020
      [root@xxx ~]# nc -z xxx 8020
      Connection to xxx 8020 port [tcp/intu-ec-svcdisc] succeeded!
      

      Here's the likely offending code:

              LOG.info(
                  "Indeterminate response from trying to kill service. " +
                  "Verifying whether it is running using nc...");
              rc = execCommand(session, "nc -z localhost 8020");
      

      Naively, we could either point netcat at the correct hostname (since the NN ought to be listening on the hostname it's configured as) or just use fuser. fuser finds the processes holding a port regardless of which IP addresses they are bound to:

      [root@xxx ~]# fuser 1234/tcp
      1234/tcp:             6766  6768
      [root@xxx ~]# jobs
      [1]-  Running                 nc -l localhost 1234 &
      [2]+  Running                 nc -l rhel56-18.ent.cloudera.com 1234 &
      [root@xxx ~]# sudo lsof -P | grep -i LISTEN | grep -i 1234
      nc         6766      root    3u     IPv4            2563626                 TCP localhost:1234 (LISTEN)
      nc         6768      root    3u     IPv4            2563671                 TCP xxx:1234 (LISTEN)
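
The two options above could be sketched roughly as follows. This is only an illustration: the class FenceCheckCommands and the helpers ncCheck and fuserCheck are hypothetical names, not taken from the Hadoop source or the attached patch. It assumes the fencing code knows the address of the service being fenced and builds the remote command from it instead of hardcoding "localhost 8020".

```java
import java.net.InetSocketAddress;

public class FenceCheckCommands {

    // Hypothetical helper: probe the host and port the service is configured
    // to listen on, rather than a hardcoded "localhost 8020".
    public static String ncCheck(InetSocketAddress target) {
        return "nc -z " + target.getHostString() + " " + target.getPort();
    }

    // Hypothetical helper: fuser lists the processes holding the TCP port,
    // independent of which local address the socket is bound to.
    public static String fuserCheck(int port) {
        return "fuser " + port + "/tcp";
    }

    public static void main(String[] args) {
        // "xxx" stands in for the namenode hostname, as in the transcripts above.
        InetSocketAddress nn = InetSocketAddress.createUnresolved("xxx", 8020);
        System.out.println(ncCheck(nn));
        System.out.println(fuserCheck(nn.getPort()));
    }
}
```

Either string would then be passed to execCommand(session, ...) in place of the hardcoded command. The fuser form does not depend on the bind address at all, which is why it may be the more robust of the two.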
      
      1. hdfs-3081.txt
        7 kB
        Todd Lipcon

          People

          • Assignee:
            Todd Lipcon
            Reporter:
            Philip Zeyliger
