  1. Hadoop HDFS
  2. HDFS-12910

Secure Datanode Starter should log the port when it fails to bind


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.1.0, 3.0.1
    • Component/s: datanode
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      A secure DataNode listens on ports 1004 and 1006 by default. Other OS services sometimes claim these ports first, which prevents the DataNode from starting (e.g. the NFS service can use random ports below 1024).
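One way to check for such a conflict before starting the DataNode is to try binding the ports directly. The following is a minimal sketch; the class `PortProbe` and the method `canBind` are hypothetical names chosen for illustration and are not part of Hadoop. Note that it reports `false` for any bind failure, including the permission error a non-root user gets for ports below 1024.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical diagnostic helper; names are illustrative, not Hadoop APIs.
class PortProbe {

    /** Returns true if a listening socket can be bound to the port on all interfaces. */
    static boolean canBind(int port) {
        try (ServerSocket probe = new ServerSocket(port)) {
            // The bind succeeded; the socket is closed again on leaving the block.
            return true;
        } catch (IOException e) {
            // Covers "Address already in use" as well as "Permission denied"
            // when an unprivileged user probes a port below 1024.
            return false;
        }
    }

    public static void main(String[] args) {
        // The default secure DataNode ports named in the description.
        for (int port : new int[] {1004, 1006}) {
            System.out.println(port + " bindable: " + canBind(port));
        }
    }
}
```

Run it as the same user that starts the DataNode (typically root for privileged ports), otherwise a `false` result may only reflect missing privileges.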

      When a port is already taken, jsvc logs an error, but the message is confusing because it does not say which port failed to bind. For example, when port 1004 is used by another process:

      Initializing secure datanode resources
      java.net.BindException: Address already in use
              at sun.nio.ch.Net.bind0(Native Method)
              at sun.nio.ch.Net.bind(Net.java:433)
              at sun.nio.ch.Net.bind(Net.java:425)
              at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
              at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
              at org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.getSecureResources(SecureDataNodeStarter.java:105)
              at org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.init(SecureDataNodeStarter.java:71)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:207)
      Cannot load daemon
      Service exit with a return value of 3
      

      And when port 1006 is used:

      Opened streaming server at /0.0.0.0:1004
      java.net.BindException: Address already in use
              at sun.nio.ch.Net.bind0(Native Method)
              at sun.nio.ch.Net.bind(Net.java:433)
              at sun.nio.ch.Net.bind(Net.java:425)
              at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
              at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
              at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67)
              at org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.getSecureResources(SecureDataNodeStarter.java:129)
              at org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.init(SecureDataNodeStarter.java:71)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:207)
      Cannot load daemon
      Service exit with a return value of 3
      

      We should catch the BindException, log the offending address:port, and then re-throw the exception so that the cause of the failure is clear.
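The proposed approach could look roughly like the sketch below. It is not the code from the attached patches: the class `BindWithContext` and its `bind` helper are hypothetical, and a plain `ServerSocket` stands in for the sockets that `SecureDataNodeStarter` opens. The idea is to catch the `BindException`, build a new one whose message names the address, and keep the original as the cause.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical helper for illustration; not the code from the HDFS-12910 patches.
class BindWithContext {

    /** Binds the socket; on failure rethrows a BindException that names the address. */
    static void bind(ServerSocket socket, InetSocketAddress addr, int backlog)
            throws IOException {
        try {
            socket.bind(addr, backlog);
        } catch (BindException e) {
            // The original message (as in the traces above) omits the address,
            // so wrap it in a new exception that includes address:port.
            BindException wrapped = new BindException(
                    "Problem binding to " + addr + ": " + e.getMessage());
            wrapped.initCause(e); // keep the original stack trace as the cause
            throw wrapped;
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket first = new ServerSocket(0);    // occupy any free port
             ServerSocket second = new ServerSocket()) {  // created unbound
            InetSocketAddress taken = new InetSocketAddress(first.getLocalPort());
            bind(second, taken, 50); // expected to fail: the port is in use
        } catch (BindException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

Re-throwing the same exception type means callers such as jsvc still see a `BindException`, while the message now identifies which address and port caused the failure.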

      I will upload a patch for this.

      Attachments

        1. HDFS-12910.001.patch
          2 kB
          Stephen O'Donnell
        2. HDFS-12910.002.patch
          8 kB
          Nandakumar
        3. HDFS-12910.003.patch
          5 kB
          Stephen O'Donnell
        4. HDFS-12910.004.patch
          5 kB
          Stephen O'Donnell
        5. HDFS-12910.005.patch
          6 kB
          Stephen O'Donnell
        6. HDFS-12910.006.patch
          6 kB
          Stephen O'Donnell


          People

            Assignee: Stephen O'Donnell (sodonnell)
            Reporter: Stephen O'Donnell (sodonnell)
            Votes: 0
            Watchers: 5
