HADOOP-12488 (Hadoop Common; sub-task of HADOOP-11985, Improve Solaris support in Hadoop)

DomainSocket: Solaris does not support timeouts on AF_UNIX sockets

Details

    • Type: Sub-task
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.7.1
    • Fix Version/s: None
    • Component/s: net
    • Labels: None
    • Environment: Solaris

    Description

      From the hadoop-common-dev mailing list:

      http://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201509.mbox/%3C560B99F6.6010408@oracle.com%3E
      http://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201510.mbox/%3C560EA6BF.2070001@oracle.com%3E

      Now that the Hadoop native code builds on Solaris I've been chipping
      away at all the test failures. About 50% of the failures involve
      DomainSocket, either directly or indirectly. That seems to be mainly
      because the tests use DomainSocket to do single-node testing, whereas in
      production it seems that DomainSocket is less commonly used
      (https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/ShortCircuitLocalReads.html).

      The particular problem on Solaris is that socket read/write timeouts
      (the SO_SNDTIMEO and SO_RCVTIMEO socket options) are not supported for
      UNIX domain (PF_UNIX) sockets. Those options are however supported for
      PF_INET sockets. That's because the socket implementation on Solaris is
      split roughly into two parts, for inet sockets and for STREAMS sockets,
      and the STREAMS implementation lacks support for SO_SNDTIMEO and
      SO_RCVTIMEO. As an aside, performance of sockets that use loopback or
      the host's own IP is slightly better than that of UNIX domain sockets on
      Solaris.

      I'm investigating getting timeouts supported for PF_UNIX sockets added
      to Solaris, but in the meantime I'm also looking at how this might be
      worked around in Hadoop. One way would be to implement timeouts by
      wrapping all the read/write/send/recv etc calls in DomainSocket.c with
      either poll() or select().
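
      As a rough illustration of the poll() variant, here is a minimal
      sketch of a read with a timeout. The name read_with_timeout and its
      signature are hypothetical and not taken from DomainSocket.c, and
      reporting an expired timeout as EAGAIN is an assumption chosen to
      mimic what a read reports when SO_RCVTIMEO expires.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from DomainSocket.c: a read() that gives up
 * after timeout_ms milliseconds. A timeout_ms of 0 or less means "block
 * forever", matching a socket with no receive timeout set.
 */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    assert(buf != NULL);
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Simplification: an interrupting signal (EINTR) restarts the full timeout. */
    do {
        rc = poll(&pfd, 1, timeout_ms > 0 ? timeout_ms : -1);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0) {
        return -1;          /* poll() itself failed; errno is already set */
    }
    if (rc == 0) {
        errno = EAGAIN;     /* assumed convention for "timed out" */
        return -1;
    }
    /* Data, end-of-file or an error is pending, so read() returns promptly. */
    return read(fd, buf, len);
}
```

      The same pattern would apply to recv() and accept(); the write side
      would poll for POLLOUT instead of POLLIN.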

      The basic idea is to add two new fields to DomainSocket.c to hold the
      read/write timeouts. On platforms that support SO_SNDTIMEO and
      SO_RCVTIMEO these would be unused as setsockopt() would be used to set
      the socket timeouts. On platforms such as Solaris the JNI code would use
      the values to implement the timeouts appropriately.

      To prevent the code in DomainSocket.c becoming a #ifdef hairball, the
      current socket IO function calls such as accept(), send(), read() etc
      would be replaced with macros such as HD_ACCEPT. On platforms that
      provide timeouts these would just expand to the normal socket functions;
      on platforms that don't support timeouts they would expand to wrappers
      that implement timeouts for them.
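
      A sketch of how such a macro could dispatch is shown below, using a
      write as the example. HD_WRITE, hd_write_timeout and
      HD_NO_SOCKET_TIMEOUTS are hypothetical names modelled on the HD_ACCEPT
      idea, and the extra timeout_ms macro argument is an assumption about
      how the wrapper would receive its timeout.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: write() that waits at most timeout_ms milliseconds
 * for the descriptor to become writable (0 or less waits forever).
 * Limitation: once poll() reports POLLOUT, a large write on a blocking
 * descriptor can still block if len exceeds the free buffer space.
 */
ssize_t hd_write_timeout(int fd, const void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    assert(buf != NULL);
    pfd.fd = fd;
    pfd.events = POLLOUT;
    pfd.revents = 0;

    /* Simplification: an interrupting signal (EINTR) restarts the full timeout. */
    do {
        rc = poll(&pfd, 1, timeout_ms > 0 ? timeout_ms : -1);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0) {
        return -1;          /* poll() itself failed; errno is already set */
    }
    if (rc == 0) {
        errno = EAGAIN;     /* assumed convention for "timed out" */
        return -1;
    }
    return write(fd, buf, len);
}

/* One definition per platform keeps the call sites free of #ifdef blocks. */
#ifdef HD_NO_SOCKET_TIMEOUTS
#define HD_WRITE(fd, buf, len, timeout_ms) \
    hd_write_timeout((fd), (buf), (len), (timeout_ms))
#else
/* The kernel enforces SO_SNDTIMEO here, so the timeout argument is unused. */
#define HD_WRITE(fd, buf, len, timeout_ms) write((fd), (buf), (len))
#endif
```

      Call sites would then use HD_WRITE(fd, buf, len, timeout_ms)
      everywhere, and only the macro definition would differ per platform.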

      The only caveat is that all code that does anything to a PF_UNIX
      socket would always have to do so via DomainSocket. As far as I can
      tell that's not an issue, but it would have to be borne in mind if any
      changes were made in this area.

      Before I set about doing this, does the approach seem reasonable?

      Unfortunately it's not as simple as I'd hoped. For some reason I don't
      really understand, nearly all the JNI methods are declared as static and
      therefore don't get a "this" pointer and as a consequence all the class
      data members that are needed by the JNI code have to be passed in as
      parameters. That also means it's not possible to store the timeouts in
      the DomainSocket fields from within the JNI code. Most of the JNI
      methods should be instance methods rather than static ones, but making
      that change would require some significant surgery to DomainSocket.

      Attachments

        1. HADOOP-12488.001.patch
          33 kB
          Alan Burlison
        2. HADOOP-12488.002.patch
          34 kB
          Alan Burlison

            People

              Assignee: Alan Burlison (alanburlison)
              Reporter: Alan Burlison (alanburlison)