  1. Hadoop Common
  2. HADOOP-19218

Avoid DNS lookup while creating IPC Connection object

Details

    • Reviewed

    Description

      We have been running HADOOP-18628 in production for quite some time, and everything works fine as long as the DNS servers in the HA setup are available. Upgrading a single NS server at a time is also a common case and is not problematic. A DNS lookup generally takes about 1ms.

      However, we recently encountered a case where 2 out of 4 NS servers went down (temporarily, and this is rare). With a short-lived DNS cache and a 2s NS fallback timeout configured in resolv.conf, any client performing a DNS lookup can then encounter a 4s+ delay. This caused a namenode outage: the listener is single-threaded, and it could not keep up with the large number of unique clients (in direct proportion to the number of DNS resolutions every few seconds) initiating connections on the listener port.

      While having 2 out of 4 DNS servers offline is a rare case and the NS fallback settings could also be improved, it is important to note that we don't need to perform DNS resolution for every new connection if the intention is to improve insight into VersionMismatch errors thrown by the server.

      The proposal is to delay the DNS resolution until the server throws the error for an incompatible header or version mismatch. This would also save the ~1ms of extra time spent even on a healthy DNS lookup.
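
      The proposal can be illustrated with a minimal Java sketch. It is not the actual Hadoop patch: `LazyHostConnection`, its `resolver` parameter, and `versionMismatchMessage` are hypothetical names used only to show the idea. The connection object keeps just the peer's `InetAddress`, and the reverse lookup is deferred until the error path asks for the host name.

```java
import java.net.InetAddress;
import java.util.function.Function;

/**
 * Hypothetical stand-in for an IPC connection object.
 * This is an illustration, not Hadoop's real Server.Connection.
 */
class LazyHostConnection {
    private final InetAddress peer;
    // Injected lookup so the slow step is explicit (and replaceable in tests).
    private final Function<InetAddress, String> resolver;
    // Cached result of the lookup; null until first requested.
    // Not synchronized: the sketch assumes a single thread uses this object.
    private String hostName;

    LazyHostConnection(InetAddress peer, Function<InetAddress, String> resolver) {
        // The constructor does no DNS work, so accepting a connection stays cheap.
        this.peer = peer;
        this.resolver = resolver;
    }

    /** Returns the textual IP address; this does not involve DNS. */
    String getHostAddress() {
        return peer.getHostAddress();
    }

    /** Runs the (possibly slow) lookup on the first call only. */
    String getHostName() {
        if (hostName == null) {
            hostName = resolver.apply(peer);
        }
        return hostName;
    }

    /** Error path: the only caller in this sketch that needs the host name. */
    String versionMismatchMessage(int clientVersion, int serverVersion) {
        return "Version mismatch from " + getHostName()
            + " (" + getHostAddress() + "): client=" + clientVersion
            + ", server=" + serverVersion;
    }
}
```

      In a real server the resolver could be `InetAddress::getCanonicalHostName`. With this shape, healthy connections never pay for a lookup, and a slow or failing name server only affects clients that actually hit the error path.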

          People

            vjasani Viraj Jasani