  1. Traffic Server
  2. TS-307

Possible performance problem: DNS lookup continuation is using first Network ethread for all operations

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.1.6
    • Component/s: DNS
    • Labels: None

    Description

      (from yahoo bug 989959)

      Original description
      by Vladimir Legalov 3 years ago at 2006-12-18 11:57

      All DNS lookup operations execute on the first Network thread. Since each Network thread is already responsible
      for NetAccept & NetHandler continuation processing, DNS processing can cause extra CPU usage and additional delays
      for this particular thread. It makes sense to move DNS processing into a fully independent thread (ethread) to
      avoid possible performance problems related to DNS lookups.
      Such a performance problem would be visible only in "no caching" mode with a very high rate of OS requests.
      Additional performance testing is required to clarify the visibility of this problem.
      (It looks like htop is not an appropriate tool to catch precise CPU usage per thread.)

      Comment 1
      by Leif Hedstrom 3 years ago at 2006-12-26 13:41:10

      I think it's highly unlikely that DNS will ever become a bottleneck. Even under extreme cases, like say 300 Origin
      Servers all with a TTL of 5 minutes (we rarely have anything shorter), we're looking at one DNS lookup per second
      (assuming there are no cache hits, as pointed out already).

      I'm closing this bug until we have some real evidence that DNS lookups are ever going to be any sort of bottleneck.
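
The estimate in Comment 1 is simple arithmetic: if every origin server's record expires once per TTL and nothing else triggers a lookup, the steady-state rate is the number of origin servers divided by the TTL. A minimal sketch follows; the function name `lookups_per_second` is hypothetical and not part of Traffic Server.

```python
def lookups_per_second(origin_servers, ttl_seconds):
    """Steady-state DNS lookup rate, assuming one lookup per origin server per TTL.

    This is an illustrative model only; it ignores retries and failed lookups.
    """
    if ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be positive")
    return origin_servers / ttl_seconds


# Figures from Comment 1: 300 origin servers, TTL of 5 minutes.
rate = lookups_per_second(300, 5 * 60)
print(rate)  # 1.0
```

Under these assumptions the lookup rate scales linearly with the number of origin servers and inversely with the TTL.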

      Comment 2
      by Vladimir Legalov 3 years ago at 2006-12-26 20:31:17

      I don't understand why we should not keep this RFE open. I would prefer to keep the DNS lookup code in a separate
      thread, not because of a huge performance impact but because the DNS lookup continuation is activated every 11
      milliseconds (just to verify the status of the 32 UDP sockets) even if we don't need to perform a DNS lookup. One
      more thing: this continuation is impacting eThread scheduling for the first NetHandler continuation.
      I am 100% sure that all NetHandler continuations must be symmetrical/equal and have similar scheduling. I would prefer
      to reopen this RFE.
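
To put the figures from Comment 2 in perspective, the idle cost can be expressed as wakeups per second. The sketch below assumes that every activation checks all 32 sockets, which the comment implies but does not state outright; `idle_poll_rates` is a hypothetical helper, not Traffic Server code.

```python
def idle_poll_rates(period_ms, sockets):
    """Return (wakeups per second, socket checks per second) for a periodic poll.

    Assumes each wakeup inspects every socket once.
    """
    if period_ms <= 0:
        raise ValueError("period_ms must be positive")
    wakeups = 1000.0 / period_ms
    return wakeups, wakeups * sockets


# Figures from Comment 2: 11 ms period, 32 UDP sockets.
wakeups, checks = idle_poll_rates(11, 32)
print(f"{wakeups:.1f} wakeups/s, {checks:.0f} socket checks/s")
# 90.9 wakeups/s, 2909 socket checks/s
```

These numbers say nothing about the CPU time each wakeup costs; that still has to be measured.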

      Comment 3
      by Ryan Troll 3 years ago at 2006-12-27 06:47:47

      Reopened, with very low priority.

      I'd recommend waiting until the bigger items are done before tackling this. Yes, we may be spending time in DNS in
      this thread when we don't need to; and maybe a single DNS thread is the right answer. Or maybe modifying the DNS code
      to not bother with DNS continuations unless there are outstanding DNS requests makes more sense.

      However, I'd wait on this until we have time to go back and tune it. It may squeeze a little more performance out of
      the stack, but I suspect there are bigger wins to be gained through enhancements that are being actively requested by
      properties; or through enhancements we've already identified.

      It makes sense to keep this open so we don't forget about it. Hopefully we'll get to it later this year.
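
The alternative raised in Comment 3, not running the DNS continuation unless requests are outstanding, can be illustrated with a toy model on a virtual clock. This is only a sketch under stated assumptions: `count_wakeups` and its parameters are hypothetical, the poll stays aligned to a fixed grid, and none of this reflects the actual Traffic Server scheduler.

```python
def count_wakeups(duration_ms, period_ms, outstanding, only_when_pending):
    """Count DNS continuation wakeups on a virtual millisecond clock.

    outstanding: list of (start_ms, end_ms) intervals during which at least
    one DNS request is in flight.
    only_when_pending: if True, skip wakeups while nothing is outstanding.
    """
    wakeups = 0
    for now in range(0, duration_ms, period_ms):
        pending = any(start <= now < end for start, end in outstanding)
        if only_when_pending and not pending:
            continue  # continuation is not scheduled, so no wakeup occurs
        wakeups += 1
    return wakeups


# One request outstanding for the first 110 ms of a 1100 ms window.
requests = [(0, 110)]
always_on = count_wakeups(1100, 11, requests, only_when_pending=False)
on_demand = count_wakeups(1100, 11, requests, only_when_pending=True)
print(always_on, on_demand)  # 100 10
```

In this model the on-demand variant wakes up only while a lookup is in flight, so its cost follows DNS activity rather than wall-clock time.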

      Comment 4
      by Leif Hedstrom 3 years ago at 2006-12-27 07:42:47

      The reason I closed this bug was that the bug report indicated that this would be a problem under heavy load, with no
      caching. I don't believe that to be the case. In the best case DNS lookups will be of O(1) complexity, and in the
      worst case it'd be O(n), where n is the number of origin servers. In either of those cases, performing the actual DNS
      lookups will be negligible as far as CPU consumption is concerned.

      However, with the comment from Vlad, it seems the concern is about wasting time on the DNS continuation, which I agree
      might be worth investigating. But I'd also like to see some benchmarks on how much this does affect us today. I'm not
      sure exactly how to test this. Vlad, is it possible to increase the timer for the DNS continuation to get scheduled,
      e.g. have it run every 1 second? Then we could easily benchmark what effect that has on performance.

      Comment 5
      by Vladimir Legalov 3 years ago at 2006-12-27 19:09:23

      The existence of this RFE does not mean that it will be put on our development table immediately. It is a reminder
      only.
      As I already mentioned in the initial comments for this RFE: "Additional performance testing is required to
      clarify visibility of this problem."
      We have plenty of similar RFEs, by priority and severity, which are not in active development. I was sure that P4 is
      clear evidence of such 'dormant' status.

          People

            Assignee: Leif Hedstrom (zwoop)
            Reporter: Miles Libbey (mlibbey)