ZooKeeper
  1. ZooKeeper
  2. ZOOKEEPER-1998

C library calls getaddrinfo unconditionally from zookeeper_interest

    Details

    • Type: Bug Bug
    • Status: Open
    • Priority: Critical Critical
    • Resolution: Unresolved
    • Affects Version/s: 3.5.0
    • Fix Version/s: 3.5.1
    • Component/s: c client
    • Labels:
      None

      Description

      (commented this on ZOOKEEPER-338)

      I've just noticed that we call getaddrinfo from zookeeper_interest... on every call. So from zookeeper_interest we always call update_addrs:

      https://github.com/apache/zookeeper/blob/trunk/src/c/src/zookeeper.c#L2082

      which in turns unconditionally calls resolve_hosts:

      https://github.com/apache/zookeeper/blob/trunk/src/c/src/zookeeper.c#L787

      which does the unconditional calls to getaddrinfo:

      https://github.com/apache/zookeeper/blob/trunk/src/c/src/zookeeper.c#L648

      We should fix this since it'll make 3.5.0 slower for people relying on DNS. I think this is happened as part of ZOOKEEPER-107 in which the list of servers can be updated.

      cc: Alexander Shraer, Patrick Hunt, Flavio Junqueira

        Activity

        Hide
        Hongchao Deng added a comment -

        hey Raul Gutierrez Segales.
        One suggestion – when pointing Github link don't use "trunk" use specific commit hash in the URL. This would make it point to where you meant when trunk commits change.

        Show
        Hongchao Deng added a comment - hey Raul Gutierrez Segales . One suggestion – when pointing Github link don't use "trunk" use specific commit hash in the URL. This would make it point to where you meant when trunk commits change.
        Hide
        Marshall McMullen added a comment -

        Raul Gutierrez Segales - Looking at the 3.4 code I agree with you. It seems like we should only do the lookup when we are connecting.

        Show
        Marshall McMullen added a comment - Raul Gutierrez Segales - Looking at the 3.4 code I agree with you. It seems like we should only do the lookup when we are connecting.
        Hide
        Raul Gutierrez Segales added a comment -

        I actually didn't look at 3.4! Since the code looked new, I just guessed. But, it turns out the that getaddrs is only called from zookeeper_init in 3.4 (unless I am overlooking something):

        https://github.com/apache/zookeeper/blob/branch-3.4/src/c/src/zookeeper.c#L836

        Do you have thoughts on how we could avoid this? I suppose we could easily just check if the addrvec is the same and if it is bypass resolving the hosts.

        Well, but lookups should only happen upon reconnects right? When things are healthy (i.e.: connected), it doesn't matter if we have (even new) unresolved hosts. So I'd say we only call resolve_hosts() when a new connection is needed.. Thoughts?

        Show
        Raul Gutierrez Segales added a comment - I actually didn't look at 3.4! Since the code looked new, I just guessed. But, it turns out the that getaddrs is only called from zookeeper_init in 3.4 (unless I am overlooking something): https://github.com/apache/zookeeper/blob/branch-3.4/src/c/src/zookeeper.c#L836 Do you have thoughts on how we could avoid this? I suppose we could easily just check if the addrvec is the same and if it is bypass resolving the hosts. Well, but lookups should only happen upon reconnects right? When things are healthy (i.e.: connected), it doesn't matter if we have (even new) unresolved hosts. So I'd say we only call resolve_hosts() when a new connection is needed.. Thoughts?
        Hide
        Marshall McMullen added a comment -

        Raul Gutierrez Segales - yep, you're right. I added that code as part of ZOOKEEPER-107 working with Alexander Shraer. But if I recall correctly, the original code also unconditionally called resolve_hosts. Though I'd have to go look at the original code to confirm that. I'm guessing you've done that already and that it did not do that?

        Do you have thoughts on how we could avoid this? I suppose we could easily just check if the addrvec is the same and if it is bypass resolving the hosts.

        Show
        Marshall McMullen added a comment - Raul Gutierrez Segales - yep, you're right. I added that code as part of ZOOKEEPER-107 working with Alexander Shraer . But if I recall correctly, the original code also unconditionally called resolve_hosts. Though I'd have to go look at the original code to confirm that. I'm guessing you've done that already and that it did not do that? Do you have thoughts on how we could avoid this? I suppose we could easily just check if the addrvec is the same and if it is bypass resolving the hosts.

          People

          • Assignee:
            Raul Gutierrez Segales
            Reporter:
            Raul Gutierrez Segales
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:

              Development