Uploaded image for project: 'ZooKeeper'
  1. ZooKeeper
  2. ZOOKEEPER-2466

Client skips servers when trying to connect

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Patch Available
    • Major
    • Resolution: Unresolved
    • None
    • None
    • c client
    • None

    Description

      I've been looking at Zookeeper_simpleSystem::testFirstServerDown and I observed the following behavior. The list of servers to connect contains two servers, let's call them S1 and S2. The client never connects, but the odd bit is the sequence of servers that the client tries to connect to:

      S1
      S2
      S1
      S1
      S1
      <keeps repeating S1>
      

      It intrigued me that S2 is only tried once and never again. Checking the code, here is what happens. Initially, zh->reconfig is 1, so in zoo_cycle_next_server we return an address from get_next_server_in_reconfig, which is taken from zh->addrs_new in this test case. The attempt to connect fails, and handle_error is invoked in the error handling path. handle_error actually invokes addrvec_next which changes the address pointer to the next server on the list.

      After two attempts, it decides that it has tried all servers in zoo_cycle_next_server and sets zh->reconfig to zero. Once zh->reconfig == 0, we have that each call to zoo_cycle_next_server moves the address pointer to the next server in zh->addrs. But, given that handle_error also moves the pointer to the next server, we end up moving the pointer ahead twice upon every failed attempt to connect, which is wrong.

      Attachments

        1. ZOOKEEPER-2466.patch
          4 kB
          Michael Han
        2. ZOOKEEPER-2466.patch
          5 kB
          Michael Han

        Issue Links

          Activity

            People

              hanm Michael Han
              fpj Flavio Paiva Junqueira
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated: