ZooKeeper / ZOOKEEPER-528

c client exists() call with watch on large number of nodes (>100k) causes connection loss

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Invalid
    • Affects Version/s: 3.2.0
    • Fix Version/s: 3.3.0
    • Component/s: c client
    • Labels: None
    • Release Note:
      Workaround: the test environment in this case had a max heap of 64m. Increasing the max heap via -Xmx addressed the performance issue, and the test ran fine.

      Description

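      If I create 100k nodes under /misc and then run:

      /* List the children of /misc, then check each one from two clients. */
      CPPUNIT_ASSERT_EQUAL(0, zoo_get_children(zh2, "/misc", 0, &children));
      for (int i = 0; i < children.count; i++) {
          sprintf(path, "/misc/%s", children.data[i]);
          /* zh2: exists() with the default watcher set (watch = 1) */
          CPPUNIT_ASSERT_EQUAL(0, zoo_exists(zh2, path, 1, &stat));
          /* zh3: exists() with an explicit watcher function and context */
          CPPUNIT_ASSERT_EQUAL(0, zoo_wexists(zh3, path, watcher, &ctx3, &stat));
      }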

      then around 47k iterations into the loop the client fails with -4 (ZCONNECTIONLOSS, connection loss). The client timeout is 30 seconds. The server command port shows the following, so it looks like the problem is not the server but some issue with watcher registration in the C client?

      phunt@valhalla:~$ echo stat | nc localhost 22181
      Zookeeper version: 3.3.0--1, built on 07/22/2009 23:55 GMT
      Clients:
      /127.0.0.1:45729[1](queued=0,recved=100024,sent=0)
      /127.0.0.1:50229[1](queued=0,recved=0,sent=0)
      /127.0.0.1:45731[1](queued=0,recved=47116,sent=0)
      /127.0.0.1:45730[1](queued=0,recved=47117,sent=1)

      Latency min/avg/max: 0/196/1026
      Received: 194257
      Sent: 1
      Outstanding: 0
      Zxid: 0x186a4
      Mode: standalone
      Node count: 100005

      The client on port 45729 is a separate client: the one that created the nodes originally.

      The clients on ports 45731 and 45730 are zh2/zh3 in the code.
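
The per-client counters in the stat output above can be cross-checked against the server totals: the four recved values add up to the reported Received: 194257, and the two test clients stopped at 47116 and 47117 requests, which matches the point at which the loop failed. The following is a minimal sketch in C of that check. It assumes the client-line layout shown above; the struct and function names are hypothetical and are not part of the ZooKeeper C client.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One entry from the "Clients:" list (field names are this sketch's own). */
struct stat_client {
    int port;
    long queued;
    long recved;
    long sent;
};

/* Parse a line such as "/127.0.0.1:45731[1](queued=0,recved=47116,sent=0)".
 * Returns 0 on success, -1 if the line does not match that layout. */
static int parse_stat_client(const char *line, struct stat_client *out)
{
    assert(line != NULL && out != NULL);

    /* The last ':' on the line precedes the client port. */
    const char *colon = strrchr(line, ':');
    if (colon == NULL)
        return -1;

    int bracketed = 0; /* the value in [...] is read but not used here */
    int n = sscanf(colon, ":%d[%d](queued=%ld,recved=%ld,sent=%ld)",
                   &out->port, &bracketed,
                   &out->queued, &out->recved, &out->sent);
    return n == 5 ? 0 : -1;
}

/* Sum the recved counters over several client lines.
 * Returns -1 if any line fails to parse. */
static long total_recved(const char *const *lines, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        struct stat_client c;
        if (parse_stat_client(lines[i], &c) != 0)
            return -1;
        total += c.recved;
    }
    return total;
}
```

Feeding the four client lines from the output above to total_recved gives 100024 + 0 + 47116 + 47117 = 194257, the same value as the Received line, so every request the server counted is attributed to one of the listed connections.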

        Activity

        Patrick Hunt added a comment -

        This was caused by GC pressure due to ZOOKEEPER-536.

        Closing; this will be fixed in ZOOKEEPER-536.


          People

          • Assignee: Patrick Hunt
          • Reporter: Patrick Hunt
          • Votes: 0
          • Watchers: 0
