ZOOKEEPER-1177

Enabling a large number of watches for a large number of clients

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.3.3
    • Fix Version/s: 3.5.1
    • Component/s: server
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Changes to the watch manager to support very large numbers of watches (200 million). This change also improves the synchronization in the WatchManager to reduce contention on various watch manager operations (mainly addWatch(), which is a fairly common operation after a watch is triggered).

      Description

      In my ZooKeeper deployment, I see the watch manager consuming several GB of memory, so I dug a bit deeper.

      In the scenario I am testing, I have 10K clients connected to an observer. There are about 20K znodes in ZooKeeper, each about 1 KB, so about 20 MB of data in total.
      Each client fetches and puts watches on all the znodes. That is 200 million watches.

      It seems a single watch takes about 100 bytes. I am currently at 14,528,037 watches and, according to the YourKit profiler, WatchManager is already using 1.2 GB. This is not going to work, as it might end up needing 20 GB of RAM just for the watches.

      So we need a more compact way of storing watches. Here are the possible solutions.
      1. Use a bitmap instead of the current hashmap. In this approach, each znode gets a unique id when it is created. For every session, we keep a bitmap indicating the set of znodes that session is watching (see the sketch after this list). A bitmap, assuming 100K znodes, would be about 12 KB; for 10K sessions, we could track the watches in roughly 120 MB instead of 20 GB.
      2. This second idea is based on the observation that clients watch znodes in sets (for example, all znodes under a folder). Multiple clients watch the same set, and the total number of sets is a couple of orders of magnitude smaller than the total number of znodes. In my scenario, there are about 100 sets. So instead of keeping track of watches at the znode level, keep track of them at the set level. This may mean that gets also need to be implemented at the set level. With this, we could store the watches in about 100 MB.
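
      For illustration, a minimal Java sketch of how solution 1 could look, assuming znodes are assigned dense integer ids; the class and field names are hypothetical, not ZooKeeper's actual WatchManager API.

      import java.util.BitSet;
      import java.util.HashMap;
      import java.util.Map;

      // Sketch of solution 1: one BitSet of znode ids per session,
      // instead of per-watch map entries (~1 bit vs ~100 bytes per watch).
      class SessionBitmapSketch {
          private final Map<String, Integer> pathToId = new HashMap<String, Integer>();
          private final Map<Long, BitSet> watchesBySession = new HashMap<Long, BitSet>();
          private int nextId = 0;

          synchronized void addWatch(long sessionId, String path) {
              Integer id = pathToId.get(path);
              if (id == null) {
                  id = nextId++;                 // unique id assigned when the znode is first seen
                  pathToId.put(path, id);
              }
              BitSet bits = watchesBySession.get(sessionId);
              if (bits == null) {
                  bits = new BitSet();
                  watchesBySession.put(sessionId, bits);
              }
              bits.set(id);
          }

          synchronized boolean isWatching(long sessionId, String path) {
              BitSet bits = watchesBySession.get(sessionId);
              Integer id = pathToId.get(path);
              return bits != null && id != null && bits.get(id);
          }
      }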

      Are there any other suggestions of solutions?

      Thanks

      1. ZOOKEEPER-1177.patch
        29 kB
        Patrick Hunt
      2. ZOOKEEPER-1177.patch
        20 kB
        Patrick Hunt
      3. ZooKeeper-with-fix-for-findbugs-warning.patch
        20 kB
        Vikas Mehta
      4. Zookeeper-after-resolving-merge-conflicts.patch
        20 kB
        Vikas Mehta
      5. ZooKeeper.patch
        20 kB
        Vikas Mehta

        Activity

        Patrick Hunt made changes -
        Fix Version/s 3.5.1 [ 12326786 ]
        Fix Version/s 3.5.0 [ 12316644 ]
        Thawan Kooburat made changes -
        Assignee Vishal Kathuria [ vishal.k ] Thawan Kooburat [ thawan ]
        Patrick Hunt made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Patrick Hunt added a comment -

        Cancelling the patch while the performance issues are being addressed.

        Patrick Hunt added a comment -

        Zhihong - I searched around a bit but didn't find that, I'll take a look (it's ASL licensed), thanks!

        Ted Yu added a comment -

        Patrick mentioned the lack of a SparseBitSet.
        I found this: http://blog.rapleaf.com/dev/2010/12/17/memory-efficient-sparse-bitsets/ which leads to https://github.com/lemire/javaewah

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12509448/ZOOKEEPER-1177.patch
        against trunk revision 1227000.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 4 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/876//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/876//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/876//console

        This message is automatically generated.

        Patrick Hunt made changes -
        Attachment ZOOKEEPER-1177.patch [ 12509448 ]
        Patrick Hunt added a comment -

        Vikas, here's the updated patch with my perf tests, the easiest way I've found to run is:

        ant -Dtest.junit.maxmem=2g -Dtest.output=yes -Dtestcase=WatchManagerPerf clean test-core-java

        You mentioned earlier seeing contention issues with multiple reader/writer threads. I think you should augment my *Perf test with some additional multi-threaded tests so that we can all see the issues/results. Thanks!

        Patrick Hunt added a comment -

        Vikas - can you incorporate Ben's idea as well - leave the old impl around, add yours as a new impl, with a new interface supported by both. That will simplify testing & perf verification. A system property that allows you to switch the impl would be great and could be used by the tests. I'll upload my perf test today.

        Vikas Mehta added a comment -

        That was a big miss on my part for the non-watched case. Thanks for the performance analysis and for catching this issue.

        I did attempt to keep the two-way hashmap in one of the implementations, and at the time the locking was still very coarse in WatchManager (especially having to synchronize on the path when adding a watch, which added a lot of contention since all the watchers set their watches on the path again after the watch was triggered).

        With the current locking approach, we can try the two-way hash map again, taking the lock on the pathID only when updating watches on the path (the pathID-to-watches map entry). I will try to have the new patch with two-way locks ready in the next few days.

        Patrick - can you send me your perf tests so that I can compare results?

        Thanks,
        Vikas

        Vishal Kathuria added a comment -

        Thanks for doing such a detailed performance analysis on this, Patrick. There are no other changes to this code that we haven't yet contributed.

        Re @Daniel's suggestion - as far as I recall, Vikas tried that, but we had to abandon it because of locking contention in our version of ZK that has multi-threaded reads. Vikas: if you have the details from that trial, could you please share them?

        Patrick Hunt added a comment -

        @daniel – I was thinking the same thing over the weekend. We'd probably want to refactor a bit while we're doing it.

        @ben sounds reasonable. Especially if we want to enable performance comparison. We should also look at the locking overhead that you highlighted earlier.

        I was also planning to contribute my perf tests. I'll work on this further unless someone else wants to?

        vishal - have you made any other changes to this code not yet contributed, or issues found?

        Benjamin Reed added a comment -

        perhaps we should make WatchManager an interface so that we can select alternate implementations at runtime. that way we could also put more experimental implementations in without risking default installations.
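
        For illustration only, a sketch of what such a pluggable WatchManager could look like, with the implementation chosen via a system property; the interface methods, property name, and class names are hypothetical, not the actual patch.

        import org.apache.zookeeper.Watcher;

        // Hypothetical sketch: a watch-manager interface plus a factory that picks
        // the implementation class from a system property, so tests and perf runs
        // can switch implementations without code changes.
        interface IWatchManager {
            void addWatch(String path, Watcher watcher);
            void removeWatcher(Watcher watcher);
            int size();
        }

        final class WatchManagerFactory {
            private WatchManagerFactory() {}

            static IWatchManager create() throws Exception {
                // e.g. -Dzookeeper.watchManagerImpl=com.example.BitSetWatchManager (hypothetical)
                String impl = System.getProperty("zookeeper.watchManagerImpl",
                        "com.example.DefaultWatchManager");      // hypothetical default
                return (IWatchManager) Class.forName(impl).newInstance();
            }
        }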

        Daniel Gómez Ferro added a comment -

        I think we could use the same trick to keep a reverse index, that is, mapping each Watcher to a unique integer id and having a BitSet per watched path. We would double the memory usage, but I think it's worth it.
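
        A rough sketch of the two-way index being proposed, assuming watchers also get dense integer ids; all names here are illustrative:

        import java.util.BitSet;
        import java.util.HashMap;
        import java.util.Map;

        // Illustrative sketch: forward index (watcher id -> watched path ids) plus
        // the reverse index (path id -> watcher ids), kept in sync on every add.
        class TwoWayBitSetIndex {
            private final Map<Integer, BitSet> pathsByWatcher = new HashMap<Integer, BitSet>();
            private final Map<Integer, BitSet> watchersByPath = new HashMap<Integer, BitSet>();

            synchronized void addWatch(int watcherId, int pathId) {
                bits(pathsByWatcher, watcherId).set(pathId);
                bits(watchersByPath, pathId).set(watcherId);   // the extra, reverse entry
            }

            // Triggering a path reads one BitSet instead of scanning every watcher.
            synchronized BitSet watchersOf(int pathId) {
                return (BitSet) bits(watchersByPath, pathId).clone();
            }

            private static BitSet bits(Map<Integer, BitSet> m, int key) {
                BitSet b = m.get(key);
                if (b == null) {
                    b = new BitSet();
                    m.put(key, b);
                }
                return b;
            }
        }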

        Patrick Hunt added a comment -

        Good point. The performance is terrible but not entirely for the reason you mentioned (although that's contributing).

        With 10k watchers and 10k un-watched paths, all of the paths being triggered, the NEW case runs in ~6 sec, while the OLD case runs in ~50 ms.

        The problem is the loop itself. I tried re-running the test with all of the locking turned off (including making the table a HashMap rather than a ConcurrentHashMap) and, while it does run faster, it still takes ~3 sec. I even turned off the entire "if (watchedPaths.get(pathId))" conditional/block and it runs about the same (~3 sec).

        Doesn't seem like we should commit this with such poor performance for the common case. Perhaps we need a hybrid approach? I don't see how that could easily be done though.

        Benjamin Reed added a comment -

        i'm curious about the effect this has on the overall performance of non-watched operations. if you have 10K watchers, then you are going to acquire and release 10K locks in removeWatches for every write operation, correct?

        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12508907/ZOOKEEPER-1177.patch
        against trunk revision 1225352.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 2 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/870//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/870//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/870//console

        This message is automatically generated.

        Patrick Hunt made changes -
        Attachment ZOOKEEPER-1177.patch [ 12508907 ]
        Patrick Hunt added a comment -

        Added a couple minor tweaks to the patch (final fields and such).

        Patrick Hunt added a comment -

        In reviewing this code I noticed a couple concerns:

        1) it's unfortunate there is no SparseBitSet; in the worst-case scenario we might end up wasting memory due to the id assignment. Say you had 1k sessions and 1m znodes. If the first session set watches on all 1m znodes and the other 999 sessions set a watch on znode 1m (the one with the highest id), those 999 sessions would have empty bitsets, save for the very last bit (#1m). Not sure if this is a real concern in practice though.

        2) more concerning is removeWatcher - it removes the bitset, but not the associated entries in path2Id/id2Path. Granted, cleanup is an expensive operation in this case; however, without it we'll end up "leaking" path2Id/id2Path mappings in some cases.

        1 doesn't seem like a blocker given the benefits, and perhaps 2 is not a concern either, given that in most cases the set of paths is bounded and, if the znode is deleted, the watch manager will clean up at that time.

        Thoughts?
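
        One possible shape of the cleanup concern 2 refers to, sketched with hypothetical names (path2Id/id2Path mirror the wording above); the scan over all remaining bitsets is exactly why such cleanup is expensive:

        import java.util.BitSet;
        import java.util.Collection;
        import java.util.Iterator;
        import java.util.Map;

        // Hypothetical sketch for concern 2: after a watcher is removed, drop
        // path2Id/id2Path entries that no remaining watcher's bitset references.
        // Cost is O(paths x watchers), which is why it may be acceptable to skip.
        class PathIdCleanupSketch {
            static void cleanup(Map<String, Integer> path2Id,
                                Map<Integer, String> id2Path,
                                Collection<BitSet> remainingWatcherBitSets) {
                Iterator<Map.Entry<Integer, String>> it = id2Path.entrySet().iterator();
                while (it.hasNext()) {
                    Map.Entry<Integer, String> entry = it.next();
                    boolean stillWatched = false;
                    for (BitSet bits : remainingWatcherBitSets) {
                        if (bits.get(entry.getKey())) {
                            stillWatched = true;
                            break;
                        }
                    }
                    if (!stillWatched) {              // no watcher references this path id
                        path2Id.remove(entry.getValue());
                        it.remove();
                    }
                }
            }
        }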

        Patrick Hunt added a comment -

        I ran some ghetto performance numbers against this patch on trunk (NEW) vs without (OLD)

        I modified testSizeInBytes to create 10k watchers and 1k paths, each watcher is watching all the paths - 10m watches in total. (OLD failed with 10k/10k, even at 2g, while NEW ran fine with 512m)

        java version "1.6.0_26"
        Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
        Java HotSpot(TM) Server VM (build 20.1-b02, mixed mode)
        
        ant -Dtest.junit.maxmem=2g -Dtest.output=yes -Dtestcase=WatchManagerTest clean test-core-java
        
        add - add 10m watches
        size - run size on the manager
        dump - dump the watches to /dev/null (bypath and byid)
        trigger - trigger the 10m watches
        
        the numbers settled down to something like this after letting the VM warm up:
        
        NEW
            [junit] 1753ms to add
            [junit] size:10000000
            [junit] 1ms to size
            [junit] 3424ms to dumpwatches true
            [junit] 3066ms to dumpwatches false
            [junit] 2318ms to trigger
        OLD
            [junit] 9736ms to add
            [junit] size:10000000
            [junit] 0ms to size
            [junit] 5615ms to dumpwatches true
            [junit] 3035ms to dumpwatches false
            [junit] 5530ms to trigger
        
        notice:
        add - ~5 times faster
        size - approx the same, even though NEW is scanning all bitsets
        dump - faster for bypath, about the same for byid
        trigger - ~2 times faster
        

        here are the numbers with 1k watchers and 10k paths

        NEW
            [junit] 1219ms to add
            [junit] size:10000000
            [junit] 0ms to size
            [junit] 3527ms to dumpwatches true
            [junit] 3680ms to dumpwatches false
            [junit] 1426ms to trigger
        OLD
            [junit] 7020ms to add
            [junit] size:10000000
            [junit] 1ms to size
            [junit] 3585ms to dumpwatches true
            [junit] 3251ms to dumpwatches false
            [junit] 2843ms to trigger
        
        both OLD and NEW do better in this case than in the 10k/1k case. NEW is still significantly ahead of OLD.
        
        
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12503927/ZooKeeper-with-fix-for-findbugs-warning.patch
        against trunk revision 1225200.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/866//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/866//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/866//console

        This message is automatically generated.

        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12503927/ZooKeeper-with-fix-for-findbugs-warning.patch
        against trunk revision 1202557.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/792//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/792//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/792//console

        This message is automatically generated.

        Vikas Mehta made changes -
        Vikas Mehta added a comment -

        Fixed the findbugs warning.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12503831/Zookeeper-after-resolving-merge-conflicts.patch
        against trunk revision 1202360.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/791//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/791//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/791//console

        This message is automatically generated.

        Vikas Mehta made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Hadoop Flags Reviewed [ 10343 ]
        Release Note Changes to the watch manager to support very large (200 million) watches. This change also improves the synchronization in the WatchManager to reduce the contention on various watch manager operations (mainly addWatch() which is a fairly common operation after trigger watch).
        Vikas Mehta made changes -
        Vikas Mehta added a comment -

        patch after incorporating review comments and resolving merge conflicts.

        Patrick Hunt added a comment -

        Hi Vikas, yes, you can submit at any time. That just marks the issue as something that should be reviewed by a committer, typically for commit to svn. Alternatively, you can indicate in your submit message that the patch is a draft and you'd like initial feedback, etc. The main reason for marking as submitted is to get the QA bot to test the patch and to highlight to a committer that you'd like the patch to be committed. Alternatively, you could work with the community on the mailing list (referring to this jira/patch); that happens sometimes when people are in the very early stages (say the patch doesn't even compile yet).

        Vikas Mehta added a comment -

        Is it ok for me to submit this patch?

        Thanks,
        Vikas

        Vikas Mehta added a comment -

        Hi Patrick,

        Sorry for the late response. Let me know if you were looking for something different than what I answered:

        1) in your testing what is the impact of triggering a large number of watches on overall operation latency?

        [vikas] Without this change, with a large number of watches, ZooKeeper would run out of memory storing all the watches. With this change, it is now bound by network bandwidth. One difference from the old implementation is that triggerWatch() now loops through all the watchers to find the watches it needs to trigger, instead of using the reverse map the previous version kept to avoid this scan; the reverse map only gives a slight benefit when the number of watches per path is much smaller than the number of watchers.
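
        Roughly, the scan described here looks like the following; this is only a sketch, and the field and method names are not the ones in the patch:

        import java.util.ArrayList;
        import java.util.BitSet;
        import java.util.List;
        import java.util.Map;

        // Sketch of the triggerWatch() scan described above: with only per-watcher
        // bitmaps (no reverse map), every watcher must be checked on each trigger.
        class TriggerScanSketch {
            static List<Long> sessionsToNotify(Map<Long, BitSet> watchesBySession, int triggeredPathId) {
                List<Long> toNotify = new ArrayList<Long>();
                for (Map.Entry<Long, BitSet> e : watchesBySession.entrySet()) {
                    if (e.getValue().get(triggeredPathId)) {    // O(#watchers) per trigger
                        e.getValue().clear(triggeredPathId);    // watches are one-shot
                        toNotify.add(e.getKey());
                    }
                }
                return toNotify;
            }
        }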

        2) Say I delete a znode in your example, that will trigger 10k notifications to be sent (one to each session) - what is the impact on the latency of this request (the delete), both with and without this patch?

        [vikas] Without the patch, as mentioned above, we are not able to run ZooKeeper with so many watches. If we do not have too many watchers (or watches overall), the impact of this change would be a linear scan of the watchers to identify the watches that need to be triggered for the update/delete node operation.

        3) Subsequent to the investigations you've been doing, should we have concerns on overall service availability due to large numbers of watches being triggered concurrently?

        [vikas] We are thinking of implementing some throttling on the server (and later maybe on the client side as well) to prevent deterioration in ZooKeeper performance or availability.

        Thanks,
        Vikas

        Patrick Hunt added a comment -

        This is awesome work guys, thanks.

        Question: in your testing what is the impact of triggering a large number of watches on overall operation latency?

        Each client fetches and puts watches on all the znodes.

        Say I delete a znode in your example, that will trigger 10k notifications to be sent (one to each session) - what is the impact on the latency of this request (the delete), both with and without this patch?

        Subsequent to the investigations you've been doing, should we have concerns on overall service availability due to large numbers of watches being triggered concurrently?

        Vikas Mehta added a comment -

        Created the reviewboard review:

        https://reviews.apache.org/r/2399/

        Also, I removed the unit test HACK comment, because it doesn't affect the test or the WatchManager changes in this case. I was a bit unhappy because in ZooKeeper you would always enter this function holding a read lock, but the unit test doesn't (and it is not required to hold the read lock for this particular test).

        Camille Fournier added a comment -

        This one probably deserves a reviewboard review, if you would please create one. At a glance, my eyebrows raise at comments like: // HACK: read the pathID without holding the read lock

        Vikas Mehta made changes -
        Field Original Value New Value
        Attachment ZooKeeper.patch [ 12499063 ]
        Vikas Mehta added a comment -

        Please review the attached patch to optimize the memory needed by WatchManager. This patch also optimizes the locking in WatchManager.

        With this optimization, memory consumed by WatchManager is under 100 MB for 200 million watches.

        Vishal Kathuria added a comment -

        Thanks for reviewing and for your comments, Mahadev. I was leaning towards 1 as well.

        There is one issue I am worried about if we use czxid as the index into the bitmap - it will keep growing over time, as recently created nodes will have the latest zxid as the value. The bitmap will become sparse and we will need some sort of bitmap compression scheme.

        The bitmap compression would increase the cost of bit lookup.

        I had another thought about allocating an id for a znode without causing a change in the storage format.

        We could allocate the node id for a znode as we are deserializing the data tree during restore at startup time. All these ids would be sequential so the max value of the node id would be the same as the total number of nodes.

        For new nodes created after startup, we can keep allocating node ids using the same counter.

        The only other thing we will have to design for is reusing node ids as the znodes are deleted and created - this should be easy by keeping a bitmap of free node id values.
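
        A minimal sketch of the allocator described above, assuming sequential ids assigned during deserialization and a free-id bitmap for reuse; the names are illustrative:

        import java.util.BitSet;

        // Illustrative sketch: sequential node-id allocation with a free-id bitmap
        // so ids from deleted znodes can be reused and the id space stays dense.
        class NodeIdAllocator {
            private final BitSet freeIds = new BitSet();
            private int nextId = 0;

            synchronized int allocate() {
                int reused = freeIds.nextSetBit(0);
                if (reused >= 0) {            // reuse an id freed by a deleted znode
                    freeIds.clear(reused);
                    return reused;
                }
                return nextId++;              // otherwise hand out the next sequential id
            }

            synchronized void free(int id) {  // called when the znode is deleted
                freeIds.set(id);
            }
        }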

        Thanks

        Mahadev konar added a comment -

        Vishal,
        I like idea 1). Each znode has a txn id with which it is created, which is in any case unique for each znode (look at czxid in stat). We can use this to uniquely identify the znode. Doing 1) would be pretty easy.

        Vishal Kathuria created issue -

          People

          • Assignee: Thawan Kooburat
          • Reporter: Vishal Kathuria
          • Votes: 0
          • Watchers: 13
