[ZOOKEEPER-3399] Remove logging in getGlobalOutstandingLimit for optimal performance. - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 3.6.0
Fix Version/s: 3.6.0
Component/s: server
Labels:
- pull-request-available

Description

Recently we have moved some of our production clusters to the top of the trunk. One issue we found is a performance regression on read and write latency on the clusters where the quorum is also serving traffic. The average read latency increased by 50x, p99 read latency increased by 300x.

The root cause is a log statement introduced in ~~ZOOKEEPER-3177~~ (PR711), where we added a LOG.info statement in getGlobalOutstandingLimit. getGlobalOutstandingLimit is on the critical code path for request processing and for each request, it will be called twice (one at processing the packet, one at finalizing the request response). This not only degrades performance of the server, but also bloated the log file, when the QPS of a server is high.

This only impacts clusters when the quorum (leader + follower) is serving traffic. For clusters where only observers are serving traffic no impact is observed.

Attachments

Issue Links

links to

GitHub Pull Request #956

Activity

People

Assignee:: Michael Han

Reporter:: Michael Han

Votes:: 0 Vote for this issue

Watchers:: 3 Start watching this issue

Dates

Created:: 22/May/19 01:42

Updated:: 25/May/19 20:27

Resolved:: 25/May/19 17:46

Time Tracking

Estimated:

Not Specified

Remaining:

Logged: