[SOLR-11423] Overseer queue needs a hard cap (maximum size) that clients respect - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 7.2, 8.0
Component/s: SolrCloud
Labels: None

Description

When Solr gets into pathological GC thrashing states, it can fill the overseer queue with thousands upon thousands of queued state changes, many of which are duplicate up/down state updates. Our production cluster has reached 100k queued items many times, and at that point there is nothing useful you can do except manually purge the queue in ZK. Recently it hit 3 million queued items, at which point our entire ZK cluster exploded.

I propose a hard cap. Any client trying to enqueue an item when the queue is full would throw an exception. I was thinking maybe 10,000 items would be a reasonable limit. Thoughts?
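To make the proposal concrete, here is a minimal in-memory sketch of a capped queue that rejects new items once it is full. It is an illustration only, not Solr's implementation: `BoundedQueueSketch`, `QueueFullException`, `offer`, and `poll` are hypothetical names, and a ZooKeeper-backed queue would presumably have to compare against the child count of the queue node, a check that is not atomic with the enqueue. The sketch ignores that.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical in-memory stand-in for a capped work queue; not Solr's actual class. */
final class BoundedQueueSketch {

    /** Thrown to the enqueuing client when the cap is reached (hypothetical name). */
    public static class QueueFullException extends RuntimeException {
        public QueueFullException(String message) {
            super(message);
        }
    }

    private final Deque<byte[]> items = new ArrayDeque<>();
    private final int maxSize;

    /** @param maxSize hard cap on queued items, e.g. 10,000 as suggested above */
    public BoundedQueueSketch(int maxSize) {
        if (maxSize <= 0) {
            throw new IllegalArgumentException("maxSize must be positive");
        }
        this.maxSize = maxSize;
    }

    /** Enqueues an item, or throws if the queue already holds maxSize items. */
    public synchronized void offer(byte[] data) {
        if (items.size() >= maxSize) {
            throw new QueueFullException("queue is full (" + maxSize + " items)");
        }
        items.addLast(data);
    }

    /** Removes and returns the oldest item, or null if the queue is empty. */
    public synchronized byte[] poll() {
        return items.pollFirst();
    }

    public synchronized int size() {
        return items.size();
    }
}
```

With a cap in place, a client that is thrashing fails fast on `offer` instead of piling more state changes onto the queue. Whether the caller should retry, back off, or drop the update is a separate design decision that this sketch leaves open.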

Issue Links

links to

GitHub Pull Request #257

People

Assignee: Scott Blum

Reporter: Scott Blum

Votes: 0

Watchers: 12

Dates

Created: 29/Sep/17 19:35

Updated: 02/Oct/19 17:24

Resolved: 09/Dec/17 19:47