Description
Currently, Solr chooses a random replica for each shard to fan out the query request. However, this presents a problem when running Solr in multiple availability zones.
If one availability zone fails, every Solr node is affected, because each node keeps trying to connect to Solr nodes in the failed availability zone until the request times out. This can lead to a buildup of threads on each Solr node until the node runs out of memory, resulting in a cascading failure.
This issue tries to solve this problem by adding:
- another shard preference parameter named node.sysprop, so that the query is routed to nodes with the same defined system properties as the current one.
- default shardPreferences for the whole cluster, stored in /clusterprops.json.
- a cache that fetches the other nodes' system properties whenever /live_nodes changes.
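The routing idea behind node.sysprop can be illustrated with a small standalone sketch. This is not Solr's actual implementation: the class `SyspropPreferenceSketch`, the `Replica` type, and the method `preferSameSysprop` are hypothetical names made up for this example, and the per-node property map stands in for whatever the proposed cache would hold. It assumes each node is started with a system property such as `-Dzone=a`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class SyspropPreferenceSketch {

    /** Hypothetical replica descriptor: hosting node plus that node's cached system properties. */
    static final class Replica {
        final String nodeName;
        final Map<String, String> nodeSysprops;

        Replica(String nodeName, Map<String, String> nodeSysprops) {
            this.nodeName = nodeName;
            this.nodeSysprops = nodeSysprops;
        }
    }

    /**
     * Returns a copy of the replica list in which replicas whose node has the same
     * value for propName as the local node come first. Non-matching replicas are
     * kept at the end as a fallback, and the relative order inside each group is
     * preserved because List.sort is stable.
     */
    static List<Replica> preferSameSysprop(List<Replica> replicas, String propName, String localValue) {
        List<Replica> ordered = new ArrayList<>(replicas);
        if (localValue == null) {
            return ordered; // local node does not define the property: no preference
        }
        // Boolean.FALSE sorts before Boolean.TRUE, so "does not match" == false goes first.
        ordered.sort(Comparator.comparing(
                (Replica r) -> !localValue.equals(r.nodeSysprops.get(propName))));
        return ordered;
    }

    public static void main(String[] args) {
        List<Replica> replicas = List.of(
                new Replica("node1", Map.of("zone", "a")),
                new Replica("node2", Map.of("zone", "b")),
                new Replica("node3", Map.of("zone", "a")));
        // Assume the local node was started with -Dzone=b.
        for (Replica r : preferSameSysprop(replicas, "zone", "b")) {
            System.out.println(r.nodeName);
        }
    }
}
```

Keeping the non-matching replicas at the end of the list, rather than dropping them, means a query can still be served when the local availability zone holds no replica of a shard.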
Attachments
Issue Links
- is required by SOLR-14511 Add documentation for node.sysprop shard preference (Closed)