Details
Type: New Feature
Status: Open
Priority: Major
Resolution: Unresolved
Description
Add a mechanism that lets a query use only a subset of replicas, selected by specifying the desired replica tag.
Replicas must be marked with tags before the query runs.
Setup needed from the replica side
Set the properties to be filtered on, with the desired values, on at least one replica.
Setup needed from the query side
A query must set ShardParams.FILTER_BY_REPLICA_PROPERTY to opt in to replica property filtering.
It must then set ShardParams.SHARDS_FILTER or ShardParams.SHARDS_FILTERNOT to ShardParams.REPLICA_PROP followed by the name of the property to check, then ":", then the desired value.
Example:
Given that some replicas have a property named region, adding the following params to the query:
filterByReplicaProp=true&shards.filter=replicaProp.region:EMEA
ensures that the query uses only replicas that have the property region set to EMEA, whereas
filterByReplicaProp=true&shards.filterNot=replicaProp.region:EMEA
ensures that the query does not use replicas that have the property region set to EMEA.
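As a rough illustration of the parameter layout described above, the following Java sketch assembles the two query strings. The class ReplicaFilterParams and its methods are hypothetical helpers, not part of the patch; the constants only mirror the literal parameter names shown in the example, and the values are not URL-encoded.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper (not part of the patch): builds the request
// parameters for replica property filtering as described in this issue.
class ReplicaFilterParams {
    // Literal names taken from the example in the description.
    static final String FILTER_BY_REPLICA_PROPERTY = "filterByReplicaProp";
    static final String SHARDS_FILTER = "shards.filter";
    static final String SHARDS_FILTERNOT = "shards.filterNot";
    static final String REPLICA_PROP = "replicaProp.";

    // include == true  -> shards.filter    (use matching replicas)
    // include == false -> shards.filterNot (avoid matching replicas)
    static Map<String, String> build(String property, String value, boolean include) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put(FILTER_BY_REPLICA_PROPERTY, "true");
        params.put(include ? SHARDS_FILTER : SHARDS_FILTERNOT,
                   REPLICA_PROP + property + ":" + value);
        return params;
    }

    // Joins the parameters with '&' in insertion order, without URL encoding.
    static String toQueryString(Map<String, String> params) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            joiner.add(e.getKey() + "=" + e.getValue());
        }
        return joiner.toString();
    }
}
```

Calling toQueryString(build("region", "EMEA", true)) reproduces the first query string of the example, and passing false reproduces the second.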
An example can be seen in the ReplicaTagTest included in this patch where a dynamic cloud has some tags assigned to it both randomly and on a fixed basis.
A replica can have multiple tags attached to it. By default the tags are separated by "|" (the pipe character); the delimiter can be changed by setting ShardParams.REPLICA_TAG_DELIMITER in the query.
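To make the multi-tag behaviour concrete, here is a minimal Java sketch of splitting a property value into tags and testing for one of them. ReplicaTags is a hypothetical class, and exact per-tag matching is an assumption; the patch may compare tags differently. Because "|" is a regular-expression metacharacter, the sketch quotes the delimiter before splitting.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper (not part of the patch): splits a replica property
// value into individual tags and checks whether a wanted tag is present.
class ReplicaTags {
    // Default delimiter named in the description.
    static final String DEFAULT_DELIMITER = "|";

    static List<String> split(String propertyValue, String delimiter) {
        if (propertyValue == null || propertyValue.isEmpty()) {
            return Collections.emptyList();
        }
        // Pattern.quote treats the delimiter literally; an unquoted "|"
        // would be interpreted as regex alternation.
        return Arrays.asList(propertyValue.split(Pattern.quote(delimiter)));
    }

    // Assumption: a tag matches only when it equals the wanted value exactly.
    static boolean hasTag(String propertyValue, String wanted, String delimiter) {
        return split(propertyValue, delimiter).contains(wanted);
    }
}
```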
ShardParams.FILTER_BY_REPLICA_PROPERTY is needed because the computation required to filter by property:value is fairly complex, and queries that do not use replica filtering should not incur that performance penalty.
The ShardParams.REPLICA_PROP prefix (currently set to replicaProp.) is needed to keep the system extensible in the future.
Usage warnings
Using ShardParams.SHARDS_FILTER or ShardParams.SHARDS_FILTERNOT set to ShardParams.REPLICA_PROP without ShardParams.FILTER_BY_REPLICA_PROPERTY will cause the QueryComponent to throw exceptions.
Using ShardParams.FILTER_BY_REPLICA_PROPERTY without filters will not cause any error, but will likely waste computation time.
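The two cases above could be detected on the client before a request is sent. The sketch below is one possible pre-flight check, assuming the literal parameter names from the example; ReplicaFilterPreflight and its Result enum are hypothetical and not part of the patch.

```java
import java.util.Map;

// Hypothetical client-side check (not part of the patch) for the two
// parameter combinations described in the usage warnings.
class ReplicaFilterPreflight {
    enum Result { OK, MISSING_FLAG, FLAG_WITHOUT_FILTER }

    static Result check(Map<String, String> params) {
        // parseBoolean(null) is false, so an absent flag counts as disabled.
        boolean flag = Boolean.parseBoolean(params.get("filterByReplicaProp"));
        boolean propFilter = usesReplicaProp(params.get("shards.filter"))
                || usesReplicaProp(params.get("shards.filterNot"));
        if (propFilter && !flag) {
            return Result.MISSING_FLAG;        // would make the QueryComponent throw
        }
        if (flag && !propFilter) {
            return Result.FLAG_WITHOUT_FILTER; // no error, but wasted computation
        }
        return Result.OK;
    }

    private static boolean usesReplicaProp(String filterValue) {
        return filterValue != null && filterValue.startsWith("replicaProp.");
    }
}
```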
No validity check is performed on the tags. As a result, the array of shard URLs may contain empty URLs, or may be null when the property does not exist. Users of this feature have to handle these cases.
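One defensive way for a caller to cope with a null array or empty entries is sketched below. ShardUrlCleaner is a hypothetical helper, not part of the patch, and it assumes the shard URLs arrive as a String array as the warning suggests.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not part of the patch): drops null or blank entries
// from a shard URL array and treats a null array as "no shards".
class ShardUrlCleaner {
    static List<String> usable(String[] shardUrls) {
        if (shardUrls == null) {
            return Collections.emptyList();
        }
        List<String> result = new ArrayList<>();
        for (String url : shardUrls) {
            if (url != null && !url.trim().isEmpty()) {
                result.add(url);
            }
        }
        return result;
    }
}
```

An empty result signals that no replica matched the filter, which the caller can then treat as an error or as a cue to retry without filtering.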
Attachments
Issue Links
- contains: SOLR-10881 Allow components to specify a different method to resolve shards (Open)
- relates to: SOLR-10610 Add CanaryComponent, a search component to analyse requests (Open)