  1. Solr
  2. SOLR-10880

Support replica filtering by tag: shards.filter=replicaProp.region:EMEA


Details

    • New Feature
    • Status: Open
    • Major
    • Resolution: Unresolved

    Description

      Add a mechanism that lets a query use only a subset of replicas, selected by specifying the wanted replica tag.

      Replicas have to be marked with tags before running the query.

      Setup needed from the replica side
      Set each property to be filtered on (for example, region) to the required value on at least one replica.
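
The issue does not say how the properties are set. One plausible route, assumed here and not confirmed by the patch, is Solr's Collections API ADDREPLICAPROP action. The sketch below only builds the request URL; the host, collection, shard, and replica names are hypothetical placeholders.

```python
from urllib.parse import urlencode

# Hypothetical base URL; adjust to the actual Solr node.
SOLR_BASE = "http://localhost:8983/solr"


def add_replica_prop_url(collection, shard, replica, prop, value):
    """Build a Collections API ADDREPLICAPROP request URL.

    Assumption: the filter in this issue reads properties set this way.
    """
    params = {
        "action": "ADDREPLICAPROP",
        "collection": collection,
        "shard": shard,
        "replica": replica,
        "property": prop,
        "property.value": value,
    }
    return SOLR_BASE + "/admin/collections?" + urlencode(params)


# Tag one replica (placeholder names) with region=EMEA.
url = add_replica_prop_url("collection1", "shard1", "core_node1", "region", "EMEA")
print(url)
```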


      Setup needed from the query side

      A query must set ShardParams.FILTER_BY_REPLICA_PROPERTY to opt in to replica property filtering.
      It must then set ShardParams.SHARDS_FILTER or ShardParams.SHARDS_FILTERNOT to a value made of ShardParams.REPLICA_PROP, followed by the name of the property to check, then ":" and the wanted value.
      Example:
      Given that some replicas have a property named region:

      Adding the following params to the query:
      filterByReplicaProp=true&shards.filter=replicaProp.region:EMEA
      will ensure that the query uses only replicas that have the property region set to EMEA.

      filterByReplicaProp=true&shards.filterNot=replicaProp.region:EMEA
      will ensure that the query does not use replicas that have the property region set to EMEA.
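
A minimal sketch of assembling these parameters on the client side. The parameter names (filterByReplicaProp, shards.filter, shards.filterNot, replicaProp.) come from the examples above; the helper names and the q=*:* query are illustrative assumptions.

```python
from urllib.parse import urlencode


def replica_filter_params(prop, value, exclude=False):
    """Return the two request parameters needed for replica property filtering."""
    key = "shards.filterNot" if exclude else "shards.filter"
    return {
        "filterByReplicaProp": "true",  # opt-in flag, required alongside the filter
        key: "replicaProp." + prop + ":" + value,
    }


def query_string(params):
    """Encode parameters, leaving ':' and '*' readable as in the examples above."""
    return urlencode(params, safe=":*")


# Use only replicas tagged region=EMEA.
include_qs = query_string({"q": "*:*", **replica_filter_params("region", "EMEA")})
# Avoid replicas tagged region=EMEA.
exclude_qs = query_string({"q": "*:*", **replica_filter_params("region", "EMEA", exclude=True)})

print(include_qs)
print(exclude_qs)
```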


      An example can be seen in the ReplicaTagTest included in this patch where a dynamic cloud has some tags assigned to it both randomly and on a fixed basis.

      A replica can have multiple tags attached to it. By default the tags are separated by "|" (the pipe character); the delimiter can be changed by setting ShardParams.REPLICA_TAG_DELIMITER in the query.
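
The matching rule can be illustrated with a small model. This is not the patch's code. It assumes that multiple tags live in a single property value, that a replica matches when the wanted value equals one of its delimited tags, and that a replica without the property never matches (so filterNot keeps it). Replica names are hypothetical.

```python
def replica_tags(props, prop, delimiter="|"):
    """Split one property value into its set of tags; a missing property yields no tags."""
    raw = props.get(prop)
    if raw is None:
        return set()
    return {tag for tag in raw.split(delimiter) if tag}


def select_replicas(replicas, prop, wanted, exclude=False, delimiter="|"):
    """Keep replicas that carry the wanted tag, or those that do not when exclude=True."""
    selected = []
    for name, props in replicas.items():
        has_tag = wanted in replica_tags(props, prop, delimiter)
        if has_tag != exclude:
            selected.append(name)
    return selected


# Hypothetical cluster state: replica name -> replica properties.
replicas = {
    "replica1": {"region": "EMEA"},
    "replica2": {"region": "EMEA|APAC"},
    "replica3": {"region": "US"},
    "replica4": {},
}

print(select_replicas(replicas, "region", "EMEA"))                # filter
print(select_replicas(replicas, "region", "EMEA", exclude=True))  # filterNot
```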

      ShardParams.FILTER_BY_REPLICA_PROPERTY is needed because the computation required to filter by property:value is quite complex, and queries that do not care about replica filtering should not incur that performance penalty.

      The ShardParams.REPLICA_PROP prefix (currently set to replicaProp.) is needed to keep the system extensible in the future.

      Usage warnings

      Using ShardParams.SHARDS_FILTER or ShardParams.SHARDS_FILTERNOT set to ShardParams.REPLICA_PROP without ShardParams.FILTER_BY_REPLICA_PROPERTY will cause the QueryComponent to throw exceptions.

      Using ShardParams.FILTER_BY_REPLICA_PROPERTY without filters will not cause any error, but will likely waste computation time.

      No validity check is performed on the tags. The resulting array of shard URLs may therefore contain empty URLs, or may be null (when the property does not exist); users of this feature have to handle both cases.
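
A caller could guard against these cases with checks along the following lines. This is a sketch under stated assumptions: both helper names are hypothetical, the parameter names are taken from the examples above, and the check mirrors only the first and third warnings.

```python
def check_filter_params(params):
    """Raise early if a replicaProp. filter is used without the opt-in flag."""
    uses_prop_filter = any(
        params.get(key, "").startswith("replicaProp.")
        for key in ("shards.filter", "shards.filterNot")
    )
    enabled = params.get("filterByReplicaProp") == "true"
    if uses_prop_filter and not enabled:
        raise ValueError("replicaProp. filter requires filterByReplicaProp=true")


def usable_shard_urls(urls):
    """Treat a null result as empty and drop empty URL entries."""
    if urls is None:
        return []
    return [url for url in urls if url]


# Hypothetical result containing an empty entry.
print(usable_shard_urls(["http://host1:8983/solr/c1", "", "http://host2:8983/solr/c1"]))
```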

      Attachments

        1. SOLR-10880.patch
          39 kB
          Domenico Fabio Marino
        2. SOLR-10880.patch
          36 kB
          Christine Poerschke
        3. SOLR-10880.patch
          29 kB
          Domenico Fabio Marino
        4. SOLR-10880.patch
          25 kB
          Domenico Fabio Marino
        5. SOLR-10880.patch
          25 kB
          Domenico Fabio Marino
        6. SOLR-10880.patch
          21 kB
          Domenico Fabio Marino
        7. SOLR-10880.patch
          22 kB
          Domenico Fabio Marino

            People

              Assignee: Unassigned
              Reporter: Domenico Fabio Marino (dmarino)
              Votes: 3
              Watchers: 10
