  1. Solr
  2. SOLR-16108

Incorrect distribution of records in shards after a split with splitByPrefix, when using the compositeId router with a router field defined

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 8.4
    • Fix Version/s: None
    • Component/s: SolrCloud
    • Labels: None

    Description

      When a collection is created using the compositeId router with a router field defined (router.field), one of its shards contains records that all share the same routing key, and that shard is split with the splitByPrefix parameter, we expect the records to be distributed roughly evenly between the two resulting sub-shards.

      Instead, one sub-shard contains no records and the other contains all of them.

      Steps to reproduce:

      docker network create solr-network
      # run in one terminal
      docker run -it -h solr1 --name solr1 --net solr-network -p 18983:8983 solr:8.4 /opt/solr/bin/solr -c -f
      # run in another terminal
      docker run -it -h solr2 --name solr2 --net solr-network -p 28983:8983 solr:8.4 /opt/solr/bin/solr -c -f -z solr1:9983
      #-----------------------------------------------------------------------------------------------
      # Works, documents are split between the 2 shards
      # Create collection with default compositeId router, routing key in the id, only one shard
      curl --request GET \
        --url 'http://localhost:18983/solr/admin/collections?action=CREATE&name=routing_by_id&numShards=1'
      # Create enough documents, they all have the same routing key (france!)
      for i in {0..100}
      do
        # The JSON is single-quoted so that "!" stays literal; only ${i} is expanded
        curl --request POST \
          --url 'http://localhost:18983/solr/routing_by_id/update/json/docs?commit=true' \
          --header 'Content-Type: application/json' \
          --data '[{
            "id": "france!'"${i}"'0",
            "title_t": "hi"
          }]'
      done
      # Check it is indexed correctly
      curl --request GET \
        --url 'http://localhost:18983/solr/routing_by_id/select?q=*%3A*'
      # Split the shard
      curl --request GET \
        --url 'http://localhost:18983/solr/admin/collections?action=SPLITSHARD&collection=routing_by_id&shard=shard1&splitByPrefix=true'
      # Check records in shard1_0 (~half of the documents there)
      curl --request GET \
        --url 'http://localhost:18983/solr/routing_by_id/select?q=*%3A*&shards=shard1_0'
      # Check records in shard1_1 (~half of the documents there)
      curl --request GET \
        --url 'http://localhost:18983/solr/routing_by_id/select?q=*%3A*&shards=shard1_1'
      
      #-----------------------------------------------------------------------------------------------
      # Fails, does not split documents in both shards
      # Create collection with default compositeId router, routing key in the field "route_t", only one shard
      curl --request GET \
        --url 'http://localhost:18983/solr/admin/collections?action=CREATE&name=routing_by_field&numShards=1&router.field=route_t'
      # Create enough documents, they all have the same routing key (france!)
      for i in {0..100}
      do
        # The URL is quoted so the shell does not treat "?" as a glob character
        curl --request POST \
          --url 'http://localhost:18983/solr/routing_by_field/update/json/docs?commit=true' \
          --header 'Content-Type: application/json' \
          --data '[{
            "id": "'"${i}"'0",
            "title_t": "hi",
            "route_t": "france"
          }]'
      done
      # Check it is indexed correctly
      curl --request GET \
        --url 'http://localhost:18983/solr/routing_by_field/select?q=*%3A*'
      # Split the shard
      curl --request GET \
        --url 'http://localhost:18983/solr/admin/collections?action=SPLITSHARD&collection=routing_by_field&shard=shard1&splitByPrefix=true'
      # Check records in shard1_0: no document!
      curl --request GET \
        --url 'http://localhost:18983/solr/routing_by_field/select?q=*%3A*&shards=shard1_0'
      # Check records in shard1_1: all documents!
      curl --request GET \
        --url 'http://localhost:18983/solr/routing_by_field/select?q=*%3A*&shards=shard1_1'
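
The per-shard checks above return full result pages, which makes the two cases awkward to compare by eye. A minimal sketch of helpers that print only the document count of each sub-shard follows. The function names (shard_query_url, num_found, report_split) and the SOLR_URL variable are hypothetical and not part of Solr; the sketch assumes the node started above is reachable on localhost:18983 and that the select handler returns JSON containing a "numFound" field.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Solr) for comparing sub-shard sizes after SPLITSHARD.
# Assumption: the first node from the steps above is reachable at localhost:18983.
SOLR_URL="${SOLR_URL:-http://localhost:18983/solr}"

# Build a select URL that asks one shard for its hit count only (rows=0).
shard_query_url() {
  local collection="$1" shard="$2"
  printf '%s/%s/select?q=*%%3A*&rows=0&shards=%s' "$SOLR_URL" "$collection" "$shard"
}

# Extract the first numFound value from a Solr JSON response read on stdin.
num_found() {
  grep -o '"numFound":[[:space:]]*[0-9][0-9]*' | head -n 1 | grep -o '[0-9][0-9]*$'
}

# Print the document count of both sub-shards of shard1 for one collection.
report_split() {
  local collection="$1" shard count
  for shard in shard1_0 shard1_1; do
    count="$(curl --silent "$(shard_query_url "$collection" "$shard")" | num_found)"
    echo "${collection} ${shard}: ${count:-unknown}"
  done
}
```

After both splits have completed, running `report_split routing_by_id` and `report_split routing_by_field` should make the difference described in this issue visible as two pairs of counts.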
         

          People

            Assignee: Unassigned
            Reporter: Marc Brette (mbrette)
            Votes: 0
            Watchers: 2
