  1. Solr
  2. SOLR-15803

Allow AssignStrategy to process multiple AssignRequests with cross-coordination

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 9.0
    • Component/s: SolrCloud
    • Labels: None

    Description

      While testing SOLR-15795, I found that if the cluster has an empty node when the REPLACENODE command runs, all replicas are often placed on that one node, even when that does not result in an even distribution across the cluster.

      Looking at the code, this made sense. ReplaceNodeCmd loops over every replica on the sourceNode and uses the AssignStrategy class to assign a node for each replica, based on the clusterstate. However, the clusterstate does not change between these assignments, so the most advantageous node for one replica is likely to be the most advantageous for the other replicas as well. As a result, all replicas were scheduled onto the same node in my testing.
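
The behavior described above can be reproduced with a small standalone sketch. It does not use Solr's actual classes: `IndependentAssignSketch`, `leastLoadedNode`, and `assignIndependently` are hypothetical names, and "fewest replicas wins" is an assumed stand-in for whatever scoring the real AssignStrategy applies. The point is only that repeating the same decision against an unchanged snapshot yields the same answer every time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: none of these names come from Solr.
class IndependentAssignSketch {

  // Assumed placement rule: pick the node holding the fewest replicas,
  // breaking ties by node name so the result is deterministic.
  static String leastLoadedNode(Map<String, Integer> replicaCounts) {
    String best = null;
    for (Map.Entry<String, Integer> e : new TreeMap<>(replicaCounts).entrySet()) {
      if (best == null || e.getValue() < replicaCounts.get(best)) {
        best = e.getKey();
      }
    }
    return best;
  }

  // One decision per replica, each made against the same unchanged snapshot.
  static List<String> assignIndependently(Map<String, Integer> snapshot, int replicasToMove) {
    List<String> targets = new ArrayList<>();
    for (int i = 0; i < replicasToMove; i++) {
      targets.add(leastLoadedNode(snapshot)); // the snapshot is never updated
    }
    return targets;
  }

  public static void main(String[] args) {
    // Replica counts on the candidate nodes; the source node is already excluded.
    Map<String, Integer> snapshot = Map.of("node1", 2, "node2", 2, "node3", 0);
    System.out.println(assignIndependently(snapshot, 4));
    // prints [node3, node3, node3, node3]
  }
}
```

In this toy cluster the empty node receives all four replicas and ends up with 4 while the other nodes stay at 2.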

      An easy (in theory) solution is to let AssignStrategy take a list of AssignRequests in assign(), with each request in the list accounting for the replica placements decided for the previous requests. That way, ReplaceNodeCmd can build its list of AssignRequests and issue them all at once to AssignStrategy, which will come up with the optimal plan for all replicas together.
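
A minimal sketch of the idea, not the actual Solr implementation: `BatchAssignSketch`, `leastLoadedNode`, and `assignAll` are hypothetical names, and "fewest replicas wins" is an assumed placement rule. Each request in the batch is decided against a working copy of the replica counts that already includes the placements made for the earlier requests.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: none of these names come from Solr.
class BatchAssignSketch {

  // Assumed placement rule: pick the node holding the fewest replicas,
  // breaking ties by node name so the result is deterministic.
  static String leastLoadedNode(Map<String, Integer> replicaCounts) {
    String best = null;
    for (Map.Entry<String, Integer> e : new TreeMap<>(replicaCounts).entrySet()) {
      if (best == null || e.getValue() < replicaCounts.get(best)) {
        best = e.getKey();
      }
    }
    return best;
  }

  // Decide all requests in one call, updating a working copy after each placement.
  static List<String> assignAll(Map<String, Integer> snapshot, int replicasToMove) {
    Map<String, Integer> working = new HashMap<>(snapshot); // caller's snapshot stays untouched
    List<String> targets = new ArrayList<>();
    for (int i = 0; i < replicasToMove; i++) {
      String node = leastLoadedNode(working);
      working.merge(node, 1, Integer::sum); // account for this placement in later decisions
      targets.add(node);
    }
    return targets;
  }

  public static void main(String[] args) {
    // Replica counts on the candidate nodes; the source node is already excluded.
    Map<String, Integer> snapshot = Map.of("node1", 2, "node2", 2, "node3", 0);
    System.out.println(assignAll(snapshot, 4));
    // prints [node3, node3, node1, node2]
  }
}
```

With the same starting counts (2, 2, 0) and four replicas to move, the batch ends at 3, 3, 2 instead of piling every replica onto the empty node.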

      Because this is an API in AssignStrategy, it will work both with the new autoscaling APIs and with the legacy assign strategy.

      Attachments

        1. Screen Shot 2021-12-16 at 1.23.40 PM.png
          81 kB
          Houston Putman
        2. test-collection-creation-time.sh
          0.4 kB
          Houston Putman

            People

              Assignee: Houston Putman
              Reporter: Houston Putman
              Votes: 0
              Watchers: 5

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h