SOLR-9241

Rebalance API for SolrCloud

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 6.1
    • Fix Version/s: 6.1
    • Component/s: SolrCloud
    • Environment: Ubuntu, Mac OS X

    Description

      This is v1 of the patch for the SolrCloud Rebalance API (as described in http://engineering.bloomreach.com/solrcloud-rebalance-api/), built at Bloomreach by Nitin Sharma and Suruchi Shah. The goal of the API is to provide a zero-downtime mechanism for data manipulation and efficient core allocation in SolrCloud. The API was envisioned as the base layer that enables SolrCloud to be an auto-scaling platform, working in unison with complementary monitoring and scaling features.

      Patch Status:
      ===============
      The patch is a work in progress and incremental. We have done a few rounds of code cleanup. We wanted to get the patch out first to gather initial feedback. We will continue to work on making it more open-source friendly and easily testable.

      Deployment Status:
      ====================
      The platform is deployed in production at Bloomreach and has been battle-tested under large-scale load (millions of documents and hundreds of collections).

      Internals:
      =============
      The internals and performance of the API are described at: http://engineering.bloomreach.com/solrcloud-rebalance-api/

      It is built on top of the admin Collections API as an action (with various flavors). At a high level, the Rebalance API provides two constructs:

      Scaling Strategy: Decides how to move the data. Every flavor has multiple options, which can be reviewed in the API spec.
        Re-distribute - Moves data around the cluster based on capacity/allocation.
        Auto Shard - Dynamically shards a collection to any size.
        Smart Merge (Distributed Mode) - Helps merge data from a larger shard setup into a smaller one (the source shard count should be divisible by the destination shard count).
        Scale Up - Adds replicas on the fly.
        Scale Down - Removes replicas on the fly.
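
The divisibility constraint on Smart Merge can be illustrated with a small sketch. This is not code from the patch: the function name `plan_smart_merge`, the `shardN` naming, and the choice to group contiguous source shards are all assumptions made for illustration.

```python
def plan_smart_merge(source_shards, dest_count):
    """Group source shards into dest_count equally sized merge groups.

    Hypothetical helper, not part of the SOLR-9241 patch. It only shows why
    the source shard count must be divisible by the destination shard count.
    """
    if dest_count <= 0:
        raise ValueError("dest_count must be positive")
    if len(source_shards) % dest_count != 0:
        raise ValueError(
            "source shard count must be divisible by destination shard count"
        )
    group_size = len(source_shards) // dest_count
    # Assumption: contiguous source shards are merged into the same target.
    return {
        f"shard{i + 1}": source_shards[i * group_size:(i + 1) * group_size]
        for i in range(dest_count)
    }
```

With four source shards and two destination shards, each destination receives exactly two sources; with three source shards and two destinations no even grouping exists, so the sketch rejects the request.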

      Allocation Strategy: Decides where to put the data (nodes with the fewest cores, nodes that do not have this collection, etc.). Custom implementations can be built on top as well. One such example is availability-zone awareness: distribute data such that every replica is placed in a different availability zone to support HA.
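
As a rough sketch of what two such allocation strategies might compute, assuming the cluster state is available as plain dictionaries: the function names, the input shapes, and the tie-breaking by node name are hypothetical and are not taken from the patch.

```python
def least_cores(core_counts, replicas):
    """Pick the `replicas` nodes hosting the fewest cores.

    core_counts maps node name -> number of cores on that node.
    Ties are broken by node name so the result is deterministic.
    """
    ranked = sorted(core_counts, key=lambda node: (core_counts[node], node))
    return ranked[:replicas]


def az_aware(node_zones, core_counts, replicas):
    """Pick at most one node per availability zone, least loaded first.

    node_zones maps node name -> availability zone (hypothetical input).
    """
    best_per_zone = {}
    for node, zone in node_zones.items():
        current = best_per_zone.get(zone)
        if current is None or (core_counts[node], node) < (core_counts[current], current):
            best_per_zone[zone] = node
    if replicas > len(best_per_zone):
        raise ValueError("not enough availability zones for the requested replicas")
    chosen = sorted(best_per_zone.values(), key=lambda node: (core_counts[node], node))
    return chosen[:replicas]
```

The zone-aware variant refuses to place more replicas than there are zones, since doing so would put two replicas in the same zone and defeat the HA goal described above.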

      Detailed API Spec:
      ====================
      https://github.com/bloomreach/solrcloud-rebalance-api

      Contributors:
      =====================
      Nitin Sharma
      Suruchi Shah

      Questions/Comments:
      =====================
      You can reach me at nitinssn@gmail.com

      Attachments

        1. Replace_After.jpeg
          102 kB
          Nitin Sharma
        2. Replace_Before.jpeg
          112 kB
          Nitin Sharma
        3. Replace_Call.jpeg
          101 kB
          Nitin Sharma
        4. Redistribute_call.jpeg
          70 kB
          Nitin Sharma
        5. Redistribute_After.jpeg
          70 kB
          Nitin Sharma
        6. Redistribute_Before.jpeg
          67 kB
          Nitin Sharma
        7. SOLR-9241-6.1.patch
          70 kB
          Nitin Sharma
        8. SOLR-9241-4.6.patch
          184 kB
          Nitin Sharma

            People

              Assignee: Unassigned
              Reporter: Nitin Sharma (nitin.sharma)
              Votes: 16
              Watchers: 28

              Dates

                Created:
                Updated:

                Time Tracking

                  Estimated: 2,016h
                  Remaining: 2,016h
                  Logged: 20m