SOLR-15052

Reducing overseer bottlenecks using per-replica states

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 8.8
    • Component/s: None
    • Labels: None

    Description

      This work has the same goal as SOLR-13951: to reduce overseer bottlenecks by keeping replica state updates from going to state.json via the overseer. However, the approach taken here is different from SOLR-13951, and hence this work supersedes that work.

      The design proposed is here: https://docs.google.com/document/d/1xdxpzUNmTZbk0vTMZqfen9R3ArdHokLITdiISBxCFUg/edit

      Briefly,

      1. Every replica's state will be in a separate znode nested under state.json. The znode's name encodes the replica name, state, and leadership status.
      2. An additional children watcher is set on state.json to detect state changes.
      3. Upon a state change, a ZK multi-op deletes the previous znode and adds a new znode with the new state.
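
The three points above can be sketched in plain Java without a ZooKeeper dependency. This is only an illustration under stated assumptions: the name format `replica:state[:L]`, the class `PerReplicaStateSketch`, and its methods are hypothetical and are not taken from the PR. The in-memory set stands in for the children of state.json, and the copy-then-swap update stands in for the atomic delete-plus-create multi-op.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch. The "replica:state[:L]" format is an assumption made
// for illustration and is not necessarily the format the actual PR uses.
// Replica names are assumed not to contain ':'.
final class PerReplicaStateSketch {

    /** Encodes replica name, state, and leadership into one znode name. */
    static String encode(String replica, String state, boolean leader) {
        return replica + ":" + state + (leader ? ":L" : "");
    }

    /** Returns the replica name stored in an encoded znode name. */
    static String replicaOf(String znodeName) {
        return znodeName.split(":")[0];
    }

    /** Returns the state stored in an encoded znode name. */
    static String stateOf(String znodeName) {
        return znodeName.split(":")[1];
    }

    /** Returns true if the encoded znode name carries the leader marker. */
    static boolean isLeader(String znodeName) {
        String[] parts = znodeName.split(":");
        return parts.length == 3 && parts[2].equals("L");
    }

    /**
     * Models the multi-op: remove the replica's old child and add the new
     * one as a single step. The input set is never modified, so a failure
     * leaves the previous children intact, mimicking all-or-nothing behavior.
     */
    static Set<String> applyStateChange(Set<String> children, String replica,
                                        String newState, boolean leader) {
        String old = null;
        for (String child : children) {
            if (replicaOf(child).equals(replica)) {
                old = child;
            }
        }
        if (old == null) {
            throw new IllegalStateException("no state znode for " + replica);
        }
        Set<String> updated = new TreeSet<>(children);
        updated.remove(old);                           // "delete" half of the multi-op
        updated.add(encode(replica, newState, leader)); // "create" half of the multi-op
        return updated;
    }

    public static void main(String[] args) {
        Set<String> children = new TreeSet<>();
        children.add(encode("core_node1", "active", true));
        children.add(encode("core_node2", "down", false));
        // A children watcher on state.json would observe this change
        // as a new list of child names.
        System.out.println(applyStateChange(children, "core_node2", "active", false));
    }
}
```

Because the whole state lives in the child names, a reader in this sketch needs only the list of children to reconstruct every replica's state, which is what makes a single children watcher sufficient.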

      Differences between this approach and SOLR-13951:

      1. In SOLR-13951, we planned to leverage shard terms for per shard states.
      2. As a consequence, the code changes required for SOLR-13951 were massive (we needed a shard state provider abstraction and had to introduce it everywhere in the codebase).
      3. This approach is a drastically simpler change and design.

      Credit for this design and the PR is due to noble.paul. markrmiller@gmail.com, noble.paul and I have collaborated on this effort. The reference branch takes a conceptually similar (but not identical) approach.

      I shall attach a PR and performance benchmarks shortly.

      Attachments

        1. per-replica-states-gcp.pdf
          182 kB
          Ishan Chattopadhyaya
        2. collection-creation.png
          74 kB
          Mike Drob

            People

              noble.paul Noble Paul
              ichattopadhyaya Ishan Chattopadhyaya
              Votes: 2
              Watchers: 9

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 10.5h