Solr / SOLR-12509

Improve SplitShardCmd performance and reliability


    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 7.5
    • Component/s: SolrCloud
    • Labels:
      None

      Description

      SplitShardCmd is currently quite complex.

      Shard splitting runs on active shards that are still receiving updates, so it involves several carefully orchestrated steps: making sure the new sub-shard placeholders are properly created and visible, applying buffered updates to the sub-shard leaders, and performing recovery on the sub-shard replicas.
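
      The buffering step above can be illustrated with a toy model. This is only a sketch under stated assumptions: `BufferingSubShard` and its methods are hypothetical names, documents are plain strings, and nothing here reflects Solr's actual update log or SplitShardCmd code. It shows the general idea that updates arriving during the split are queued, then replayed in arrival order once the split index is in place.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical model of a sub-shard leader; not Solr's actual update log.
final class BufferingSubShard {
    private final Queue<String> buffered = new ArrayDeque<>();
    private final List<String> index = new ArrayList<>();
    private boolean buffering = true;

    /** While the split runs, updates are queued instead of applied. */
    synchronized void update(String doc) {
        if (buffering) {
            buffered.add(doc);
        } else {
            index.add(doc);
        }
    }

    /** Seeds the index with the split result, replays the queue in arrival order, then goes live. */
    synchronized void installAndReplay(List<String> splitDocs) {
        index.addAll(splitDocs);
        while (!buffered.isEmpty()) {
            index.add(buffered.poll());
        }
        buffering = false;
    }

    /** Returns a snapshot copy of the documents currently in the index. */
    synchronized List<String> contents() {
        return new ArrayList<>(index);
    }
}
```

      Even in this toy form there are three states to get right (buffering, replaying, live), which hints at why the real process is considered complex.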

      This process could be simplified when a collection is not being actively updated or can tolerate a little downtime: we could take the shard "offline", i.e., disable writes while the split is in progress. To avoid confusing users, writes should be disabled for the whole collection, not just for the shard being split.
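
      A minimal sketch of the "offline" idea, assuming a single in-process flag: `CollectionWriteGate` is a hypothetical helper, not a Solr class, and a real implementation would also have to propagate the state across nodes and drain in-flight updates. The sketch only shows that writes are rejected for the duration of the split and re-enabled even if the split fails.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

// Hypothetical helper, not part of Solr: rejects updates while a split is running.
final class CollectionWriteGate {
    private final AtomicBoolean readOnly = new AtomicBoolean(false);

    /** Called on the update path; fails fast while writes are disabled. */
    void checkWritable() {
        if (readOnly.get()) {
            throw new IllegalStateException(
                "collection is read-only while a shard split is in progress");
        }
    }

    boolean isReadOnly() {
        return readOnly.get();
    }

    /** Runs the split with writes disabled and always re-enables them afterwards. */
    <T> T runWithWritesDisabled(Supplier<T> split) {
        if (!readOnly.compareAndSet(false, true)) {
            throw new IllegalStateException("another split is already in progress");
        }
        try {
            return split.get();
        } finally {
            readOnly.set(false);
        }
    }
}
```

      With the gate closed, the split no longer needs to buffer and replay concurrent updates, which is where the simplification would come from.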

      The actual index splitting could perhaps be improved by using HardlinkCopyDirectoryWrapper to create a copy of the index by hard-linking the existing index segments, and then applying deletes to the documents that don't belong in a given sub-shard. However, the resulting index slices that replicas would have to pull would be the same size as the whole shard.
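
      The hard-linking step can be sketched with plain `java.nio.file` calls. This is an illustration only: `HardLinkIndexCopy` and `linkSegments` are hypothetical names, and the code does not use Lucene's directory API. Each file in the source directory gets a new directory entry that shares the same data on disk, with a regular copy as a fallback where hard links are not available.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative only: plain java.nio, not Lucene's directory API.
final class HardLinkIndexCopy {
    /** Links every regular file in source into target; returns how many files were linked or copied. */
    static int linkSegments(Path source, Path target) throws IOException {
        Files.createDirectories(target);
        int count = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(source)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip sub-directories and other non-file entries
                }
                Path link = target.resolve(file.getFileName());
                try {
                    Files.createLink(link, file); // new directory entry, same data on disk
                } catch (UnsupportedOperationException | IOException e) {
                    Files.copy(file, link);       // fallback when hard links are unavailable
                }
                count++;
            }
        }
        return count;
    }
}
```

      Linking makes the local copy cheap, but it does not shrink anything: the segment files are shared unchanged and deletes are recorded separately, so each sub-shard slice still carries the full set of segment files until they are rewritten.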

        Attachments

        1. SOLR-12509.patch
          86 kB
          Andrzej Bialecki
        2. SOLR-12509.patch
          85 kB
          Andrzej Bialecki

            People

            • Assignee: Andrzej Bialecki (ab)
            • Reporter: Andrzej Bialecki (ab)
            • Votes: 0
            • Watchers: 3
