Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: 4.0-ALPHA
    • Component/s: SolrCloud, update
    • Labels:
      None

      Description

      Here's a simple update processor for distributed indexing that I implemented years ago.
      It routes each document with a simple hash(id) MOD nservers and just fails if any server is down.
      Given the recent activity in distributed indexing, I thought this might be at least a good source for ideas.
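
The hash(id) MOD nservers routing described above can be sketched roughly as follows. This is a minimal illustration, not the code from the attached DistributedUpdateProcessorFactory.java: the class name `HashRouter`, its methods, and the use of `String.hashCode()` as the hash function are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not the attached processor. */
final class HashRouter {
    private final List<String> shards;

    HashRouter(List<String> shards) {
        if (shards.isEmpty()) {
            throw new IllegalArgumentException("at least one shard is required");
        }
        // Defensive copy: the order of this list defines the routing.
        this.shards = new ArrayList<>(shards);
    }

    /** hash(id) MOD nservers, kept non-negative even for negative hash codes. */
    int indexFor(String id) {
        // Assumption: String.hashCode() stands in for whatever hash the real code uses.
        return Math.floorMod(id.hashCode(), shards.size());
    }

    /** The shard address that should index the document with this id. */
    String shardFor(String id) {
        return shards.get(indexFor(id));
    }
}
```

A scheme like this only works if every server sees the shard list in the same order, and it has no fallback when a target server is unreachable, which matches the "just fails" behaviour mentioned in the description.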

      1. TestDistributedUpdate.java
        9 kB
        Yonik Seeley
      2. DistributedUpdateProcessorFactory.java
        12 kB
        Yonik Seeley

          Activity

          Shalin Shekhar Mangar added a comment -

          This was fixed as part of SOLR-2358 in the initial release of SolrCloud.

          Steve Rowe added a comment -

          Bulk move 4.4 issues to 4.5 and 5.0

          Hoss Man added a comment -

          Bulk of fixVersion=3.6 -> fixVersion=4.0 for issues that have no assignee and have not been updated recently.

          email notification suppressed to prevent mass-spam
          pseudo-unique token identifying these issues: hoss20120321nofix36

          Otis Gospodnetic added a comment -

          Yonik - this code has Copyright Lucid 2008 notice in it...

          Are the attached classes from February the latest code one should work from?

          Patrick Sauts added a comment -

          This does not seem to work with Solr 3.4, because of these calls:
          new AddUpdateCommand(req); (line 181)
          new DeleteUpdateCommand(req); (line 199)

          Robert Muir added a comment -

          3.4 -> 3.5

          Mark Miller added a comment -

          This class does not currently propagate the commitWithin and overwrite settings from AddUpdateCommands properly. I've got some code to address this that we should probably incorporate before committing. The standard UpdateRequest class that is used does not allow for fine-grained commitWithin (only per request, not per document like you can do with an AddUpdateCommand), or any overwrite. Also, these settings are not cloned when cloning the command. By adding those properties to the clone, and using a new UpdateRequest that has an add(SolrDoc, commitWithin, overwrite), this can be addressed.
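
To make the idea of a per-document add(SolrDoc, commitWithin, overwrite) concrete, here is a minimal sketch of a request object that records both settings for each document instead of once per request. It is not the code referred to in this comment and it does not use SolrJ: `PerDocUpdateRequest`, `Entry`, the plain `Map` standing in for a Solr document, and the use of -1 for "commitWithin not set" are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for an update request that tracks settings per document. */
final class PerDocUpdateRequest {

    /** One queued document plus the settings copied from its add command. */
    static final class Entry {
        final Map<String, Object> doc;
        final int commitWithin;   // milliseconds; -1 is assumed to mean "not set"
        final boolean overwrite;

        Entry(Map<String, Object> doc, int commitWithin, boolean overwrite) {
            this.doc = doc;
            this.commitWithin = commitWithin;
            this.overwrite = overwrite;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    /** Queue a document together with its own commitWithin and overwrite values. */
    PerDocUpdateRequest add(Map<String, Object> doc, int commitWithin, boolean overwrite) {
        entries.add(new Entry(doc, commitWithin, overwrite));
        return this;
    }

    /** Read-only view of the queued documents, in insertion order. */
    List<Entry> entries() {
        return Collections.unmodifiableList(entries);
    }
}
```

With a shape like this, the forwarding side can serialize each document with the settings it arrived with, rather than applying one commitWithin value to the whole batch.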

          Robert Muir added a comment -

          Bulk move 3.2 -> 3.3

          Mark Miller added a comment -

          One thing we should probably address is the brittle cmd cloning. I don't like clone methods in general - but if we are going to do it in core code, better to put the clone in the cmd and be a bit less brittle.
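
One way to read "put the clone in the cmd" is that the command copies itself, so a newly added field only has to be handled in one place. The sketch below illustrates that shape under stated assumptions: `SketchAddCommand` is a hypothetical, simplified class, not Solr's AddUpdateCommand, and its field names and defaults are chosen only for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified add command; not Solr's AddUpdateCommand. */
final class SketchAddCommand {
    Map<String, Object> doc = new HashMap<>();
    int commitWithin = -1;      // assumed default: no per-document commit deadline
    boolean overwrite = true;   // assumed default: replace documents with the same id

    /** The command copies itself, so callers never copy fields by hand. */
    SketchAddCommand copy() {
        SketchAddCommand c = new SketchAddCommand();
        c.doc = new HashMap<>(doc);      // shallow copy of the field map
        c.commitWithin = commitWithin;   // settings travel with the copy
        c.overwrite = overwrite;
        return c;
    }
}
```

External code that copies a command field by field silently drops any setting it does not know about; keeping the copy logic next to the fields makes that omission harder to miss.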

          Yonik Seeley added a comment -

          Some sample configuration:

            <updateRequestProcessorChain name="distrib">
              <processor class="com.lucid.update.DistributedUpdateProcessorFactory">
                <!-- example configuration...
                 "shards" should be in the *same* order for every server
                  in a cluster.  Only "self" should change to represent
                  what server *this* is.
          
                <str name="self">localhost:8983/solr</str>
                <arr name="shards">
                  <str>localhost:8983/solr</str>
                  <str>localhost:7574/solr</str>
                </arr>
                -->
              </processor>
              <processor class="solr.LogUpdateProcessorFactory">
                <int name="maxNumToLog">10</int>
              </processor>
              <processor class="solr.RunUpdateProcessorFactory"/>
            </updateRequestProcessorChain>
          

          Now on any update command, you can set update.processor=distrib and have distributed indexing controlled by the "shards" and "self" params, either configured in solrconfig.xml or passed in with the update command.

          Or, if you don't want to have to specify update.processor=distrib, you can set it as the default update processor for any update request handler:

            <!-- CSV update handler, loaded on demand -->
            <requestHandler class="solr.CSVRequestHandler" name="/update/csv" startup="lazy">
              <lst name="defaults">
                <str name="update.processor">distrib</str>
              </lst>
            </requestHandler>
          
          Yonik Seeley added a comment -

          Here's the processor class and the test class (not in patch form, I just pulled these files straight from our commercial product).


            People

            • Assignee:
              Unassigned
              Reporter:
              Yonik Seeley
            • Votes:
              4
              Watchers:
              12
