Solr / SOLR-6761

Ability to ignore commit and optimize requests from clients when running in SolrCloud mode.

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.0, 6.0
    • Component/s: SolrCloud, SolrJ
    • Labels:
      None

      Description

      In most SolrCloud environments, it's advisable to rely only on the auto-commits (soft and hard) configured in solrconfig.xml and not to send explicit commit requests from client applications. In fact, I've seen cases where improperly coded client applications send commit requests too frequently, which can harm the cluster's health.

      As a system administrator, I'd like the ability to disallow commit requests from client applications. Ideally, I could configure the updateHandler to ignore the requests and return an HTTP response code of my choosing as I may not want to break existing client applications by returning an error. In other words, I may want to just return 200 vs. 405. The same goes for optimize requests.

      1. SOLR-6761.patch
        12 kB
        Timothy Potter
      2. SOLR-6761.patch
        5 kB
        Timothy Potter

        Activity

        Yonik Seeley added a comment -

        How about something even more general: a minimum commitWithin, plus the ability to downgrade an immediate commit or softCommit to a soft commitWithin.
        Perhaps a special value of -1 could mean disallow / "don't actually do it".

        So minCommitWithin=5000
        would convert an incoming commit to commitWithin=5000
        and would convert commitWithin=10 to commitWithin=5000
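
The clamping rule proposed above can be sketched as follows. This is only an illustration of the suggestion in this comment: minCommitWithin is a proposed setting, not an existing Solr parameter, and the function name, the use of 0 for an immediate commit, and the use of None for "no commit" are assumptions made for the sketch.

```python
from typing import Optional

# Special value floated in the comment: do not perform the commit at all.
DISALLOW = -1


def effective_commit_within(requested_ms: Optional[int],
                            min_commit_within_ms: int) -> Optional[int]:
    """Apply a hypothetical minCommitWithin to an incoming request.

    requested_ms: 0 models an explicit immediate commit, a positive value
    models commitWithin, and None means the request asked for no commit.
    Returns the commitWithin to apply, or None if no commit should happen.
    """
    if requested_ms is None:
        return None  # nothing was requested, so nothing to adjust
    if min_commit_within_ms == DISALLOW:
        return None  # commits from clients are ignored entirely
    # Never commit sooner than the configured minimum.
    return max(requested_ms, min_commit_within_ms)
```

With minCommitWithin=5000, both an immediate commit and commitWithin=10 come out as 5000, while a request that already waits longer than the minimum is left unchanged.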

        Ramkumar Aiyengar added a comment -

        I like the idea, with the minor exception that it sounds wrong to return 200 instead of a 4xx. The client is making an effort to add the commit request and should know that it has not been respected. If it breaks them, so be it; they are doing something the system is not configured to do. They might even rely on the assumption that once the commit is done, the data is immediately available for search.

        Mark Miller added a comment -

        +1, good idea Tim. I think it can make sense to use commitWithin from the client exclusively with SolrCloud, but only when a knowledgeable/expert person/team owns the service. That is very often not the case due to a variety of reasons in my experience. Solr is often deployed in situations where an administrator needs to protect the service from a variety of users with varying expertise.

        I agree with Ram though - I think it makes more sense to make sure the client knows it cannot call commit and adjusts behavior. We just need a useful error message.

        Mark Miller added a comment -

        I don't see why silent fail couldn't be a config option though. There probably are Solr administrators who would like to try to address this without breaking all their clients. It's fairly dangerous if any clients were counting on that behavior though. I think it should come with a big fat warning at least.

        Hoss Man added a comment -

        This would be fairly easy to implement as an UpdateProcessor, which would also give you an easy way to enable/configure it (I thought we already had an open issue for that, but I may just be thinking of the issue about killing Optimize).

        Timothy Potter added a comment -

        Here's a patch that shows the custom UpdateRequestProcessor approach suggested by Hoss Man. The only concern I have is that it needs to be wired into solrconfig.xml. Going with the idea that this feature is for a system administrator, it might make more sense to set this at a more global level, especially if admins let other groups upload their own custom configs and create their own collections. So I'm thinking maybe just a system property (or solr.xml level property) that can be set and that affects the DistributedUpdateProcessor?

        Hoss Man added a comment -

        I don't see how it's really any different from anything else we expect "solr admins" to edit in their solrconfig.xml (like enableStreaming, whether the /update handler should have any defaults on it, etc...)

        But if you really want it to be sysprop driven, that can still be done via the enable property...

        <processor class="solr.IgnoreCommitUpdateProcessorFactory" enable="${solr.ignore.explicit.commit:false}">
          ...
        </processor>
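
To make the `${solr.ignore.explicit.commit:false}` placeholder above concrete, here is a simplified model of `${name:default}` substitution. It is a sketch and not Solr's actual implementation; the function name `substitute` and the `props` dictionary (standing in for JVM system properties) are assumptions.

```python
import re

# Matches ${name} or ${name:default}; the default part is optional.
_PLACEHOLDER = re.compile(r"\$\{([^}:]+)(?::([^}]*))?\}")


def substitute(text: str, props: dict) -> str:
    """Replace each placeholder with its property value or its default."""
    def repl(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        if name in props:
            return props[name]  # an explicitly set property wins
        if default is not None:
            return default  # fall back to the inline default
        raise KeyError(f"no value or default for property {name!r}")

    return _PLACEHOLDER.sub(repl, text)
```

Under this model the processor stays disabled unless the node is started with the property set, for example with -Dsolr.ignore.explicit.commit=true on the JVM command line.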
        
        Erick Erickson added a comment - - edited

        Does Noble's work with an API to alter solrconfig.xml apply at all here?

        Timothy Potter added a comment -

        Moving forward with the updateRequestProcessor approach. Latest patch includes a unit test. To activate this request processor you'll need to add something like the following to your solrconfig.xml:

          <updateRequestProcessorChain name="ignore-commit-from-client" default="true">
            <processor class="solr.IgnoreCommitOptimizeUpdateProcessorFactory">
              <int name="statusCode">200</int>
            </processor>
            <processor class="solr.LogUpdateProcessorFactory" />
            <processor class="solr.DistributedUpdateProcessorFactory" />
            <processor class="solr.RunUpdateProcessorFactory" />
          </updateRequestProcessorChain>
        

        As shown in the example above, the processor returns 200 to the client but ignores the commit / optimize request. Note that you also need to wire in the implicit processors needed by SolrCloud, since this custom chain takes the place of the default chain.

        In the following example, the processor raises an exception with a 403 status code and a customized error message:

          <updateRequestProcessorChain name="ignore-commit-from-client" default="true">
            <processor class="solr.IgnoreCommitOptimizeUpdateProcessorFactory">
              <int name="statusCode">403</int>
              <str name="responseMessage">Thou shall not issue a commit!</str>
            </processor>
            <processor class="solr.LogUpdateProcessorFactory" />
            <processor class="solr.DistributedUpdateProcessorFactory" />
            <processor class="solr.RunUpdateProcessorFactory" />
          </updateRequestProcessorChain>
        

        Lastly, you can also configure it to ignore only optimize requests and let commits pass through:

          <updateRequestProcessorChain name="ignore-optimize-only-from-client-403">
            <processor class="solr.IgnoreCommitOptimizeUpdateProcessorFactory">
              <str name="responseMessage">Thou shall not issue an optimize, but commits are OK!</str>
              <bool name="ignoreOptimizeOnly">true</bool>
            </processor>
            <processor class="solr.RunUpdateProcessorFactory" />
          </updateRequestProcessorChain>
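
The behavior described by the three configurations above can be summarized in a small standalone model. This is a sketch of the decision logic as described in this comment, not the code from the patch: the class and exception names are hypothetical, the default status code of 403 is an assumption suggested only by the name of the last chain, and the default message is made up.

```python
from dataclasses import dataclass


class RequestRejected(Exception):
    """Stand-in for the error the processor raises for non-200 status codes."""

    def __init__(self, status_code: int, message: str):
        super().__init__(message)
        self.status_code = status_code


@dataclass
class IgnoreCommitOptimizeSketch:
    # Field names mirror the config keys above; the defaults are assumptions.
    status_code: int = 403
    response_message: str = "Explicit commit/optimize requests are forbidden"
    ignore_optimize_only: bool = False

    def process_commit(self, optimize: bool) -> str:
        """Return "forwarded" or "ignored", or raise RequestRejected."""
        if self.ignore_optimize_only and not optimize:
            return "forwarded"  # plain commits pass down the chain
        if self.status_code == 200:
            return "ignored"  # dropped silently; the client still sees success
        raise RequestRejected(self.status_code, self.response_message)
```

In this model a processor configured with statusCode 200 swallows both commits and optimizes, while one configured with ignoreOptimizeOnly forwards commits and rejects only optimize requests.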
        

        One idea I had for making this easier to turn on globally would be to wire it into the implicit chain definition (in SolrCore). The patch doesn't do this yet, but in SolrCore, when the implicit chain is set up, we could enable this if the node is in SolrCloud mode and a system property (solr.ignoreCommitOptimizeFromClients=both|optimize) is set.

        ASF subversion and git services added a comment -

        Commit 1648775 from Timothy Potter in branch 'dev/trunk'
        [ https://svn.apache.org/r1648775 ]

        SOLR-6761: Ability to ignore commit and optimize requests from clients when running in SolrCloud mode.

        ASF subversion and git services added a comment -

        Commit 1650097 from Timothy Potter in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1650097 ]

        SOLR-6761: Ability to ignore commit and optimize requests from clients when running in SolrCloud mode.

        Alexandre Rafalovitch added a comment -

        Just to clarify, the implementation itself does not care whether this is Cloud mode or not. You are leaving that for the sysadmin to set with the enable property, right?

        So, one could wire it up in a standalone mode, if they wanted to. Nothing prevents them. If so, maybe the description (in Readme) should say that it allows rejecting commits/optimize and something like "primarily useful for SolrCloud mode".

        Anshum Gupta added a comment -

        Bulk close after 5.0 release.


          People

          • Assignee:
            Timothy Potter
            Reporter:
            Timothy Potter
          • Votes:
            1
            Watchers:
            9
