Solr
  1. Solr
  2. SOLR-5890

Delete silently fails if not sent to shard where document was added

    Details

      Description

      We have SolrCloud set up with two shards, each with a leader and a replica. We use haproxy to distribute requests between the four nodes.

      Regardless of which node we send an add request to, following a commit, the newly-added document is returned in a search, as expected.

      However, we can only delete a document if the delete request is sent to a node in the shard where the document was added. If we send the delete request to a node in the other shard (and then send a commit) the document is not deleted. Such a delete request will get a 200 response, with the following body:

      {'responseHeader'=>{'status'=>0,'QTime'=>7}}

      Apart from the the very low QTime, this is indistinguishable from a successful delete.

      1. 5890_tests.patch
        11 kB
        Ishan Chattopadhyaya
      2. SOLR-5890.patch
        31 kB
        Ishan Chattopadhyaya
      3. SOLR-5890.patch
        31 kB
        Ishan Chattopadhyaya
      4. SOLR-5890.patch
        30 kB
        Noble Paul
      5. SOLR-5890.patch
        29 kB
        Ishan Chattopadhyaya
      6. SOLR-5890.patch
        29 kB
        Ishan Chattopadhyaya
      7. SOLR-5890.patch
        23 kB
        Ishan Chattopadhyaya
      8. SOLR-5890.patch
        15 kB
        Ishan Chattopadhyaya
      9. SOLR-5890.patch
        20 kB
        Ishan Chattopadhyaya
      10. SOLR-5890-without-broadcast.patch
        23 kB
        Ishan Chattopadhyaya
      11. SOLR-5980.patch
        15 kB
        Ishan Chattopadhyaya

        Issue Links

          Activity

          Hide
          Mark Miller added a comment -

          Very strange - the request should be forwarded. This will be interesting.

          Show
          Mark Miller added a comment - Very strange - the request should be forwarded. This will be interesting.
          Hide
          Mark Miller added a comment -

          If you look at the SolrAdmin cloud section, under the zk tree view, what router impl do you see in clusterstate.json? Is it implicit by any chance?

          Show
          Mark Miller added a comment - If you look at the SolrAdmin cloud section, under the zk tree view, what router impl do you see in clusterstate.json? Is it implicit by any chance?
          Hide
          Xavier Riley added a comment -

          Yes router is set to "implicit"

          Show
          Xavier Riley added a comment - Yes router is set to "implicit"
          Hide
          Brett Hoerner added a comment -

          I believe I have the same issue (using implicit also). Is there any way for me as the user to send the equivalent of "route" with a delete by ID? I have enough information to target the right shard, I'm just not sure how to "tell" it that.

          Show
          Brett Hoerner added a comment - I believe I have the same issue (using implicit also). Is there any way for me as the user to send the equivalent of " route " with a delete by ID? I have enough information to target the right shard, I'm just not sure how to "tell" it that.
          Hide
          Peter Inglesby added a comment -

          As Xavier (my colleague) said, the value of the "router" is "implicit".

          Is there anything else you'd like from us in the way of diagnostics?

          Show
          Peter Inglesby added a comment - As Xavier (my colleague) said, the value of the "router" is "implicit". Is there anything else you'd like from us in the way of diagnostics?
          Hide
          Mark Miller added a comment -

          That is why deletes won't distribute. With the implicit router, you take on handling updates yourself.

          Show
          Mark Miller added a comment - That is why deletes won't distribute. With the implicit router, you take on handling updates yourself.
          Hide
          Brett Hoerner added a comment -

          Couldn't (shouldn't) there be an equivalent to setting _route_ but for deletes, though? I'm happy to tell it where to go, and I'd like to avoid having to leave my handy CloudSolrServer and go to handling my own clients and sending every delete directly to specific servers. Or is there already a way to do these deletes using my single CloudSolrServer instance?

          Show
          Brett Hoerner added a comment - Couldn't (shouldn't) there be an equivalent to setting _ route _ but for deletes, though? I'm happy to tell it where to go, and I'd like to avoid having to leave my handy CloudSolrServer and go to handling my own clients and sending every delete directly to specific servers. Or is there already a way to do these deletes using my single CloudSolrServer instance?
          Hide
          Yonik Seeley added a comment -

          Couldn't (shouldn't) there be an equivalent to setting _route_ but for deletes, though?

          Yes, the intention of the "implicit" router was that the default route is defined by the receiving node. But one should also be able to provide a _route_ that is equal to the shard name. I haven't tried this to see if it still works, but I thought that it did at one time.

          Show
          Yonik Seeley added a comment - Couldn't (shouldn't) there be an equivalent to setting _route_ but for deletes, though? Yes, the intention of the "implicit" router was that the default route is defined by the receiving node. But one should also be able to provide a _route_ that is equal to the shard name. I haven't tried this to see if it still works, but I thought that it did at one time.
          Hide
          Brett Hoerner added a comment -

          Where can we provide a _route_ given the API of (new UpdateRequest).deleteById(1); ? Is there some other API I should be using for this?

          Show
          Brett Hoerner added a comment - Where can we provide a _ route _ given the API of (new UpdateRequest).deleteById(1); ? Is there some other API I should be using for this?
          Hide
          Mark Miller added a comment -

          You should be able to simply use UpdateRequest#setParam("route", "value").

          Feel free to open an issue and provide a patch for first class support

          Show
          Mark Miller added a comment - You should be able to simply use UpdateRequest#setParam(" route ", "value"). Feel free to open an issue and provide a patch for first class support
          Hide
          Brett Hoerner added a comment -

          Ah, I gotcha. I thought you meant I could set it per ID delete (vs at the request level), sort of like I can set _route_ inside of a single SolrDocument.

          But then I suppose that's what you mean by first class?

          Show
          Brett Hoerner added a comment - Ah, I gotcha. I thought you meant I could set it per ID delete (vs at the request level), sort of like I can set _ route _ inside of a single SolrDocument. But then I suppose that's what you mean by first class?
          Hide
          Peter Inglesby added a comment -

          Mark, can you point us to where the SolrCloud router configuration is documented?

          Show
          Peter Inglesby added a comment - Mark, can you point us to where the SolrCloud router configuration is documented?
          Show
          Brett Hoerner added a comment - See "Document Routing" here: https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud
          Hide
          Shawn Grant added a comment -

          I'm seeing the same problem on 4.6.1 using compositeId and a custom routing field.

          "router":

          { "field":"docHash", "name":"compositeId"}

          ,

          fyi, I'm wanting to use the docHash field for routing so that duplicate docs are on the same shard allowing for queries using grouping... or did I misread the Document Routing wiki page and configure that incorrectly?

          Show
          Shawn Grant added a comment - I'm seeing the same problem on 4.6.1 using compositeId and a custom routing field. "router": { "field":"docHash", "name":"compositeId"} , fyi, I'm wanting to use the docHash field for routing so that duplicate docs are on the same shard allowing for queries using grouping... or did I misread the Document Routing wiki page and configure that incorrectly?
          Hide
          Shalin Shekhar Mangar added a comment -

          But then I suppose that's what you mean by first class?

          I think Mark meant that we can add a SolrJ method to set _route_ on the request.

          Show
          Shalin Shekhar Mangar added a comment - But then I suppose that's what you mean by first class? I think Mark meant that we can add a SolrJ method to set _route_ on the request.
          Hide
          Brett Hoerner added a comment -

          I see. Would you agree it's a bit odd that I can add a different _route_ to 1000 documents and send them in a single UpdateRequest, but if I want to delete 1000 different documents by ID I need to set a top-level _route_ on the entire request (vs somehow setting it per delete)? That, to me, is the first class bit that's missing.

          I'm not asking someone to write it for me, btw. I just want to make sure my idea is sane here.

          Show
          Brett Hoerner added a comment - I see. Would you agree it's a bit odd that I can add a different _route_ to 1000 documents and send them in a single UpdateRequest, but if I want to delete 1000 different documents by ID I need to set a top-level _route_ on the entire request (vs somehow setting it per delete)? That, to me, is the first class bit that's missing. I'm not asking someone to write it for me, btw. I just want to make sure my idea is sane here.
          Hide
          Yonik Seeley added a comment -

          That, to me, is the first class bit that's missing.

          Yep, the ability to add routing info to any update (even in a batch) was planned from the start - but no one has gotten around to it yet.

          Show
          Yonik Seeley added a comment - That, to me, is the first class bit that's missing. Yep, the ability to add routing info to any update (even in a batch) was planned from the start - but no one has gotten around to it yet.
          Hide
          Mark Miller added a comment -

          Yes, I essentially meant all these things. First class solrj api support, first class UpdateRequest support, etc. For properties that can be per update or per doc, that means adding support like we currently have for version and commitWithin.

          The tricky part is compat - we use a map so it's easy to extend, but in the rolling update case, you could send these fine grained properties from new nodes and old nodes would ignore the new ones.

          Show
          Mark Miller added a comment - Yes, I essentially meant all these things. First class solrj api support, first class UpdateRequest support, etc. For properties that can be per update or per doc, that means adding support like we currently have for version and commitWithin. The tricky part is compat - we use a map so it's easy to extend, but in the rolling update case, you could send these fine grained properties from new nodes and old nodes would ignore the new ones.
          Hide
          Shawn Grant added a comment -

          I was able to work around the issue and get the delete command distributed by specifying the id as a query instead of directly by id.

          Show
          Shawn Grant added a comment - I was able to work around the issue and get the delete command distributed by specifying the id as a query instead of directly by id.
          Hide
          Uwe Schindler added a comment -

          Move issue to Solr 4.9.

          Show
          Uwe Schindler added a comment - Move issue to Solr 4.9.
          Hide
          José Joaquín added a comment -

          My delete command by ID on Solr 4.7.1 (implicit routing) works with commit=true even when it's not adressed to the shard where the document resides.
          But commitWithin only works when the command is adressed to the precise shard.

          Show
          José Joaquín added a comment - My delete command by ID on Solr 4.7.1 (implicit routing) works with commit=true even when it's not adressed to the shard where the document resides. But commitWithin only works when the command is adressed to the precise shard.
          Hide
          José Joaquín added a comment -

          Solr 4.10.1, a collection with only one shard and 2 replicas, router compositeId.

          /update?stream.body=<delete><id>1</id></delete>&commit=true
          works fine. The document is removed from both replicas.

          /update?stream.body=<delete><query>ID:1</query></delete>&commitWithin=x and
          /update?stream.body=<delete><id>1</id></delete>&commitWithin=x
          don't work. The document remains in one replica.

          Show
          José Joaquín added a comment - Solr 4.10.1, a collection with only one shard and 2 replicas, router compositeId. /update?stream.body=<delete><id>1</id></delete>&commit=true works fine. The document is removed from both replicas. /update?stream.body=<delete><query>ID:1</query></delete>&commitWithin=x and /update?stream.body=<delete><id>1</id></delete>&commitWithin=x don't work. The document remains in one replica.
          Hide
          Noble Paul added a comment -

          Implicit router can work 2 ways.

          1. Docs reside in the node where it is sent to
          2. Or use a route as the actual name of the shard

          Whenever it is not possible to identify the shard correctly (either using id or router.field ) the requests should be distributed across the cluster

          Show
          Noble Paul added a comment - Implicit router can work 2 ways. Docs reside in the node where it is sent to Or use a route as the actual name of the shard Whenever it is not possible to identify the shard correctly (either using id or router.field ) the requests should be distributed across the cluster
          Hide
          Ishan Chattopadhyaya added a comment -

          Here are my observations after some limited testing:

          Implicit router
          -----------------
          1. Without router.field defined, deleteByQuery.
          This worked for me, even without route, while sending the delete update request to a different shard.

          2. Without router.field defined, deleteById.
          This worked for me only once I send the route parameter with the correct shard name.

          3. With router.field defined, deleteByQuery.
          This worked for me, even without route, while sending the delete update request to a different shard.

          4. With router.field defined, deleteById.
          This worked for me with route parameter, while sending the delete update request to a different shard.

          Show
          Ishan Chattopadhyaya added a comment - Here are my observations after some limited testing: Implicit router ----------------- 1. Without router.field defined, deleteByQuery. This worked for me, even without route , while sending the delete update request to a different shard. 2. Without router.field defined, deleteById. This worked for me only once I send the route parameter with the correct shard name. 3. With router.field defined, deleteByQuery. This worked for me, even without route , while sending the delete update request to a different shard. 4. With router.field defined, deleteById. This worked for me with route parameter, while sending the delete update request to a different shard.
          Hide
          Ishan Chattopadhyaya added a comment - - edited

          For supporting route parameter per id in a deleteById request, instead of a route parameter per request, I propose the following changes to the request format.

          JSON:
          "delete":

          { "id":"ID" , "_route_":"route"}

          XML:
          <delete>
          <id "route"="shard1">123</id>
          <id "route"="shard2">124</id>
          </delete>

          Also, like
          UpdateRequest.deleteById(String id),
          I propose another method:
          UpdateRequest.deleteById(String id, String route)

          Can someone please review these suggestions?

          Show
          Ishan Chattopadhyaya added a comment - - edited For supporting route parameter per id in a deleteById request, instead of a route parameter per request, I propose the following changes to the request format. JSON: "delete": { "id":"ID" , "_route_":"route"} XML: <delete> <id " route "="shard1">123</id> <id " route "="shard2">124</id> </delete> Also, like UpdateRequest.deleteById(String id), I propose another method: UpdateRequest.deleteById(String id, String route) Can someone please review these suggestions?
          Hide
          Ishan Chattopadhyaya added a comment - - edited

          > Whenever it is not possible to identify the shard correctly (either using id or router.field ) the requests should be distributed across the cluster

          Noble Paul For a deleteById (implicit or compositeId router) without a _route_ parameter, do you think it is reasonable to distribute the request across the cluster always?

          Show
          Ishan Chattopadhyaya added a comment - - edited > Whenever it is not possible to identify the shard correctly (either using id or router.field ) the requests should be distributed across the cluster Noble Paul For a deleteById (implicit or compositeId router) without a _ route _ parameter, do you think it is reasonable to distribute the request across the cluster always?
          Hide
          Shalin Shekhar Mangar added a comment -

          2. Without router.field defined, deleteById.
          This worked for me only once I send the route parameter with the correct shard name.

          Are you sure? Because no router.field means that unique key is being used for routing and DistributedUpdateProcessor#processDelete uses the indexed id to hash and find the right shard. I wrote a quick test to confirm this.

          Show
          Shalin Shekhar Mangar added a comment - 2. Without router.field defined, deleteById. This worked for me only once I send the route parameter with the correct shard name. Are you sure? Because no router.field means that unique key is being used for routing and DistributedUpdateProcessor#processDelete uses the indexed id to hash and find the right shard. I wrote a quick test to confirm this.
          Hide
          Ishan Chattopadhyaya added a comment - - edited

          Shalin Shekhar Mangar Yes, I tested it (patch attached for the test). Here, as far as I understand, since the router is implicit, there is no hashing of the id performed to find the right shard; instead the deleteById is performed on the shard to which the update request is sent to (unless a specific route param is present). Maybe I'm missing something?

          Show
          Ishan Chattopadhyaya added a comment - - edited Shalin Shekhar Mangar Yes, I tested it (patch attached for the test). Here, as far as I understand, since the router is implicit, there is no hashing of the id performed to find the right shard; instead the deleteById is performed on the shard to which the update request is sent to (unless a specific route param is present). Maybe I'm missing something?
          Hide
          Shalin Shekhar Mangar added a comment -

          Ah I missed the implicit router part. Yes, that makes sense.

          Show
          Shalin Shekhar Mangar added a comment - Ah I missed the implicit router part. Yes, that makes sense.
          Hide
          Shalin Shekhar Mangar added a comment -

          Whenever it is not possible to identify the shard correctly (either using id or router.field ) the requests should be distributed across the cluster

          We should also support route parameter for updates. So if a route is specified then that should be used to send the update to the correct node. If no route is specified and the router is implicit, we should send the request across the cluster.

          As an extension to this issue, we should add first-class support for route for delete-by-id in the request formats. But that can be done as another issue.

          Show
          Shalin Shekhar Mangar added a comment - Whenever it is not possible to identify the shard correctly (either using id or router.field ) the requests should be distributed across the cluster We should also support route parameter for updates. So if a route is specified then that should be used to send the update to the correct node. If no route is specified and the router is implicit, we should send the request across the cluster. As an extension to this issue, we should add first-class support for route for delete-by-id in the request formats. But that can be done as another issue.
          Hide
          Ishan Chattopadhyaya added a comment -

          Attached patch that has the following fix:
          1. deleteById command now has a _route_ parameter.
          2. commitWithin wasn't working (at least with the implicit router). Added a fix in SolrCmdDistributor.

          TODO:
          Absent _route_ parameter should distribute the deleteById command to all shard leaders with implicit router.

          Show
          Ishan Chattopadhyaya added a comment - Attached patch that has the following fix: 1. deleteById command now has a _ route _ parameter. 2. commitWithin wasn't working (at least with the implicit router). Added a fix in SolrCmdDistributor. TODO: Absent _ route _ parameter should distribute the deleteById command to all shard leaders with implicit router.
          Hide
          Ishan Chattopadhyaya added a comment - - edited

          Updated the patch with the above TODO item. This patch now has:

          1. deleteById command now has a route parameter.
          2. commitWithin wasn't working (at least with the implicit router). Added a fix in SolrCmdDistributor.
          3. deleteById command (implicit router) should be distributed to all shard leaders if no route parameter is present.

          Noble Paul, Shalin Shekhar Mangar, please review (review request https://reviews.apache.org/r/29561/).

          Show
          Ishan Chattopadhyaya added a comment - - edited Updated the patch with the above TODO item. This patch now has: 1. deleteById command now has a route parameter. 2. commitWithin wasn't working (at least with the implicit router). Added a fix in SolrCmdDistributor. 3. deleteById command (implicit router) should be distributed to all shard leaders if no route parameter is present. Noble Paul , Shalin Shekhar Mangar , please review (review request https://reviews.apache.org/r/29561/ ).
          Hide
          Noble Paul added a comment -

          The logic of identifying the slice should not be in DistributedUpdateProcessor . Lets move it to the router implementation and keep DistributedUpdateProcessor totally agnostic of that part .

          Please add support to JSONLoader for this format

          Show
          Noble Paul added a comment - The logic of identifying the slice should not be in DistributedUpdateProcessor . Lets move it to the router implementation and keep DistributedUpdateProcessor totally agnostic of that part . Please add support to JSONLoader for this format
          Hide
          Ishan Chattopadhyaya added a comment -

          Thanks for your review, Noble. I've made the two changes and updated the patch.

          Show
          Ishan Chattopadhyaya added a comment - Thanks for your review, Noble. I've made the two changes and updated the patch.
          Hide
          Yonik Seeley added a comment -

          deleteById command (implicit router) should be distributed to all shard leaders if no route parameter is present.

          I'm not sure we should do this by default...

          • Previously working systems will now work differently (hopefully just slow down)
          • It doesn't alleviate the fundamental issue that clients need to know where documents live with the _implicit_ router... If one updates a document, they need to know the correct shard to send it to or else it will cause a duplicate in the system.

          One thing to consider is to have some sort of strict mode... or even a different router, that requires _route_ to be set for all adds/deletes (and can optionally broadcast deletes).

          Show
          Yonik Seeley added a comment - deleteById command (implicit router) should be distributed to all shard leaders if no route parameter is present. I'm not sure we should do this by default... Previously working systems will now work differently (hopefully just slow down) It doesn't alleviate the fundamental issue that clients need to know where documents live with the _implicit_ router... If one updates a document, they need to know the correct shard to send it to or else it will cause a duplicate in the system. One thing to consider is to have some sort of strict mode... or even a different router, that requires _route_ to be set for all adds/deletes (and can optionally broadcast deletes).
          Hide
          Ishan Chattopadhyaya added a comment - - edited

          Removed the broadcast deleteById commands (without _route_ parameter, implicit router) and added a new patch. We can discuss the need for broadcast of deleteById for implicit router (without _route_) in another issue (SOLR-6910).

          Show
          Ishan Chattopadhyaya added a comment - - edited Removed the broadcast deleteById commands (without _ route _ parameter, implicit router) and added a new patch. We can discuss the need for broadcast of deleteById for implicit router (without _ route _) in another issue ( SOLR-6910 ).
          Hide
          Noble Paul added a comment -

          This does not take care of the case when hash based routing is not done on the uniqueKey field

          Show
          Noble Paul added a comment - This does not take care of the case when hash based routing is not done on the uniqueKey field
          Hide
          Ishan Chattopadhyaya added a comment -

          Ah, indeed! I missed that part. I'll update the patch.

          Show
          Ishan Chattopadhyaya added a comment - Ah, indeed! I missed that part. I'll update the patch.
          Hide
          Ishan Chattopadhyaya added a comment -

          Updated the patch, now with the Hash based router also honouring the _route_ param.

          1. deleteById command now has a _route_ parameter. In implicit router, the target shard can be specified. In compositeId router, the route parameter is hashed to obtain the target slice (useful for collections that use router.field).
          2. commitWithin wasn't working. Added a fix in SolrCmdDistributor.

          Show
          Ishan Chattopadhyaya added a comment - Updated the patch, now with the Hash based router also honouring the _ route _ param. 1. deleteById command now has a _ route _ parameter. In implicit router, the target shard can be specified. In compositeId router, the route parameter is hashed to obtain the target slice (useful for collections that use router.field). 2. commitWithin wasn't working. Added a fix in SolrCmdDistributor.
          Hide
          Mark Miller added a comment -

          High level, patch looks good, but adds some bad formatting:

          + if(

          If statements should have a space before (

          +  public String getRoute() {
          +    return route;
          +  }
          +  public void setRoute(String route) {
          +    this.route = route;
          +  }
          

          Should be a newline before public.

          Show
          Mark Miller added a comment - High level, patch looks good, but adds some bad formatting: + if( If statements should have a space before ( + public String getRoute() { + return route; + } + public void setRoute( String route) { + this .route = route; + } Should be a newline before public.
          Hide
          Ishan Chattopadhyaya added a comment -

          Mark Miller, thanks for your review. I've updated the patch with the formatting fixes.

          Show
          Ishan Chattopadhyaya added a comment - Mark Miller , thanks for your review. I've updated the patch with the formatting fixes.
          Hide
          Noble Paul added a comment -

          with formatting corrected

          Show
          Noble Paul added a comment - with formatting corrected
          Hide
          Ishan Chattopadhyaya added a comment -

          Updated the patch with an added test for the XmlLoader changes.

          Show
          Ishan Chattopadhyaya added a comment - Updated the patch with an added test for the XmlLoader changes.
          Hide
          ASF subversion and git services added a comment -

          Commit 1658486 from Noble Paul in branch 'dev/trunk'
          [ https://svn.apache.org/r1658486 ]

          SOLR-5890: Delete silently fails if not sent to shard where document was
          added

          Show
          ASF subversion and git services added a comment - Commit 1658486 from Noble Paul in branch 'dev/trunk' [ https://svn.apache.org/r1658486 ] SOLR-5890 : Delete silently fails if not sent to shard where document was added
          Hide
          ASF subversion and git services added a comment -

          Commit 1658549 from Noble Paul in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1658549 ]

          SOLR-5890: Delete silently fails if not sent to shard where document was added

          Show
          ASF subversion and git services added a comment - Commit 1658549 from Noble Paul in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1658549 ] SOLR-5890 : Delete silently fails if not sent to shard where document was added
          Hide
          Timothy Potter added a comment -

          Bulk close after 5.1 release

          Show
          Timothy Potter added a comment - Bulk close after 5.1 release

            People

            • Assignee:
              Noble Paul
              Reporter:
              Peter Inglesby
            • Votes:
              0 Vote for this issue
              Watchers:
              14 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development