Solr
  1. Solr
  2. SOLR-6035

CloudSolrServer directUpdate routing should use getCoreUrl

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.8.1
    • Component/s: SolrCloud
    • Labels:

      Description

      In a multisharded node environment we were seeing forward-to-leader hops when using CloudSolrServer directUpdate (with the hop being on the same node) . Consequently, there was no improvement in indexing performance over non-directUpdate.

      Changing buildUrlMap to use getCoreUrl eliminated the extra hop and now we see a dramatic improvement in performance.

      1. SOLR-6035.patch
        1 kB
        Joel Bernstein
      2. SOLR-6035.patch
        1 kB
        Marvin Justice

        Activity

        Hide
        Marvin Justice added a comment -

        Uploading patch against branch_4x

        Show
        Marvin Justice added a comment - Uploading patch against branch_4x
        Hide
        Mark Miller added a comment -

        Interesting - that may explain some earlier reports.

        Show
        Mark Miller added a comment - Interesting - that may explain some earlier reports.
        Hide
        Joel Bernstein added a comment -

        This is good, thanks Marvin.

        Show
        Joel Bernstein added a comment - This is good, thanks Marvin.
        Hide
        Erick Erickson added a comment -

        Marvin:

        Can you quantify "dramatic"? Just off the top of your head, I'm curious how big an improvement you're seeing....

        Thanks...

        Show
        Erick Erickson added a comment - Marvin: Can you quantify "dramatic"? Just off the top of your head, I'm curious how big an improvement you're seeing.... Thanks...
        Hide
        Joel Bernstein added a comment -

        I should have some time in the next couple of days to test out the patch. One of things we'll want to be sure of is that there aren't any issues caused by going directly to coreUrl rather then through the collection Url. For exampe, does replication within the shard still happen if you go directly to the coreUrl.

        Show
        Joel Bernstein added a comment - I should have some time in the next couple of days to test out the patch. One of things we'll want to be sure of is that there aren't any issues caused by going directly to coreUrl rather then through the collection Url. For exampe, does replication within the shard still happen if you go directly to the coreUrl.
        Hide
        Marvin Justice added a comment -

        We see average indexing times drop from 9.2 ms to 6.7 ms. The overall timing distribution has a smaller tail for the patched version with stddev dropping from 15.4 ms to 9.0 ms.

        Good point about replication, I didn't actually test that. This work was done on our "alpha" cluster which is 64 shards on 16 nodes spread across 4 machines but with no replication.

        Show
        Marvin Justice added a comment - We see average indexing times drop from 9.2 ms to 6.7 ms. The overall timing distribution has a smaller tail for the patched version with stddev dropping from 15.4 ms to 9.0 ms. Good point about replication, I didn't actually test that. This work was done on our "alpha" cluster which is 64 shards on 16 nodes spread across 4 machines but with no replication.
        Hide
        Ramkumar Aiyengar added a comment -

        As far as I understand `DistributedUpdateProcessor`, indexing to the core itself would take care of replication just fine. But of course, a test confirming this will be good..

        Show
        Ramkumar Aiyengar added a comment - As far as I understand `DistributedUpdateProcessor`, indexing to the core itself would take care of replication just fine. But of course, a test confirming this will be good..
        Hide
        Marvin Justice added a comment -

        Just spun up a 4 shard x 2 replica test collection. Replication seems to be working just fine for me when going directly to the coreUrl.

        Show
        Marvin Justice added a comment - Just spun up a 4 shard x 2 replica test collection. Replication seems to be working just fine for me when going directly to the coreUrl.
        Hide
        Joel Bernstein added a comment -

        New patch using svn diff against trunk.

        Show
        Joel Bernstein added a comment - New patch using svn diff against trunk.
        Hide
        Joel Bernstein added a comment - - edited

        A review of the DistributedUpdateProcessor looks like it should handle going directly to the core.

        The SolrDispatchFilter does attempt to pick out the leader core for a collection. If there are two leaders from the same collection in the same Solr instance then it could pick the wrong leader and this would cause the routing on the server side.

        All tests are passing with the patch. I'll do some manual testing, but so far so good.

        Mark, let me know if you think I'm missing something.

        Show
        Joel Bernstein added a comment - - edited A review of the DistributedUpdateProcessor looks like it should handle going directly to the core. The SolrDispatchFilter does attempt to pick out the leader core for a collection. If there are two leaders from the same collection in the same Solr instance then it could pick the wrong leader and this would cause the routing on the server side. All tests are passing with the patch. I'll do some manual testing, but so far so good. Mark, let me know if you think I'm missing something.
        Hide
        Ramkumar Aiyengar added a comment -

        Joel, 4.8.1 candidate? I know this is tagged an enhancement but is a borderline bug really as it partially negates the purpose of direct updates..

        Show
        Ramkumar Aiyengar added a comment - Joel, 4.8.1 candidate? I know this is tagged an enhancement but is a borderline bug really as it partially negates the purpose of direct updates..
        Hide
        Mark Miller added a comment -

        Haven't had a chance to review the patch, but id consider this a minor bug, so 4.8.1 seems reasonable to me.

        Show
        Mark Miller added a comment - Haven't had a chance to review the patch, but id consider this a minor bug, so 4.8.1 seems reasonable to me.
        Hide
        Joel Bernstein added a comment -

        Still haven't had a chance to do the manual testing. I'll see if I can get this in in-time for 4.8.1.

        Show
        Joel Bernstein added a comment - Still haven't had a chance to do the manual testing. I'll see if I can get this in in-time for 4.8.1.
        Hide
        ASF subversion and git services added a comment -

        Commit 1593174 from Joel Bernstein in branch 'dev/trunk'
        [ https://svn.apache.org/r1593174 ]

        SOLR-6035: CloudSolrServer directUpdate routing should use getCoreUrl

        Show
        ASF subversion and git services added a comment - Commit 1593174 from Joel Bernstein in branch 'dev/trunk' [ https://svn.apache.org/r1593174 ] SOLR-6035 : CloudSolrServer directUpdate routing should use getCoreUrl
        Hide
        ASF subversion and git services added a comment -

        Commit 1593175 from Joel Bernstein in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1593175 ]

        SOLR-6035: CloudSolrServer directUpdate routing should use getCoreUrl

        Show
        ASF subversion and git services added a comment - Commit 1593175 from Joel Bernstein in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1593175 ] SOLR-6035 : CloudSolrServer directUpdate routing should use getCoreUrl
        Hide
        ASF subversion and git services added a comment -

        Commit 1593176 from Joel Bernstein in branch 'dev/branches/lucene_solr_4_8'
        [ https://svn.apache.org/r1593176 ]

        SOLR-6035: CloudSolrServer directUpdate routing should use getCoreUrl

        Show
        ASF subversion and git services added a comment - Commit 1593176 from Joel Bernstein in branch 'dev/branches/lucene_solr_4_8' [ https://svn.apache.org/r1593176 ] SOLR-6035 : CloudSolrServer directUpdate routing should use getCoreUrl

          People

          • Assignee:
            Joel Bernstein
            Reporter:
            Marvin Justice
          • Votes:
            1 Vote for this issue
            Watchers:
            8 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development