SOLR-3369

shards.tolerant=true broken on group queries

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.0-ALPHA
    • Fix Version/s: 4.4
    • Component/s: search
    • Labels:
      None
    • Environment:

      Distributed environment (shards)

      Description

      In a distributed environment, shards.tolerant=true allows for partial results to be returned when individual shards are down. For group=true and facet=true queries, using this feature results in an error when shards are down. This patch allows users to use the shard tolerance feature with facet and grouping queries.
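For illustration, a request that combines shard tolerance with grouping might be assembled as in the sketch below. It uses only the JDK; the host names, the `a_si` group field, and the `TolerantGroupQuery` class are assumptions made up for the example, while `shards`, `shards.tolerant`, `shards.info`, `group`, and `group.field` are the Solr request parameters discussed in this issue.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Solr or of the patch on this issue.
public class TolerantGroupQuery {

    /** Builds a /select URL for a distributed, grouped query that tolerates down shards. */
    static String buildUrl(String baseUrl, String shards, String groupField) {
        // LinkedHashMap keeps the parameters in insertion order for readability.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "*:*");
        params.put("shards", shards);           // comma-separated shard addresses
        params.put("shards.tolerant", "true");  // return partial results if a shard is down
        params.put("shards.info", "true");      // ask for per-shard details in the response
        params.put("group", "true");
        params.put("group.field", groupField);

        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(encode(e.getKey()) + "=" + encode(e.getValue()));
        }
        return baseUrl + "/select?" + query;
    }

    private static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Host names and ports are placeholders.
        System.out.println(buildUrl(
            "http://localhost:8983/solr",
            "localhost:8983/solr,localhost:7574/solr",
            "a_si"));
    }
}
```

Before the fix, a request of this shape is the kind that fails with an error when one of the listed shards is unreachable; with the fix it is expected to return results from the shards that are still up.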

          Activity

          Martijn van Groningen added a comment -

          I took a quick look. Patch also seems to add support for adding shards.info parameter for grouping and adds exception information to the response when a shard request fails.

          I don't think that shards.tolerant is a supported feature in Solr 3.x versions (I couldn't find any reference to this param in the 3.6 source). So I think we shouldn't add this to the 3.6 branch, since that would be adding a new feature to a Solr version that is in maintenance mode.

          Russell Black added a comment -

          Martijn,

          You're right, 3.6 doesn't have this feature. My mistake. Just use the trunk patch if you wish.

          The reason I had a 3.6 version of this patch is that we have backported SOLR-3134 (original shards info/tolerance feature) to 3.6. SOLR-3369-shards-tolerant-3_6.patch is intended to be applied on top of that, and is of no use without it. I just posted our SOLR-3134 backport patch if you have an interest in adding shards.info and shards.tolerant to 3.x. If not, at least the patch will be available for other 3.x users who want that feature.

          Russell Black added a comment -

          Removed 3.6 from the affected versions.

          Russell Black added a comment -

          The patch in SOLR-3557 contained some overlap with the patch on this ticket. I am updating the patch on this ticket accordingly.

          Ferry Landzaat added a comment -

          Is there any plan to fix this issue? We want to upgrade from 3.x and really need this patch to make the system reliable.

          André Bois-Crettez added a comment -

          Same here, with Solr 4.2.1 it is still not possible to do queries with both shards.tolerant=true and grouping.

          Is there anything wrong with the patch itself?

          Ryan McKinley added a comment -

          If someone adds a test to exercise this patch, I'll commit it.

          Jabouille jean Charles added a comment -

          Hi,

          Here is a test; just copy it into solr/core/src/test/org/apache/solr/:

          TestDistributedGroupingWithShardTolerantActivated.java
          package org.apache.solr;
          
          import java.util.ArrayList;
          import java.util.Arrays;
          import java.util.List;
          
          import org.apache.lucene.util.LuceneTestCase.Slow;
          import org.apache.solr.client.solrj.SolrServer;
          import org.apache.solr.client.solrj.SolrServerException;
          import org.apache.solr.client.solrj.embedded.JettySolrRunner;
          import org.apache.solr.cloud.ChaosMonkey;
          import org.apache.solr.common.params.CommonParams;
          import org.apache.solr.common.params.ModifiableSolrParams;
          import org.apache.solr.common.params.ShardParams;
          
          
          @Slow
          public class TestDistributedGroupingWithShardTolerantActivated extends BaseDistributedSearchTestCase {
          
            String t1="a_t";
            String i1="a_si";
            String s1="a_s";
            String tlong = "other_tl1";
            String tdate_a = "a_n_tdt";
            String tdate_b = "b_n_tdt";
            String oddField="oddField_s";
          
            @Override
            public void doTest() throws Exception {
              del("*:*");
              commit();
          
              handle.clear();
              handle.put("QTime", SKIPVAL);
              handle.put("timestamp", SKIPVAL);
              handle.put("grouped", UNORDERED);   // distrib grouping doesn't guarantee order of top level group commands
          
              indexr(id,1, i1, 100, tlong, 100,t1,"now is the time for all good men",
                     tdate_a, "2010-04-20T11:00:00Z",
                     tdate_b, "2009-08-20T11:00:00Z",
                     "foo_f", 1.414f, "foo_b", "true", "foo_d", 1.414d);
              indexr(id,2, i1, 50 , tlong, 50,t1,"to come to the aid of their country.",
                     tdate_a, "2010-05-02T11:00:00Z",
                     tdate_b, "2009-11-02T11:00:00Z");
              indexr(id,3, i1, 2, tlong, 2,t1,"how now brown cow",
                     tdate_a, "2010-05-03T11:00:00Z");
              indexr(id,4, i1, -100 ,tlong, 101,
                     t1,"the quick fox jumped over the lazy dog",
                     tdate_a, "2010-05-03T11:00:00Z",
                     tdate_b, "2010-05-03T11:00:00Z");
              indexr(id,5, i1, 500, tlong, 500 ,
                     t1,"the quick fox jumped way over the lazy dog",
                     tdate_a, "2010-05-05T11:00:00Z");
              indexr(id,6, i1, -600, tlong, 600 ,t1,"humpty dumpy sat on a wall");
              indexr(id,7, i1, 123, tlong, 123 ,t1,"humpty dumpy had a great fall");
              indexr(id,8, i1, 876, tlong, 876,
                     tdate_b, "2010-01-05T11:00:00Z",
                     t1,"all the kings horses and all the kings men");
              indexr(id,9, i1, 7, tlong, 7,t1,"couldn't put humpty together again");
              indexr(id,10, i1, 4321, tlong, 4321,t1,"this too shall pass");
              indexr(id,11, i1, -987, tlong, 987,
                     t1,"An eye for eye only ends up making the whole world blind.");
              indexr(id,12, i1, 379, tlong, 379,
                     t1,"Great works are performed, not by strength, but by perseverance.");
          
              indexr(id, 14, "SubjectTerms_mfacet", new String[]  {"mathematical models", "mathematical analysis"});
              indexr(id, 15, "SubjectTerms_mfacet", new String[]  {"test 1", "test 2", "test3"});
              indexr(id, 16, "SubjectTerms_mfacet", new String[]  {"test 1", "test 2", "test3"});
              String[] vals = new String[100];
              for (int i=0; i<100; i++) {
                vals[i] = "test " + i;
              }
              indexr(id, 17, "SubjectTerms_mfacet", vals);
          
              indexr(
                  id, 18, i1, 232, tlong, 332,
                  t1,"no eggs on wall, lesson learned",
                  oddField, "odd man out"
              );
              indexr(
                  id, 19, i1, 232, tlong, 432,
                  t1, "many eggs on wall",
                  oddField, "odd man in"
              );
              indexr(
                  id, 20, i1, 232, tlong, 532,
                  t1, "some eggs on wall",
                  oddField, "odd man between"
              );
              indexr(
                  id, 21, i1, 232, tlong, 632,
                  t1, "a few eggs on wall",
                  oddField, "odd man under"
              );
              indexr(
                  id, 22, i1, 232, tlong, 732,
                  t1, "any eggs on wall",
                  oddField, "odd man above"
              );
              indexr(
                  id, 23, i1, 233, tlong, 734,
                  t1, "dirty eggs",
                  oddField, "odd eggs"
              );
          
              for (int i = 100; i < 150; i++) {
                indexr(id, i);
              }
          
              int[] values = new int[]{9999, 99999, 999999, 9999999};
              for (int shard = 0; shard < clients.size(); shard++) {
                int groupValue = values[shard];
                for (int i = 500; i < 600; i++) {
                  index_specific(shard, i1, groupValue, s1, "a", id, i * (shard + 1), t1, shard);
                }
              }
          
              commit();
              
              // SOLR-3369: shards.tolerant=true with grouping
              for (int numDownServers = 0; numDownServers < jettys.size() - 1; numDownServers++) {
                List<JettySolrRunner> upJettys = new ArrayList<JettySolrRunner>(jettys);
                List<SolrServer> upClients = new ArrayList<SolrServer>(clients);
                List<JettySolrRunner> downJettys = new ArrayList<JettySolrRunner>();
                List<String> upShards = new ArrayList<String>(Arrays.asList(shardsArr));
                for (int i = 0; i < numDownServers; i++) {
                  // shut down some of the jettys
                  int indexToRemove = r.nextInt(upJettys.size());
                  JettySolrRunner downJetty = upJettys.remove(indexToRemove);
                  upClients.remove(indexToRemove);
                  upShards.remove(indexToRemove);
                  ChaosMonkey.stop(downJetty);
                  downJettys.add(downJetty);
                }
          
                simpleQuery("q", "*:*", "rows", 100, "fl",
                    "id," + i1, "group", "true", "group.query", t1 + ":kings OR " + t1
                        + ":eggs", "group.limit", 10, "sort", i1 + " asc, id asc",
                    CommonParams.TIME_ALLOWED, 1, ShardParams.SHARDS_INFO, "true",
                    ShardParams.SHARDS_TOLERANT, "true");
                
                // restart the jettys
                for (JettySolrRunner downJetty : downJettys) {
                  downJetty.start();
                }
              }
            }
          
            private void simpleQuery(Object... queryParams) throws SolrServerException {
              ModifiableSolrParams params = new ModifiableSolrParams();
              for (int i = 0; i < queryParams.length; i += 2) {
                params.add(queryParams[i].toString(), queryParams[i + 1].toString());
              }
              params.set("shards", shards);
              queryServer(params);
            }
          
          }
          

          Your patch works perfectly in "select" mode. In "browse" mode there is another exception:

          ERROR org.apache.solr.core.SolrCore:log:85  - java.lang.NullPointerException
              at org.apache.solr.handler.component.SpellCheckComponent.finishStage(SpellCheckComponent.java:297)
              at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:317)
          

          Thank you very much for this patch; we hope it will be in the next release of Solr.

          Russell Black added a comment - edited

          This test case fails sometimes because it occasionally tries to use one of the "down" servers as the collator. I'll fix this and include it in a new patch.
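One way to avoid this is to send the query only to a server that is known to be up. The sketch below shows the idea with plain Java collections; `LiveCoordinatorPicker` and `pickCoordinator` are hypothetical names, and this is not the code from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical illustration; the real test works with SolrServer clients and Jetty runners.
public class LiveCoordinatorPicker {

    /** Picks a random entry from the servers that are still up. */
    static <T> T pickCoordinator(List<T> upServers, Random random) {
        if (upServers.isEmpty()) {
            throw new IllegalStateException("no live server left to coordinate the query");
        }
        return upServers.get(random.nextInt(upServers.size()));
    }

    public static void main(String[] args) {
        Random random = new Random();
        List<String> upServers = new ArrayList<>(Arrays.asList("shard0", "shard1", "shard2", "shard3"));

        // Simulate stopping one server: drop it from the live list first,
        // so it can never be chosen as the node that merges shard responses.
        String down = upServers.remove(random.nextInt(upServers.size()));

        String coordinator = pickCoordinator(upServers, random);
        System.out.println("down=" + down + ", coordinator=" + coordinator);
    }
}
```

The point is only that the list of candidates for the coordinating request has to be kept in sync with the list of stopped servers.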

          Russell Black added a comment - edited

          As requested, I added a test case to the patch file.

          Ryan McKinley added a comment -

          The latest patch adds:

          +      // test group query
          +      queryPartialResults(upShards, upClients, 
          +          "q", "*:*", 
          +          "rows", 100, 
          +          "fl", "id," + i1, 
          +          "group", "true", 
          +          "group.query", t1 + ":kings OR " + t1 + ":eggs", 
          +          "group.limit", 10, 
          +          "sort", i1 + " asc, id asc",
          +          CommonParams.TIME_ALLOWED, 1, 
          +          ShardParams.SHARDS_INFO, "true",
          +          ShardParams.SHARDS_TOLERANT, "true");
          

          but does not include TestDistributedGroupingWithShardTolerantActivated.java

          Is this intentional?

          Russell Black added a comment -

          Yes, this is intentional. TestDistributedGroupingWithShardTolerantActivated.java duplicated much of what was in TestDistributedSearch.java. TestDistributedSearch.java already has the framework in place for dealing with partial results.

          Russell Black added a comment -

          Ryan, is there anything preventing this patch from being committed? Was the test case satisfactory?

          Shalin Shekhar Mangar added a comment -

          This patch looks alright on first pass. There's something wrong with the solrcloud tests in trunk right now (unrelated to the patch). I'll wait for it to be fixed and then commit once I get a clean pass on all tests.

          ASF subversion and git services added a comment -

          Commit 1498992 from shalin@apache.org
          [ https://svn.apache.org/r1498992 ]

          SOLR-3369: shards.tolerant=true is broken for group queries

          ASF subversion and git services added a comment -

          Commit 1498993 from shalin@apache.org
          [ https://svn.apache.org/r1498993 ]

          SOLR-3369: shards.tolerant=true is broken for group queries

          Shalin Shekhar Mangar added a comment -

          shards.tolerant already works for facet queries so this issue is only about group queries.

          Steve Rowe added a comment -

          Bulk close resolved 4.4 issues


            People

            • Assignee:
              Shalin Shekhar Mangar
              Reporter:
              Russell Black
            • Votes:
              5
              Watchers:
              12