Uploaded image for project: 'Solr'
  1. Solr
  2. SOLR-8988

Improve facet.method=fcs performance in SolrCloud

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.5, 6.0
    • Fix Version/s: 6.1
    • Component/s: None
    • Labels:
      None

      Description

      This relates to SOLR-8559 – which improves the algorithm used by fcs faceting when facet.mincount=1

      This patch allows facet.mincount to be sent as 1 for distributed queries. As far as I can tell there is no reason to set facet.mincount=0 for refinement purposes . After trying to make sense of all the refinement logic, I cant see how the difference between no value and value=0 would have a negative effect.

      Test perf:

      • ~15million unique terms
      • query matches ~3million documents

      Params:

      facet.mincount=1
      facet.limit=500
      facet.method=fcs
      facet.sort=count
      

      Average Time Per Request:

      • Before patch: ~20seconds
      • After patch: <1 second

      Note: all tests pass and in my test, the output was identical before and after patch.

      1. Screen Shot 2016-04-25 at 2.54.47 PM.png
        95 kB
        Keith Laban
      2. Screen Shot 2016-04-25 at 2.55.00 PM.png
        118 kB
        Keith Laban
      3. SOLR-8988.patch
        6 kB
        Keith Laban
      4. SOLR-8988.patch
        5 kB
        Dennis Gove
      5. SOLR-8988.patch
        5 kB
        Keith Laban
      6. SOLR-8988.patch
        1 kB
        Keith Laban

        Activity

        Hide
        k317h Keith Laban added a comment -

        I'm not sure who would be best to look at this. Maybe Yonik Seeley or Erik Eriksson would be more familiar with this code path. Is there any reason this wouldn't work?

        Show
        k317h Keith Laban added a comment - I'm not sure who would be best to look at this. Maybe Yonik Seeley or Erik Eriksson would be more familiar with this code path. Is there any reason this wouldn't work?
        Hide
        dragonsinth Scott Blum added a comment -

        +1, we ran across this too, and couldn't think of any reason for the inefficiency here

        Show
        dragonsinth Scott Blum added a comment - +1, we ran across this too, and couldn't think of any reason for the inefficiency here
        Hide
        hossman Hoss Man added a comment -

        As far as I can tell there is no reason to set facet.mincount=0 for refinement purposes . After trying to make sense of all the refinement logic, I cant see how the difference between no value and value=0 would have a negative

        i haven't looked closely, but IIRC the justification for this comment...

        -          dff.initialMincount = 0; // TODO: we could change this to 1, but would
        -                                   // then need more refinement for small facet
        -                                   // result sets?
        

        is that if you get back a count of foo=0 from shardA, and if foo winds up being a candidate term for the final topN list because of it's count on other shards, then you know definitively that you don't have to ask shardA to provide a refinement value for "foo" - you already know it's count.

        which behavior is more performant in the most common cases? ... i have no idea off the top of my head ... i'd have ot really sit down and think about all the variables.

        what would probably make the most sense is to add an expert level option for controlling this (similar to the overrequest options) and leave the default as it is for now – that way people have one more knob they can try turning to tune performance, and if we decide later that the default behavior should be changed in the common case, it's easy to do.

        Show
        hossman Hoss Man added a comment - As far as I can tell there is no reason to set facet.mincount=0 for refinement purposes . After trying to make sense of all the refinement logic, I cant see how the difference between no value and value=0 would have a negative i haven't looked closely, but IIRC the justification for this comment... - dff.initialMincount = 0; // TODO: we could change this to 1, but would - // then need more refinement for small facet - // result sets? is that if you get back a count of foo=0 from shardA, and if foo winds up being a candidate term for the final topN list because of it's count on other shards, then you know definitively that you don't have to ask shardA to provide a refinement value for "foo" - you already know it's count. which behavior is more performant in the most common cases? ... i have no idea off the top of my head ... i'd have ot really sit down and think about all the variables. what would probably make the most sense is to add an expert level option for controlling this (similar to the overrequest options) and leave the default as it is for now – that way people have one more knob they can try turning to tune performance, and if we decide later that the default behavior should be changed in the common case, it's easy to do.
        Hide
        hossman Hoss Man added a comment -
        • ~15million unique terms
        • query matches ~3million documents

        Other key factors here are going to be:

        • num shards
        • num docs per shard
        • num terms in field
        • num terms with non-zero facet counts for docs matching query on a per shard basis
        • how much variance there is in the num terms with non-zero facet counts for docs matching query on a per shard basis
        Show
        hossman Hoss Man added a comment - ~15million unique terms query matches ~3million documents Other key factors here are going to be: num shards num docs per shard num terms in field num terms with non-zero facet counts for docs matching query on a per shard basis how much variance there is in the num terms with non-zero facet counts for docs matching query on a per shard basis
        Hide
        k317h Keith Laban added a comment -

        For clarity of this test:

        num shards

        • 12

        num docs per shard

        • ~70 million

        num terms in field

        • ~15 million

        num terms with non-zero facet counts for docs matching query on a per shard basis

        • ~90k

        how much variance there is in the num terms with non-zero facet counts for docs matching query on a per shard basis

        • evenly distributed

        ...is that if you get back a count of foo=0 from shardA, and if foo winds up being a candidate term for the final topN list because of it's count on other shards, then you know definitively that you don't have to ask shardA to provide a refinement value for "foo" - you already know it's count.

        This is the part that I would argue doesn't matter. Consider you asked for 10 terms from shardA with mincount =1 and you received only 5 terms back. Then you know that if foo was in shardB, but not in shardA the maximum count it could have had in shardA was 0, otherwise it would have been returned in the initial request.

        On the other hand if you ask for 10 terms with mincount=1 and you get back 10 terms with a count >=1 well the response back would have been identical if mincount=0.

        Logic aids refinement pulled from – FacetComponent.DistributedFieldFacet

            void add(int shardNum, NamedList shardCounts, int numRequested) {
              // shardCounts could be null if there was an exception
              int sz = shardCounts == null ? 0 : shardCounts.size();
              int numReceived = sz;
              
              FixedBitSet terms = new FixedBitSet(termNum + sz);
        
              long last = 0;
              for (int i = 0; i < sz; i++) {
                String name = shardCounts.getName(i);
                long count = ((Number) shardCounts.getVal(i)).longValue();
                if (name == null) {
                  missingCount += count;
                  numReceived--;
                } else {
                  ShardFacetCount sfc = counts.get(name);
                  if (sfc == null) {
                    sfc = new ShardFacetCount();
                    sfc.name = name;
                    sfc.indexed = ftype == null ? sfc.name : ftype.toInternal(sfc.name);
                    sfc.termNum = termNum++;
                    counts.put(name, sfc);
                  }
                  sfc.count += count;
                  terms.set(sfc.termNum);
                  last = count;
                }
              }
              
              // the largest possible missing term is initialMincount if we received
              // less than the number requested.
              if (numRequested < 0 || numRequested != 0 && numReceived < numRequested) {
                last = initialMincount;
              }
              
              missingMaxPossible += last;
              missingMax[shardNum] = last;
              counted[shardNum] = terms;
            }
        

        However I think this line block should also be changed.

              if (numRequested < 0 || numRequested != 0 && numReceived < numRequested) {
                last = Math.max(initialMincount-1, 0);
              }
        
        Show
        k317h Keith Laban added a comment - For clarity of this test: num shards 12 num docs per shard ~70 million num terms in field ~15 million num terms with non-zero facet counts for docs matching query on a per shard basis ~90k how much variance there is in the num terms with non-zero facet counts for docs matching query on a per shard basis evenly distributed ...is that if you get back a count of foo=0 from shardA, and if foo winds up being a candidate term for the final topN list because of it's count on other shards, then you know definitively that you don't have to ask shardA to provide a refinement value for "foo" - you already know it's count. This is the part that I would argue doesn't matter. Consider you asked for 10 terms from shardA with mincount =1 and you received only 5 terms back. Then you know that if foo was in shardB, but not in shardA the maximum count it could have had in shardA was 0, otherwise it would have been returned in the initial request. On the other hand if you ask for 10 terms with mincount=1 and you get back 10 terms with a count >=1 well the response back would have been identical if mincount=0. Logic aids refinement pulled from – FacetComponent.DistributedFieldFacet void add( int shardNum, NamedList shardCounts, int numRequested) { // shardCounts could be null if there was an exception int sz = shardCounts == null ? 0 : shardCounts.size(); int numReceived = sz; FixedBitSet terms = new FixedBitSet(termNum + sz); long last = 0; for ( int i = 0; i < sz; i++) { String name = shardCounts.getName(i); long count = (( Number ) shardCounts.getVal(i)).longValue(); if (name == null ) { missingCount += count; numReceived--; } else { ShardFacetCount sfc = counts.get(name); if (sfc == null ) { sfc = new ShardFacetCount(); sfc.name = name; sfc.indexed = ftype == null ? sfc.name : ftype.toInternal(sfc.name); sfc.termNum = termNum++; counts.put(name, sfc); } sfc.count += count; terms.set(sfc.termNum); last = count; } } // the largest possible missing term is initialMincount if we received // less than the number requested. if (numRequested < 0 || numRequested != 0 && numReceived < numRequested) { last = initialMincount; } missingMaxPossible += last; missingMax[shardNum] = last; counted[shardNum] = terms; } However I think this line block should also be changed. if (numRequested < 0 || numRequested != 0 && numReceived < numRequested) { last = Math .max(initialMincount-1, 0); }
        Hide
        k317h Keith Laban added a comment -

        Hoss Man can I convince you that this should be the default behavior?

        Show
        k317h Keith Laban added a comment - Hoss Man can I convince you that this should be the default behavior?
        Hide
        hossman Hoss Man added a comment -

        You've convinced me that i don't understand the point behind that existing TODO: we could change this to 1... comment, but I still want to review the code more thoroughly before i'm confident enough to concede your approach is better in all cases.

        That said: If you updated your patch to make it optional based on a param w/some tests that randomly toggled the value (TestCloudPivotFacet, DistributedFacetPivotLongTailTest would be good ones) then i'd probably be game to commit even w/o being confident it's better in all cases, and we could worry about changing the default later.

        However I think this line block should also be changed.

        Hmm, yeah ... that does smell like it could be optimized.

        (FWIW: we have a TrackingShardHandlerFactory that can be used in tests to make assertions about what per-shard requests solr triggers. That can be used along with some carefully crafted shards/docs/requests to verify that no unnecessary refinement is done in cases where you don't expect it – like with this initialMincount vs initialMincount-1 situation)

        Show
        hossman Hoss Man added a comment - You've convinced me that i don't understand the point behind that existing TODO: we could change this to 1... comment, but I still want to review the code more thoroughly before i'm confident enough to concede your approach is better in all cases. That said: If you updated your patch to make it optional based on a param w/some tests that randomly toggled the value (TestCloudPivotFacet, DistributedFacetPivotLongTailTest would be good ones) then i'd probably be game to commit even w/o being confident it's better in all cases, and we could worry about changing the default later. However I think this line block should also be changed. Hmm, yeah ... that does smell like it could be optimized. (FWIW: we have a TrackingShardHandlerFactory that can be used in tests to make assertions about what per-shard requests solr triggers. That can be used along with some carefully crafted shards/docs/requests to verify that no unnecessary refinement is done in cases where you don't expect it – like with this initialMincount vs initialMincount-1 situation)
        Hide
        k317h Keith Laban added a comment - - edited

        Added second version of patch which has this feature disabled by default but can be enabled with facet.distrib.mco=true.

        I also did some benchmarking and under all scenarios tested the new way is either the same or way faster. The test was with 12 shards everything evenly distributed.

        Two things to note about this test:

        • All terms have the same count which would be the worst case for refinement which is evident in the shape of each graph. Overrequesting is far more efficient.
        • All segments are evenly distributed however in the real world, the best performance gains for this patch would be seen when there are many segments which contain no relevant terms for the query.

        More details about the test.

        • 2 node cloud running locally each with 4g
        • 12 shards without replication (only 12 total cores)
        • terms were integers with doc values enabled
        • instances were restarted after each test to avoid lingering GC issues, however each test had some warmup queries before running the test
        • The Y-axis is average QTime(ms) over 100 test runs
        Show
        k317h Keith Laban added a comment - - edited Added second version of patch which has this feature disabled by default but can be enabled with facet.distrib.mco=true . I also did some benchmarking and under all scenarios tested the new way is either the same or way faster. The test was with 12 shards everything evenly distributed. Two things to note about this test: All terms have the same count which would be the worst case for refinement which is evident in the shape of each graph. Overrequesting is far more efficient. All segments are evenly distributed however in the real world, the best performance gains for this patch would be seen when there are many segments which contain no relevant terms for the query. More details about the test. 2 node cloud running locally each with 4g 12 shards without replication (only 12 total cores) terms were integers with doc values enabled instances were restarted after each test to avoid lingering GC issues, however each test had some warmup queries before running the test The Y-axis is average QTime(ms) over 100 test runs
        Hide
        k317h Keith Laban added a comment -

        Hoss Man how does the updated patch look?

        Show
        k317h Keith Laban added a comment - Hoss Man how does the updated patch look?
        Hide
        dpgove Dennis Gove added a comment -

        Hoss Man, do you still have concerns on this patch? I think it's a good change to make and I'm happy to take on the committing if you don't have any further concerns.

        Show
        dpgove Dennis Gove added a comment - Hoss Man , do you still have concerns on this patch? I think it's a good change to make and I'm happy to take on the committing if you don't have any further concerns.
        Hide
        hossman Hoss Man added a comment -

        I haven't had time to review it enough to be confident enough that I'd want to commit it myself – but if you have then go for it, i'm +0.

        My one bit of feedback fro ma quick skim of the patch is that i don't understand the javadocs for "FACET_DISTRIB_MCO" at all ... it's a boolean param, but the docs describe it as " The default mincount to request on distributed facet queries" which makes it sound like a number, and the "Default values" bit of the javadocs don't relaly do anything to clarify that confusion since they also (appear to) talk about the (eventual) distributed mincount, and not the default value of the "FACET_DISTRIB_MCO" param itself

        Show
        hossman Hoss Man added a comment - I haven't had time to review it enough to be confident enough that I'd want to commit it myself – but if you have then go for it, i'm +0. My one bit of feedback fro ma quick skim of the patch is that i don't understand the javadocs for "FACET_DISTRIB_MCO" at all ... it's a boolean param, but the docs describe it as " The default mincount to request on distributed facet queries" which makes it sound like a number, and the "Default values" bit of the javadocs don't relaly do anything to clarify that confusion since they also (appear to) talk about the (eventual) distributed mincount, and not the default value of the "FACET_DISTRIB_MCO" param itself
        Hide
        dpgove Dennis Gove added a comment -

        Keith Laban, could you explain the javadocs on FACET_DISTRIB_MCO a little bit more? I don't quite follow the documentation on it

        +  public static final String FACET_DISTRIB = FACET + ".distrib";
        +
        +  /**
        +   * The default mincount to request on distributed facet queries.
        +   * This param only applies to COUNT sorted queries which have a limit &gt; -1
        +   *
        +   * Default values:
        +   * Sort COUNT and facet.limit = -1: Math.min(facet.minCount, 1)
        +   * Sort COUNT and facet.limit &gt; 0: 0
        +   * Sort INDEX and facet.mincount &lt;= 1: facet.mincount
        +   * Sort INDEX and facet.mincount &gt; 1: (int) Math.ceil((double) dff.minCount / rb.slices.length)
        +   *
        +   * EXPERT
        +   */
        +
        +  public static final String FACET_DISTRIB_MCO = FACET_DISTRIB + ".mco";
        +
        
        Show
        dpgove Dennis Gove added a comment - Keith Laban , could you explain the javadocs on FACET_DISTRIB_MCO a little bit more? I don't quite follow the documentation on it + public static final String FACET_DISTRIB = FACET + ".distrib" ; + + /** + * The default mincount to request on distributed facet queries. + * This param only applies to COUNT sorted queries which have a limit &gt; -1 + * + * Default values: + * Sort COUNT and facet.limit = -1: Math .min(facet.minCount, 1) + * Sort COUNT and facet.limit &gt; 0: 0 + * Sort INDEX and facet.mincount &lt;= 1: facet.mincount + * Sort INDEX and facet.mincount &gt; 1: ( int ) Math .ceil(( double ) dff.minCount / rb.slices.length) + * + * EXPERT + */ + + public static final String FACET_DISTRIB_MCO = FACET_DISTRIB + ".mco" ; +
        Hide
        dpgove Dennis Gove added a comment -

        Added some comments around the use of dff.mco option. Still looking for clarification on the docs around FACET_DISTRIB_MCO.

        Show
        dpgove Dennis Gove added a comment - Added some comments around the use of dff.mco option. Still looking for clarification on the docs around FACET_DISTRIB_MCO.
        Hide
        k317h Keith Laban added a comment -

        Fixed docs

        Show
        k317h Keith Laban added a comment - Fixed docs
        Hide
        dpgove Dennis Gove added a comment -

        Just to slightly rephrase the salient point here:

        Consider you asked for up to 10 terms from shardA with mincount=1 but you received only 5 terms back. In this case you know, definitively, that a term seen in the response from shardB but not in the response from shardA could have at most a count of 0 in shardA. If it had any other count in shardA then it would have been returned in the response from shardA.

        Also, if you asked for up to 10 terms from shardA with mincount=1 and you get back a response with 10 terms having a count >= 1 then the response is identical to the one you'd have received if mincount=0.

        Because of this, there isn't a scenario where the response would result in more work than would have been required if mincount=0. For this reason, the decrease in required work when mincount=1 is always either a moot point or a net win.

        Show
        dpgove Dennis Gove added a comment - Just to slightly rephrase the salient point here: Consider you asked for up to 10 terms from shardA with mincount=1 but you received only 5 terms back. In this case you know, definitively, that a term seen in the response from shardB but not in the response from shardA could have at most a count of 0 in shardA. If it had any other count in shardA then it would have been returned in the response from shardA. Also, if you asked for up to 10 terms from shardA with mincount=1 and you get back a response with 10 terms having a count >= 1 then the response is identical to the one you'd have received if mincount=0. Because of this, there isn't a scenario where the response would result in more work than would have been required if mincount=0. For this reason, the decrease in required work when mincount=1 is always either a moot point or a net win.
        Hide
        dpgove Dennis Gove added a comment -

        I'm going to commit this with the default left as is (facet.mincount=0) because this new option will be defaulted to false. I've entered SOLR-9152 to discuss and handle changing the default. I believe it is safe to do so.

        Show
        dpgove Dennis Gove added a comment - I'm going to commit this with the default left as is (facet.mincount=0) because this new option will be defaulted to false. I've entered SOLR-9152 to discuss and handle changing the default. I believe it is safe to do so.
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit e4e990b993d6872f6345b7d064efb8ca22ee6556 in lucene-solr's branch refs/heads/master from Dennis Gove
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=e4e990b ]

        SOLR-8988: Adds query option facet.distrib.mco which when set to true allows the use of facet.mincount=1 in cloud mode

        Show
        jira-bot ASF subversion and git services added a comment - Commit e4e990b993d6872f6345b7d064efb8ca22ee6556 in lucene-solr's branch refs/heads/master from Dennis Gove [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=e4e990b ] SOLR-8988 : Adds query option facet.distrib.mco which when set to true allows the use of facet.mincount=1 in cloud mode
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit ab87a0e75641d3e4076b9f4c247339f9d9c47103 in lucene-solr's branch refs/heads/branch_6x from Dennis Gove
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ab87a0e ]

        SOLR-8988: Adds query option facet.distrib.mco which when set to true allows the use of facet.mincount=1 in cloud mode

        Show
        jira-bot ASF subversion and git services added a comment - Commit ab87a0e75641d3e4076b9f4c247339f9d9c47103 in lucene-solr's branch refs/heads/branch_6x from Dennis Gove [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ab87a0e ] SOLR-8988 : Adds query option facet.distrib.mco which when set to true allows the use of facet.mincount=1 in cloud mode
        Hide
        dpgove Dennis Gove added a comment -

        Patch as been applied to master and 6x for release in 6.1. Note that I have confirmed the patch does also apply to 5x if you should want to use this in a later Solr 5.

        Show
        dpgove Dennis Gove added a comment - Patch as been applied to master and 6x for release in 6.1. Note that I have confirmed the patch does also apply to 5x if you should want to use this in a later Solr 5.
        Hide
        dsmiley David Smiley added a comment -

        BTW references to SolrCloud or "cloud mode" here seem incorrect, right? This is about distributed (aka "sharded") faceting. I was confused by the title and a related issue mentioning SolrCloud and I wondered how on earth SolrCloud would affect faceting.

        Show
        dsmiley David Smiley added a comment - BTW references to SolrCloud or "cloud mode" here seem incorrect, right? This is about distributed (aka "sharded") faceting. I was confused by the title and a related issue mentioning SolrCloud and I wondered how on earth SolrCloud would affect faceting.
        Hide
        k317h Keith Laban added a comment -

        Thats right. This affects all queries where isDistrib is true for any reason.

        Show
        k317h Keith Laban added a comment - Thats right. This affects all queries where isDistrib is true for any reason.

          People

          • Assignee:
            dpgove Dennis Gove
            Reporter:
            k317h Keith Laban
          • Votes:
            1 Vote for this issue
            Watchers:
            7 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development