  1. Solr
  2. SOLR-11729

Increase default overrequest ratio/count in json.facet to match existing defaults for facet.overrequest.ratio & facet.overrequest.count ?


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Invalid

    Description

      When FacetComponent first got support for distributed search, the default "effective shard limit" used for the per-shard requests followed the formula...

      limit = (int)(dff.initialLimit * 1.5) + 10;
      

      ...over time, this became configurable with the introduction of some expert-level tuning options: facet.overrequest.ratio & facet.overrequest.count – but the defaults (and the basic formula) remain the same to this day...

            this.overrequestRatio
              = params.getFieldDouble(field, FacetParams.FACET_OVERREQUEST_RATIO, 1.5);
            this.overrequestCount 
              = params.getFieldInt(field, FacetParams.FACET_OVERREQUEST_COUNT, 10);
      ...
        private int doOverRequestMath(int limit, double ratio, int count) {
          // NOTE: normally, "1.0F < ratio"
          //
          // if the user chooses a ratio < 1, we allow it and don't "bottom out" at
          // the original limit until *after* we've also added the count.
          int adjustedLimit = (int) (limit * ratio) + count;
          return Math.max(limit, adjustedLimit);
        }
      
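
      As a quick sanity check on the arithmetic, here is a minimal standalone sketch of the same formula. The class name LegacyOverRequest is hypothetical (it is not part of Solr); the method body just mirrors the doOverRequestMath snippet quoted above.

```java
// Hypothetical standalone class for illustration only; not part of Solr.
class LegacyOverRequest {

    // Mirrors the doOverRequestMath snippet quoted in this issue.
    public static int doOverRequestMath(int limit, double ratio, int count) {
        // Scale the limit by the ratio (truncating), then add the fixed count.
        int adjustedLimit = (int) (limit * ratio) + count;
        // Never ask a shard for fewer buckets than the original limit.
        return Math.max(limit, adjustedLimit);
    }

    public static void main(String[] args) {
        // Defaults: ratio = 1.5, count = 10
        System.out.println(doOverRequestMath(10, 1.5, 10));   // 25
        System.out.println(doOverRequestMath(100, 1.5, 10));  // 160
        // A ratio < 1 is allowed: (int)(100 * 0.5) + 10 = 60, which then
        // "bottoms out" at the original limit of 100.
        System.out.println(doOverRequestMath(100, 0.5, 10));  // 100
    }
}
```

      So with the defaults, a request for the top 10 terms asks each shard for 25.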

      However...

      When json.facet multi-shard refinement was added, the code was written slightly differently:

      • there is an explicit overrequest:N (count) option
      • if -1 == overrequest (which is the default) then an "effective shard limit" is computed using the same basic formula as in FacetComponent – but the constants are different...
        • effectiveLimit = (long) (effectiveLimit * 1.1 + 4);
      • For any (non "-1") user specified overrequest value, it's added verbatim to the limit (which may have been user specified, or may just be the default)
        • effectiveLimit += freq.overrequest;
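
      The two code paths in the bullets above can be sketched roughly as follows. This is a simplified illustration, not the actual json.facet code: the class and method names are hypothetical, and it assumes a shard request with a small offset so that the default over-request applies.

```java
// Hypothetical standalone class for illustration only; not part of Solr.
class JsonFacetShardLimit {

    // limit:       the bucket limit (user specified, or the default)
    // overrequest: -1 means "use the default formula"; any other value
    //              is added verbatim to the limit.
    public static long effectiveShardLimit(long limit, int overrequest) {
        long effectiveLimit = limit;
        if (overrequest == -1) {
            // Default case: same shape as the FacetComponent formula,
            // but with constants 1.1 and 4 instead of 1.5 and 10.
            effectiveLimit = (long) (effectiveLimit * 1.1 + 4);
        } else {
            // Explicit overrequest:N, no ratio involved.
            effectiveLimit += overrequest;
        }
        return effectiveLimit;
    }

    public static void main(String[] args) {
        System.out.println(effectiveShardLimit(10, -1));   // 15
        System.out.println(effectiveShardLimit(100, -1));  // 114
        System.out.println(effectiveShardLimit(10, 0));    // 10
        System.out.println(effectiveShardLimit(10, 50));   // 60
    }
}
```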

      Given the design of the json.facet syntax, I can understand why the code path for an "advanced" user specified overrequest:N option avoids using any (implicit) ratio calculation and just does the straight addition of limit += overrequest.

      What I'm not clear on is the choice of the constants 1.1 and 4 in the common (default) case, and why those differ from the historically used 1.5 and 10.


      It may seem like a small thing to worry about, but it can/will cause odd inconsistencies when people try to migrate simple facet.field=foo (or facet.pivot=foo,bar) queries to json.facet. I have also seen it give people attempting these types of migrations the (mistaken) impression that the discrepancies they are seeing are because refine:true is not working.
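
      To put numbers on the gap, here is a small sketch comparing the two default formulas side by side. The class and method names are hypothetical, and the json.facet side assumes a shard request with offset 0.

```java
// Hypothetical standalone class for illustration only; not part of Solr.
class DefaultOverRequestGap {

    // FacetComponent defaults: ratio 1.5, count 10.
    public static int facetComponentDefault(int limit) {
        return Math.max(limit, (int) (limit * 1.5) + 10);
    }

    // json.facet defaults (overrequest == -1): constants 1.1 and 4.
    public static long jsonFacetDefault(long limit) {
        return (long) (limit * 1.1 + 4);
    }

    public static void main(String[] args) {
        for (int limit : new int[] {5, 10, 100}) {
            System.out.printf("limit=%d facet.field=%d json.facet=%d%n",
                limit, facetComponentDefault(limit), jsonFacetDefault(limit));
        }
        // limit=5 facet.field=17 json.facet=9
        // limit=10 facet.field=25 json.facet=15
        // limit=100 facet.field=160 json.facet=114
    }
}
```

      For limit=10, each shard is asked for 25 terms under facet.field but only 15 under json.facet, so a term that ranks between 16th and 25th on some shard contributes its count from that shard in the first case but not in the second.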

      For this reason, I propose we change the (default) overrequest:-1 behavior to use the same constants as the equivalent FacetComponent code...

      if (fcontext.isShard()) {
        if (freq.overrequest == -1) {
          // add over-request if this is a shard request and if we have a small offset (large offsets will already be gathering many more buckets than needed)
          if (freq.offset < 10) {
            effectiveLimit = (long) (effectiveLimit * 1.5 + 10);
          }
          ...
      


            People

              Assignee: Unassigned
              Reporter: Chris M. Hostetter (hossman)