SOLR-7631

RandomCodec can cause faceting on multivalued Trie fields with precisionStep != 0 to produce bogus value="0" in some test seeds

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.3, 6.0
    • Component/s: None
    • Labels:
      None

      Description

      Working through SOLR-7605, I've confirmed that the underlying problem exists for regular facet.field situations, regardless of distrib mode, for Trie fields that have a non-zero precisionStep. This has only been reproduced when the RandomCodec was in use.

      The problem, when it manifests, is that faceting on a TrieIntField using facet.mincount=0 causes the facet results to include three instances of the facet value "0", each listed with a count of "0" – even though no document in the index contains this value at all...

         [junit4]    >   <lst name="facet_fields">
         [junit4]    >     <lst name="foo_ti">
         [junit4]    >       <int name="20">32</int>
      ...
         [junit4]    >       <int name="50">21</int>
         [junit4]    >       <int name="0">0</int>
         [junit4]    >       <int name="0">0</int>
         [junit4]    >       <int name="0">0</int>
      
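A client-side check for this symptom could be sketched as follows. This is only an illustration, not part of the attached patch: the helper name `duplicate_facet_values` is hypothetical, and `SAMPLE` is a trimmed copy of the excerpt above with the elided rows left out.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Trimmed copy of the facet_fields excerpt above (elided rows omitted).
SAMPLE = """<lst name="facet_fields">
  <lst name="foo_ti">
    <int name="20">32</int>
    <int name="50">21</int>
    <int name="0">0</int>
    <int name="0">0</int>
    <int name="0">0</int>
  </lst>
</lst>"""


def duplicate_facet_values(xml_text, field):
    """Return {value: occurrences} for facet values listed more than once."""
    root = ET.fromstring(xml_text)
    # Collect the "name" attribute of every <int> under the requested field.
    names = [e.get("name") for e in root.findall(f"./lst[@name='{field}']/int")]
    return {name: n for name, n in Counter(names).items() if n > 1}


dupes = duplicate_facet_values(SAMPLE, "foo_ti")
print(dupes)
```

A response without the bug would yield an empty dict, since each facet value should appear exactly once per field.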

      This is concerning for a few reasons:

      • In the case of PivotFaceting, getting duplicate values back from a single shard like this triggers an assert in distributed queries and the request fails – even if asserts aren't enabled, the bogus "0" value can be propagated to clients if they ask for facet.pivot.mincount=0
      • Client code expecting a single (value,count) pair for each value may equally be confused/broken by this response where the same "value" is returned multiple times
      • Without knowing the root cause, it seems very possible that other nonsense values may be getting returned – i.e., if the error only happens with fields utilizing precisionStep, then it's likely related to the synthetic values used for faster range queries, and other synthetic values may be getting included with bogus counts
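For readers unfamiliar with the "synthetic values" mentioned in the last point, the following is a simplified model of how a trie-encoded int is indexed at several precisions. It assumes one term per shift (0, precisionStep, 2 × precisionStep, ... below 32 bits) and ignores the actual byte-level term encoding; the function name `trie_int_terms` and the precision step of 8 are illustrative choices, not taken from this issue, and nothing here is a claim about the root cause.

```python
def trie_int_terms(value, precision_step, bits=32):
    """Return (shift, prefix) pairs, one per indexed term (simplified model)."""
    # Flip the sign bit so signed ints order like unsigned bit patterns.
    sortable = (value ^ 0x80000000) & 0xFFFFFFFF
    # One term per shift; terms with shift > 0 drop the low-order bits.
    return [(shift, sortable >> shift) for shift in range(0, bits, precision_step)]


terms = trie_int_terms(20, 8)
full_precision = [t for t in terms if t[0] == 0]  # the real value
synthetic = [t for t in terms if t[0] > 0]        # lower-precision helper terms

print(len(full_precision), len(synthetic))
```

Under this model, only the shift-0 term corresponds to a value a document actually contains; faceting code is expected to skip the lower-precision terms.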

      A patch with a simple test that demonstrates the bug fairly easily will be attached shortly.

        Attachments

        1. log.tgz
          1.63 MB
          Chris M. Hostetter
        2. SOLR-7631_test.patch
          8 kB
          Chris M. Hostetter
        3. SOLR-7631_test.patch
          5 kB
          Chris M. Hostetter


              People

              • Assignee:
                hossman Chris M. Hostetter
                Reporter:
                hossman Chris M. Hostetter
              • Votes:
                0
                Watchers:
                3
