  1. Solr
  2. SOLR-10918

StatsComponent cardinality discrepancies between regular vs pre-hashed values when using PointsField

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Security Level: Public (Default Security Level. Issues are Public)

      Description

      discovered as part of SOLR-10807...

      when using Points-based numerics, the HLL estimates computed from the raw values and from the pre-hashed values disagree slightly. This suggests a possible bug (or at the very least, room for optimization) when using Points fields.

      Example from SOLR-10807 when swapping IntPointField in place of TrieIntField...

         [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestDistributedStatsComponentCardinality -Dtests.method=test -Dtests.seed=63854996088ED7B7 -Dtests.slow=true -Dtests.locale=de-GR -Dtests.timezone=Etc/UCT -Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1
         [junit4] FAILURE 13.3s J2 | TestDistributedStatsComponentCardinality.test <<<
         [junit4]    > Throwable #1: java.lang.AssertionError: int_i: hashed vs prehashed, real=7260, p=q=id:[1186+TO+8445]&rows=0&stats=true&stats.field={!cardinality%3Dtrue+hllLog2m%3D7+hllRegwidth%3D8}int_i&stats.field={!cardinality%3Dtrue+hllLog2m%3D7+hllRegwidth%3D8+hllPreHashed%3Dtrue}int_i_prehashed_l&stats.field={!cardinality%3Dtrue+hllLog2m%3D7+hllRegwidth%3D8}long_l&stats.field={!cardinality%3Dtrue+hllLog2m%3D7+hllRegwidth%3D8+hllPreHashed%3Dtrue}long_l_prehashed_l&stats.field={!cardinality%3Dtrue+hllLog2m%3D7+hllRegwidth%3D8}string_s&stats.field={!cardinality%3Dtrue+hllLog2m%3D7+hllRegwidth%3D8+hllPreHashed%3Dtrue}string_s_prehashed_l expected:<6632> but was:<7929>
         [junit4]    >        at __randomizedtesting.SeedInfo.seed([63854996088ED7B7:EBD1764CA672BA4F]:0)
         [junit4]    >        at org.apache.solr.handler.component.TestDistributedStatsComponentCardinality.test(TestDistributedStatsComponentCardinality.java:149)
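
The request in the assertion message is URL-encoded, which makes the paired stats.field parameters hard to compare by eye. The following is a minimal Python sketch, not part of the Solr test, that decodes the query string copied from the log and reports how far each of the two reported estimates sits from the real count. The names RAW_QUERY and relative_deviation are hypothetical helpers introduced for illustration. The log does not say which of the two numbers came from the raw field and which from the pre-hashed field, so they are labeled only as "expected" and "actual".

```python
from urllib.parse import parse_qs

# Query string copied verbatim from the assertion message above (after "p=").
RAW_QUERY = (
    "q=id:[1186+TO+8445]&rows=0&stats=true"
    "&stats.field={!cardinality%3Dtrue+hllLog2m%3D7+hllRegwidth%3D8}int_i"
    "&stats.field={!cardinality%3Dtrue+hllLog2m%3D7+hllRegwidth%3D8+hllPreHashed%3Dtrue}int_i_prehashed_l"
    "&stats.field={!cardinality%3Dtrue+hllLog2m%3D7+hllRegwidth%3D8}long_l"
    "&stats.field={!cardinality%3Dtrue+hllLog2m%3D7+hllRegwidth%3D8+hllPreHashed%3Dtrue}long_l_prehashed_l"
    "&stats.field={!cardinality%3Dtrue+hllLog2m%3D7+hllRegwidth%3D8}string_s"
    "&stats.field={!cardinality%3Dtrue+hllLog2m%3D7+hllRegwidth%3D8+hllPreHashed%3Dtrue}string_s_prehashed_l"
)


def relative_deviation(estimate: int, real: int) -> float:
    """Signed deviation of an estimate from the real count, as a fraction."""
    return (estimate - real) / real


# parse_qs turns '+' into spaces and %3D into '=', and collects the
# repeated stats.field parameter into a list in request order.
params = parse_qs(RAW_QUERY)

if __name__ == "__main__":
    print("q =", params["q"][0])
    for field in params["stats.field"]:
        print("stats.field =", field)

    # Numbers taken from the assertion message: real=7260,
    # expected:<6632> but was:<7929>.
    real, expected, actual = 7260, 6632, 7929
    print(f"expected vs real: {relative_deviation(expected, real):+.1%}")
    print(f"actual   vs real: {relative_deviation(actual, real):+.1%}")
```

Decoded this way, each raw field (int_i, long_l, string_s) appears next to its *_prehashed_l counterpart with the same hllLog2m and hllRegwidth options, differing only in hllPreHashed=true.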
      


              People

              • Assignee:
                Unassigned
                Reporter:
                hossman Hoss Man