Lucene - Core
  1. Lucene - Core
  2. LUCENE-4951

DrillSidewaysScorer should use Scorer.cost instead of its own heuristic

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.4, 5.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Today it does the "first docID" trick to guess the cost of the
      baseQuery, which is silly now that we have cost API.

      1. LUCENE-4951.patch
        2 kB
        Michael McCandless

        Activity

        Michael McCandless created issue -
        Michael McCandless made changes -
        Field Original Value New Value
        Assignee Michael McCandless [ mikemccand ]
        Hide
        Michael McCandless added a comment -

        Patch.

        Show
        Michael McCandless added a comment - Patch.
        Michael McCandless made changes -
        Attachment LUCENE-4951.patch [ 12580027 ]
        Hide
        Michael McCandless added a comment -
                            Task    QPS base      StdDev    QPS comp      StdDev                Pct diff
               HighTermHardOrDD1        2.54      (2.6%)        1.25      (2.4%)  -50.9% ( -54% -  -47%)
               HighTermEasyOrDD2       18.02      (1.0%)        9.76      (0.8%)  -45.9% ( -47% -  -44%)
               HighTermHardOrDD2        2.42      (2.6%)        1.32      (1.6%)  -45.4% ( -48% -  -42%)
                 HighTermHardDD1        3.13      (2.4%)        1.85      (1.7%)  -40.8% ( -43% -  -37%)
                MedTermEasyOrDD2       37.69      (1.6%)       22.31      (1.4%)  -40.8% ( -43% -  -38%)
                MedTermEasyOrDD1       29.78      (1.9%)       18.27      (1.4%)  -38.7% ( -41% -  -36%)
                 HighTermEasyDD1        5.86      (2.5%)        3.73      (3.0%)  -36.3% ( -40% -  -31%)
               HighTermEasyOrDD1        6.37      (2.5%)        4.06      (1.5%)  -36.2% ( -39% -  -33%)
                  MedTermEasyDD1       32.05      (1.9%)       21.38      (2.7%)  -33.3% ( -37% -  -29%)
                MedTermHardOrDD1       11.73      (2.9%)        7.84      (2.7%)  -33.2% ( -37% -  -28%)
                HighTermMixedDD2        4.50      (2.4%)        3.02      (1.3%)  -32.9% ( -35% -  -30%)
                 HighTermHardDD2        1.98      (2.6%)        1.34      (1.5%)  -32.5% ( -35% -  -29%)
                LowTermEasyOrDD1       90.29      (0.9%)       60.98      (1.6%)  -32.5% ( -34% -  -30%)
                MedTermHardOrDD2        7.19      (2.9%)        5.05      (1.7%)  -29.8% ( -33% -  -25%)
                 HighTermEasyDD2       27.17      (0.9%)       19.70      (0.9%)  -27.5% ( -29% -  -25%)
                  LowTermEasyDD1      101.22      (0.8%)       80.80      (1.6%)  -20.2% ( -22% -  -17%)
                  MedTermHardDD1       12.26      (2.8%)        9.82      (2.0%)  -19.9% ( -24% -  -15%)
                  MedTermEasyDD2       55.60      (1.4%)       45.55      (1.2%)  -18.1% ( -20% -  -15%)
                  LowTermHardDD2       29.13      (0.7%)       25.56      (1.2%)  -12.3% ( -14% -  -10%)
                  MedTermHardDD2        7.21      (3.0%)        6.47      (1.7%)  -10.2% ( -14% -   -5%)
                 MedTermMixedDD2       14.49      (4.4%)       13.52      (1.5%)   -6.7% ( -12% -    0%)
                  LowTermEasyDD2      145.99      (3.3%)      148.04      (1.8%)    1.4% (  -3% -    6%)
                LowTermEasyOrDD2       65.37      (2.9%)       69.22      (2.1%)    5.9% (   0% -   11%)
                LowTermHardOrDD1       20.85      (4.8%)       27.88      (2.9%)   33.7% (  24% -   43%)
                  LowTermHardDD1       22.05      (4.8%)       33.55      (2.0%)   52.2% (  43% -   61%)
                LowTermHardOrDD2       12.70      (5.3%)       19.86      (2.2%)   56.3% (  46% -   67%)
                 LowTermMixedDD2       23.87      (6.5%)       47.12      (3.0%)   97.4% (  82% -  114%)
        

        After the patch:

                            Task    QPS base      StdDev    QPS comp      StdDev                Pct diff
               HighTermHardOrDD1        2.56      (3.1%)        1.25      (1.9%)  -51.1% ( -54% -  -47%)
                MedTermEasyOrDD2       45.66      (3.4%)       22.35      (0.9%)  -51.0% ( -53% -  -48%)
               HighTermHardOrDD2        2.43      (3.1%)        1.31      (1.8%)  -46.0% ( -49% -  -42%)
                MedTermHardOrDD1       14.41      (2.2%)        7.86      (1.7%)  -45.4% ( -48% -  -42%)
                MedTermHardOrDD2        8.63      (1.8%)        5.04      (1.5%)  -41.6% ( -44% -  -39%)
                 HighTermHardDD1        3.15      (3.0%)        1.86      (1.2%)  -40.8% ( -43% -  -37%)
                MedTermEasyOrDD1       30.07      (2.7%)       18.28      (1.2%)  -39.2% ( -41% -  -36%)
               HighTermEasyOrDD1        6.46      (3.3%)        4.07      (1.4%)  -37.1% ( -40% -  -33%)
                 HighTermEasyDD1        5.95      (3.5%)        3.76      (1.3%)  -36.9% ( -40% -  -33%)
                  MedTermHardDD1       15.24      (2.1%)        9.89      (1.2%)  -35.1% ( -37% -  -32%)
                HighTermMixedDD2        4.57      (3.1%)        3.02      (1.0%)  -34.0% ( -36% -  -30%)
                  MedTermEasyDD1       32.38      (2.8%)       21.51      (1.2%)  -33.6% ( -36% -  -30%)
                 HighTermHardDD2        1.99      (3.0%)        1.34      (1.3%)  -32.8% ( -36% -  -29%)
                LowTermEasyOrDD1       90.54      (1.1%)       61.22      (1.1%)  -32.4% ( -34% -  -30%)
                LowTermEasyOrDD2      101.27      (4.1%)       69.42      (1.1%)  -31.5% ( -35% -  -27%)
               HighTermEasyOrDD2       14.14      (4.5%)        9.70      (1.4%)  -31.4% ( -35% -  -26%)
                LowTermHardOrDD1       38.76      (1.2%)       27.93      (1.2%)  -28.0% ( -29% -  -25%)
                 MedTermMixedDD2       18.71      (1.3%)       13.52      (1.0%)  -27.7% ( -29% -  -25%)
                  MedTermHardDD2        8.84      (1.9%)        6.48      (1.2%)  -26.6% ( -29% -  -23%)
                  MedTermEasyDD2       61.20      (4.3%)       45.55      (1.2%)  -25.6% ( -29% -  -21%)
                LowTermHardOrDD2       26.55      (0.9%)       19.86      (1.3%)  -25.2% ( -27% -  -23%)
                  LowTermEasyDD2      187.68      (1.2%)      148.29      (1.3%)  -21.0% ( -23% -  -18%)
                  LowTermEasyDD1      101.97      (1.5%)       80.92      (0.7%)  -20.6% ( -22% -  -18%)
                  LowTermHardDD1       41.60      (0.8%)       33.70      (1.1%)  -19.0% ( -20% -  -17%)
                  LowTermHardDD2       29.39      (1.0%)       25.58      (1.2%)  -13.0% ( -15% -  -10%)
                 LowTermMixedDD2       53.89      (0.9%)       47.31      (1.2%)  -12.2% ( -14% -  -10%)
                 HighTermEasyDD2       18.77      (5.3%)       19.72      (1.3%)    5.0% (  -1% -   12%)
        

        Ie, using cost API lets DrillSidewaysScorer do a better job picking
        which scorer impl to use ...

        Show
        Michael McCandless added a comment - Task QPS base StdDev QPS comp StdDev Pct diff HighTermHardOrDD1 2.54 (2.6%) 1.25 (2.4%) -50.9% ( -54% - -47%) HighTermEasyOrDD2 18.02 (1.0%) 9.76 (0.8%) -45.9% ( -47% - -44%) HighTermHardOrDD2 2.42 (2.6%) 1.32 (1.6%) -45.4% ( -48% - -42%) HighTermHardDD1 3.13 (2.4%) 1.85 (1.7%) -40.8% ( -43% - -37%) MedTermEasyOrDD2 37.69 (1.6%) 22.31 (1.4%) -40.8% ( -43% - -38%) MedTermEasyOrDD1 29.78 (1.9%) 18.27 (1.4%) -38.7% ( -41% - -36%) HighTermEasyDD1 5.86 (2.5%) 3.73 (3.0%) -36.3% ( -40% - -31%) HighTermEasyOrDD1 6.37 (2.5%) 4.06 (1.5%) -36.2% ( -39% - -33%) MedTermEasyDD1 32.05 (1.9%) 21.38 (2.7%) -33.3% ( -37% - -29%) MedTermHardOrDD1 11.73 (2.9%) 7.84 (2.7%) -33.2% ( -37% - -28%) HighTermMixedDD2 4.50 (2.4%) 3.02 (1.3%) -32.9% ( -35% - -30%) HighTermHardDD2 1.98 (2.6%) 1.34 (1.5%) -32.5% ( -35% - -29%) LowTermEasyOrDD1 90.29 (0.9%) 60.98 (1.6%) -32.5% ( -34% - -30%) MedTermHardOrDD2 7.19 (2.9%) 5.05 (1.7%) -29.8% ( -33% - -25%) HighTermEasyDD2 27.17 (0.9%) 19.70 (0.9%) -27.5% ( -29% - -25%) LowTermEasyDD1 101.22 (0.8%) 80.80 (1.6%) -20.2% ( -22% - -17%) MedTermHardDD1 12.26 (2.8%) 9.82 (2.0%) -19.9% ( -24% - -15%) MedTermEasyDD2 55.60 (1.4%) 45.55 (1.2%) -18.1% ( -20% - -15%) LowTermHardDD2 29.13 (0.7%) 25.56 (1.2%) -12.3% ( -14% - -10%) MedTermHardDD2 7.21 (3.0%) 6.47 (1.7%) -10.2% ( -14% - -5%) MedTermMixedDD2 14.49 (4.4%) 13.52 (1.5%) -6.7% ( -12% - 0%) LowTermEasyDD2 145.99 (3.3%) 148.04 (1.8%) 1.4% ( -3% - 6%) LowTermEasyOrDD2 65.37 (2.9%) 69.22 (2.1%) 5.9% ( 0% - 11%) LowTermHardOrDD1 20.85 (4.8%) 27.88 (2.9%) 33.7% ( 24% - 43%) LowTermHardDD1 22.05 (4.8%) 33.55 (2.0%) 52.2% ( 43% - 61%) LowTermHardOrDD2 12.70 (5.3%) 19.86 (2.2%) 56.3% ( 46% - 67%) LowTermMixedDD2 23.87 (6.5%) 47.12 (3.0%) 97.4% ( 82% - 114%) After the patch: Task QPS base StdDev QPS comp StdDev Pct diff HighTermHardOrDD1 2.56 (3.1%) 1.25 (1.9%) -51.1% ( -54% - -47%) MedTermEasyOrDD2 45.66 (3.4%) 22.35 (0.9%) -51.0% ( -53% - -48%) HighTermHardOrDD2 2.43 (3.1%) 1.31 (1.8%) -46.0% ( -49% - -42%) MedTermHardOrDD1 14.41 (2.2%) 7.86 (1.7%) -45.4% ( -48% - -42%) MedTermHardOrDD2 8.63 (1.8%) 5.04 (1.5%) -41.6% ( -44% - -39%) HighTermHardDD1 3.15 (3.0%) 1.86 (1.2%) -40.8% ( -43% - -37%) MedTermEasyOrDD1 30.07 (2.7%) 18.28 (1.2%) -39.2% ( -41% - -36%) HighTermEasyOrDD1 6.46 (3.3%) 4.07 (1.4%) -37.1% ( -40% - -33%) HighTermEasyDD1 5.95 (3.5%) 3.76 (1.3%) -36.9% ( -40% - -33%) MedTermHardDD1 15.24 (2.1%) 9.89 (1.2%) -35.1% ( -37% - -32%) HighTermMixedDD2 4.57 (3.1%) 3.02 (1.0%) -34.0% ( -36% - -30%) MedTermEasyDD1 32.38 (2.8%) 21.51 (1.2%) -33.6% ( -36% - -30%) HighTermHardDD2 1.99 (3.0%) 1.34 (1.3%) -32.8% ( -36% - -29%) LowTermEasyOrDD1 90.54 (1.1%) 61.22 (1.1%) -32.4% ( -34% - -30%) LowTermEasyOrDD2 101.27 (4.1%) 69.42 (1.1%) -31.5% ( -35% - -27%) HighTermEasyOrDD2 14.14 (4.5%) 9.70 (1.4%) -31.4% ( -35% - -26%) LowTermHardOrDD1 38.76 (1.2%) 27.93 (1.2%) -28.0% ( -29% - -25%) MedTermMixedDD2 18.71 (1.3%) 13.52 (1.0%) -27.7% ( -29% - -25%) MedTermHardDD2 8.84 (1.9%) 6.48 (1.2%) -26.6% ( -29% - -23%) MedTermEasyDD2 61.20 (4.3%) 45.55 (1.2%) -25.6% ( -29% - -21%) LowTermHardOrDD2 26.55 (0.9%) 19.86 (1.3%) -25.2% ( -27% - -23%) LowTermEasyDD2 187.68 (1.2%) 148.29 (1.3%) -21.0% ( -23% - -18%) LowTermEasyDD1 101.97 (1.5%) 80.92 (0.7%) -20.6% ( -22% - -18%) LowTermHardDD1 41.60 (0.8%) 33.70 (1.1%) -19.0% ( -20% - -17%) LowTermHardDD2 29.39 (1.0%) 25.58 (1.2%) -13.0% ( -15% - -10%) LowTermMixedDD2 53.89 (0.9%) 47.31 (1.2%) -12.2% ( -14% - -10%) HighTermEasyDD2 18.77 (5.3%) 19.72 (1.3%) 5.0% ( -1% - 12%) Ie, using cost API lets DrillSidewaysScorer do a better job picking which scorer impl to use ...
        Hide
        Commit Tag Bot added a comment -

        [trunk commit] mikemccand
        http://svn.apache.org/viewvc?view=revision&revision=1471705

        LUCENE-4951: DrillSideways now uses Scorer.cost() to decide which scorer impl to use

        Show
        Commit Tag Bot added a comment - [trunk commit] mikemccand http://svn.apache.org/viewvc?view=revision&revision=1471705 LUCENE-4951 : DrillSideways now uses Scorer.cost() to decide which scorer impl to use
        Hide
        Commit Tag Bot added a comment -

        [branch_4x commit] mikemccand
        http://svn.apache.org/viewvc?view=revision&revision=1471707

        LUCENE-4951: DrillSideways now uses Scorer.cost() to decide which scorer impl to use

        Show
        Commit Tag Bot added a comment - [branch_4x commit] mikemccand http://svn.apache.org/viewvc?view=revision&revision=1471707 LUCENE-4951 : DrillSideways now uses Scorer.cost() to decide which scorer impl to use
        Michael McCandless made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Hide
        Commit Tag Bot added a comment -

        [trunk commit] mikemccand
        http://svn.apache.org/viewvc?view=revision&revision=1471738

        LUCENE-4951: cutover another freq -> cost

        Show
        Commit Tag Bot added a comment - [trunk commit] mikemccand http://svn.apache.org/viewvc?view=revision&revision=1471738 LUCENE-4951 : cutover another freq -> cost
        Hide
        Commit Tag Bot added a comment -

        [branch_4x commit] mikemccand
        http://svn.apache.org/viewvc?view=revision&revision=1471741

        LUCENE-4951: cutover another freq -> cost

        Show
        Commit Tag Bot added a comment - [branch_4x commit] mikemccand http://svn.apache.org/viewvc?view=revision&revision=1471741 LUCENE-4951 : cutover another freq -> cost
        Hide
        Steve Rowe added a comment -

        Bulk close resolved 4.4 issues

        Show
        Steve Rowe added a comment - Bulk close resolved 4.4 issues
        Steve Rowe made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            Michael McCandless
            Reporter:
            Michael McCandless
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development