Lucene - Core / LUCENE-7839

Optimize the default NormsFormat for the case that all norms are in 0..16

Details

    • Type: Task
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved

    Description

      Given how we now store the length of the field in norms, we could optimize the default norms format for the case that all norms are in 0..16 and store them on 4 bits each. This would kick in for short fields that have fewer than 16 terms (e.g. title fields) and halve disk usage for norms.
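As a rough illustration of the proposed encoding (the class and method names below are hypothetical, not from the attached patch): two norm values that each fit in 4 bits can be packed into one byte and read back with a shift and a mask.

```java
// Hypothetical sketch: pack norms in [0, 15] two per byte (low nibble first).
class NibblePacking {
    static byte[] pack(int[] norms) {
        byte[] out = new byte[(norms.length + 1) / 2];
        for (int i = 0; i < norms.length; i++) {
            int shift = (i & 1) == 0 ? 0 : 4; // even index -> low nibble
            out[i >> 1] |= (byte) ((norms[i] & 0xF) << shift);
        }
        return out;
    }

    // Random access to the i-th value: one byte read plus shift+mask.
    static int get(byte[] packed, int i) {
        int shift = (i & 1) == 0 ? 0 : 4;
        return (packed[i >> 1] >>> shift) & 0xF;
    }
}
```

This is what "store them on 4 bits" amounts to: two values per byte instead of one, hence the factor-of-2 saving on disk.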

      Attachments

        1. LUCENE-7839.patch
          13 kB
          Adrien Grand

        Activity

          rcmuir Robert Muir added a comment -

           Should this really be necessary given an iterator API? It seems too highly specialized and fragile (just like what the random access APIs were doing), versus using, e.g., block-level compression like the posting lists do.

          jpountz Adrien Grand added a comment -

          I tried to leverage the iterator API similarly to what numeric doc values do, but luceneutil seems to notice a performance hit:

                              TaskQPS baseline      StdDev   QPS patch      StdDev                Pct diff
                          HighTerm      569.71     (11.5%)      490.35      (9.0%)  -13.9% ( -30% -    7%)
                        OrHighHigh      138.08     (11.6%)      123.27      (7.1%)  -10.7% ( -26% -    9%)
                         OrHighMed      295.37     (11.2%)      269.99      (8.1%)   -8.6% ( -25% -   12%)
                         OrHighLow      379.17      (9.1%)      351.63      (6.4%)   -7.3% ( -20% -    9%)
                           MedTerm     1518.29     (11.9%)     1421.77      (6.8%)   -6.4% ( -22% -   14%)
                       AndHighHigh      386.22      (9.3%)      367.76      (9.0%)   -4.8% ( -21% -   14%)
                           LowTerm     3236.73      (8.3%)     3118.34      (8.3%)   -3.7% ( -18% -   14%)
                   MedSloppyPhrase      555.94      (9.6%)      537.02      (6.3%)   -3.4% ( -17% -   13%)
             HighTermDayOfYearSort      330.62     (12.2%)      320.20      (9.8%)   -3.2% ( -22% -   21%)
                         MedPhrase      635.77      (9.6%)      616.12      (8.1%)   -3.1% ( -18% -   16%)
                  HighSloppyPhrase      147.02      (8.6%)      142.77      (7.9%)   -2.9% ( -17% -   14%)
                            IntNRQ      117.56      (9.8%)      114.43     (10.2%)   -2.7% ( -20% -   19%)
                      HighSpanNear       57.73      (7.9%)       56.21      (7.4%)   -2.6% ( -16% -   13%)
                   LowSloppyPhrase      385.52      (8.9%)      375.39      (6.5%)   -2.6% ( -16% -   13%)
                         LowPhrase      653.67      (9.7%)      637.17      (7.4%)   -2.5% ( -17% -   16%)
                           Prefix3      287.63     (12.3%)      281.78     (10.3%)   -2.0% ( -21% -   23%)
                           Respell      144.41      (7.8%)      141.67      (6.7%)   -1.9% ( -15% -   13%)
                        AndHighMed      676.46      (8.3%)      665.05      (9.8%)   -1.7% ( -18% -   17%)
                          Wildcard      214.90      (8.5%)      211.57      (7.0%)   -1.5% ( -15% -   15%)
                        HighPhrase       20.11      (9.7%)       20.03      (8.5%)   -0.4% ( -17% -   19%)
                       MedSpanNear      476.40      (8.7%)      476.48      (7.7%)    0.0% ( -15% -   18%)
                        AndHighLow      964.81      (9.8%)      965.18      (8.0%)    0.0% ( -16% -   19%)
                 HighTermMonthSort     1190.72      (9.6%)     1194.44     (11.4%)    0.3% ( -18% -   23%)
                       LowSpanNear      421.27      (7.8%)      423.97      (9.9%)    0.6% ( -15% -   19%)
                            Fuzzy2       49.17     (16.2%)       50.09     (19.1%)    1.9% ( -28% -   44%)
                            Fuzzy1      129.89     (12.6%)      132.32     (11.9%)    1.9% ( -20% -   30%)
          

          You can find the patch that I played with attached. It keeps the current levels of compression, but just splits values into blocks of 2^14 values and decides on the number of bits on a per-block basis. Maybe there is a better way to do this...
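The per-block scheme described above could look roughly like this (a hedged sketch, not the patch's actual code): choose the number of bits for each block of 2^14 values independently, so that a block containing large norms does not inflate the bit width of the other blocks.

```java
// Hypothetical sketch of per-block bit-width selection.
class BlockBits {
    static final int BLOCK_SIZE = 1 << 14; // 16384 values per block, as in the patch

    // Number of bits needed to represent every value in [from, to).
    // OR-ing the values together preserves the highest set bit of the max.
    static int bitsRequired(long[] values, int from, int to) {
        long or = 0;
        for (int i = from; i < to; i++) {
            or |= values[i];
        }
        return Math.max(1, 64 - Long.numberOfLeadingZeros(or));
    }
}
```

Each block would then be packed with its own bit width, trading a small per-block header for resilience against outliers.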

          rcmuir Robert Muir added a comment -

           OK, I will try to look at the patch. I'm not sure it's necessary to do delta compression (max-min) in this case; skipping it might make things simpler.

          jpountz Adrien Grand added a comment -

           Agreed. At the moment the patch does not do delta compression, it just reads plain byte/short/int/long values like master. The only difference with master is that it splits values into blocks of 16384 and encodes each block independently. It does not even specialize the case that norms are in 0..15, since I first wanted to get an idea of the performance impact of leveraging the iterator API so that a single outlier does not raise the number of bits per value for every document.
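For reference, the delta (max-min) compression being discussed, which the patch deliberately does not do, would amount to something like the following hypothetical sketch: store each block's minimum once and encode the values as offsets from it, so the offsets need fewer bits than the raw values.

```java
// Hypothetical sketch of delta (max-min) encoding for a non-empty block.
class DeltaNorms {
    // Output layout: [min, v0 - min, v1 - min, ...]; decoding adds min back.
    static long[] deltaEncode(long[] values) {
        long min = Long.MAX_VALUE;
        for (long v : values) {
            min = Math.min(min, v);
        }
        long[] out = new long[values.length + 1];
        out[0] = min; // store the base once per block
        for (int i = 0; i < values.length; i++) {
            out[i + 1] = values[i] - min;
        }
        return out;
    }
}
```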

          rcmuir Robert Muir added a comment -

          Right, but this is still old-style in the sense that it could still be done with the random access API with shift+mask ... and this block compression has been done before and always gave overhead.

          I was suggesting much more like postings to eliminate the readByte()/readByte()/readByte() stuff for the worst case.
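A minimal sketch of the postings-style bulk decoding suggested here (names are hypothetical, assumptions: byte-sized norms and a `java.io.DataInput` source): read a whole block in one call and widen it into a buffer, instead of one readByte() per value.

```java
import java.io.DataInput;
import java.io.IOException;

// Hypothetical sketch: bulk-decode a block of byte-sized norms,
// the way posting lists decode whole blocks at once.
class BulkNorms {
    static void bulkDecode(DataInput in, long[] buffer, int count) throws IOException {
        byte[] scratch = new byte[count];
        in.readFully(scratch); // one bulk read for the whole block
        for (int i = 0; i < count; i++) {
            buffer[i] = scratch[i]; // widen to long; a real codec would also unpack bits here
        }
    }
}
```

The point of this shape is that the per-value cost is a plain array copy rather than repeated I/O calls, which matters most in the worst case the comment mentions.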

          tomoko Tomoko Uchida added a comment -

          This issue was moved to GitHub issue: #8890.


          People

            Assignee: Unassigned
            Reporter: jpountz Adrien Grand
            Votes: 0
            Watchers: 3
