Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.3, 6.0
    • Component/s: None
    • Labels: None
    • Lucene Fields: New

      Description

      We added this API in LUCENE-5729, but we never explored implementing norms with it. Norms are generally the largest consumer of heap memory and often a real hassle for users.
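
      For context, here is a minimal sketch (not code from the patch) of what the LUCENE-5729 random-access API enables for norms: instead of decoding norms into on-heap arrays, a reader can hold a RandomAccessInput over the norms data and fetch values positionally on demand. The class name, file handling, and one-byte-per-document layout below are illustrative assumptions.

        import java.io.IOException;
        import org.apache.lucene.store.Directory;
        import org.apache.lucene.store.IOContext;
        import org.apache.lucene.store.IndexInput;
        import org.apache.lucene.store.RandomAccessInput;

        // Illustrative reader: norms stay in the Directory (e.g. memory-mapped) and are
        // read positionally, rather than being decoded into an on-heap array.
        final class OffHeapByteNorms {
          private final RandomAccessInput data;

          OffHeapByteNorms(Directory dir, String fileName, long offset, int maxDoc) throws IOException {
            IndexInput in = dir.openInput(fileName, IOContext.DEFAULT);
            // randomAccessSlice(...) is the API added in LUCENE-5729.
            this.data = in.randomAccessSlice(offset, maxDoc);
          }

          /** Returns the norm for docID, assuming one signed byte per document. */
          long getNorm(int docID) throws IOException {
            return data.readByte(docID);
          }
        }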

        Activity

        Robert Muir added a comment -

        Here is a patch. We just use these methods to get signed values as a very simplistic compression.

        Generally speaking the performance seems OK; I think it's the right tradeoff.

        Report after iter 19:
        Chart saved to out.png... (wd: /home/rmuir/workspace/util/src/python)
                            Task   QPS trunk      StdDev   QPS patch      StdDev                Pct diff
                         LowTerm      983.09      (3.8%)      903.02      (4.6%)   -8.1% ( -15% -    0%)
                    HighSpanNear      158.40      (2.7%)      148.01      (2.0%)   -6.6% ( -11% -   -1%)
                    OrNotHighMed      213.20      (3.9%)      199.50      (2.8%)   -6.4% ( -12% -    0%)
                    OrNotHighLow     1170.07      (2.7%)     1134.55      (2.4%)   -3.0% (  -7% -    2%)
                     AndHighHigh       87.91      (1.9%)       86.21      (1.8%)   -1.9% (  -5% -    1%)
                          IntNRQ        8.64      (5.6%)        8.48      (8.0%)   -1.8% ( -14% -   12%)
                      AndHighMed      123.04      (1.8%)      120.85      (1.7%)   -1.8% (  -5% -    1%)
                          Fuzzy2       60.37      (1.3%)       59.35      (1.8%)   -1.7% (  -4% -    1%)
                        Wildcard       44.77      (3.2%)       44.06      (4.6%)   -1.6% (  -9% -    6%)
                     MedSpanNear      150.07      (3.1%)      148.15      (2.6%)   -1.3% (  -6% -    4%)
                     LowSpanNear       30.53      (1.2%)       30.15      (1.5%)   -1.2% (  -3% -    1%)
                       LowPhrase       33.89      (2.1%)       33.49      (3.7%)   -1.2% (  -6% -    4%)
                         Prefix3      210.10      (3.8%)      207.61      (5.3%)   -1.2% (  -9% -    8%)
                      AndHighLow     1180.40      (2.1%)     1166.82      (2.5%)   -1.2% (  -5% -    3%)
                         Respell       81.67      (1.5%)       81.41      (2.4%)   -0.3% (  -4% -    3%)
                          Fuzzy1       97.84      (1.3%)       97.64      (1.7%)   -0.2% (  -3% -    2%)
                 LowSloppyPhrase      120.00      (3.0%)      120.42      (3.0%)    0.4% (  -5% -    6%)
                       MedPhrase      263.96      (4.8%)      265.05      (7.1%)    0.4% ( -10% -   12%)
                 MedSloppyPhrase       15.97      (4.5%)       16.08      (4.6%)    0.7% (  -8% -   10%)
                         MedTerm      175.47      (4.4%)      177.64      (5.7%)    1.2% (  -8% -   11%)
                      HighPhrase       17.80      (6.3%)       18.20      (9.3%)    2.2% ( -12% -   19%)
                       OrHighMed       54.02      (6.8%)       56.18      (7.7%)    4.0% (  -9% -   19%)
                       OrHighLow       52.35      (6.9%)       54.61      (7.7%)    4.3% (  -9% -   20%)
                HighSloppyPhrase       12.85     (10.3%)       13.41     (11.9%)    4.4% ( -16% -   29%)
                      OrHighHigh       25.34      (7.2%)       26.77      (8.3%)    5.6% (  -9% -   22%)
                        HighTerm      119.17      (4.7%)      128.45      (7.1%)    7.8% (  -3% -   20%)
                    OrHighNotLow      110.06      (6.4%)      119.06      (7.0%)    8.2% (  -4% -   23%)
                    OrHighNotMed       91.03      (6.1%)       98.91      (6.4%)    8.7% (  -3% -   22%)
                   OrNotHighHigh       51.53      (5.6%)       56.04      (6.4%)    8.8% (  -3% -   21%)
                   OrHighNotHigh       30.45      (5.7%)       33.27      (6.2%)    9.2% (  -2% -   22%)
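
        For illustration, a rough sketch (assumed helper names, not the patch itself) of the "signed values as a very simplistic compression" idea: the writer picks the narrowest signed width that holds every norm value, and the reader addresses values at docID * width through the random-access methods.

        import java.io.IOException;
        import org.apache.lucene.store.RandomAccessInput;

        final class SimpleSignedNorms {
          /** Narrowest width in bytes (1, 2, 4 or 8) that holds every value in [min, max]. */
          static int bytesPerValue(long min, long max) {
            if (min >= Byte.MIN_VALUE && max <= Byte.MAX_VALUE) return 1;
            if (min >= Short.MIN_VALUE && max <= Short.MAX_VALUE) return 2;
            if (min >= Integer.MIN_VALUE && max <= Integer.MAX_VALUE) return 4;
            return 8;
          }

          /** Reads the norm for docID, given the width chosen at write time. */
          static long read(RandomAccessInput data, int docID, int bytesPerValue) throws IOException {
            switch (bytesPerValue) {
              case 1: return data.readByte(docID);
              case 2: return data.readShort(2L * docID);
              case 4: return data.readInt(4L * docID);
              default: return data.readLong(8L * docID);
            }
          }
        }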
        
        Michael McCandless added a comment -

        +1

        David Smiley added a comment -

        Nice. Before/now, AFAIK norms take 1 byte per field per doc of heap. I looked over the patch briefly; does this essentially put norms off-heap? And I noticed it has more fidelity than a single byte; up to 8 bytes, in fact. Does this patch also bring in accurate norms, or is something else required to actually utilize that?

        Uwe Schindler added a comment -

        There are some unrelated changes in ByteBufferIndexInput: you removed the offset=0 implementation. Did you find that this offset=0 optimization brings no additional performance benefit?

        Robert Muir added a comment -

        Before/now, AFAIK norms take 1 byte per field per doc of heap. I looked over the patch briefly; does this essentially put norms off-heap?

        In the worst case. Currently they are compressed with bitpacking and other tricks to try to be reasonable. But what was missing all along was a random-access API in Directory so that this can just be MappedByteBuffer.get(long) (see the linked issue and justification). If you want them to be in heap memory, use FileSwitchDirectory and RAMDirectory.
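
        (For illustration, a sketch of that FileSwitchDirectory setup; the .nvd/.nvm norms extensions and the helper below are assumptions, and the directory has to be used for both indexing and searching for the routing to take effect.)

        import java.io.IOException;
        import java.nio.file.Path;
        import java.util.Arrays;
        import java.util.HashSet;
        import org.apache.lucene.store.Directory;
        import org.apache.lucene.store.FileSwitchDirectory;
        import org.apache.lucene.store.MMapDirectory;
        import org.apache.lucene.store.RAMDirectory;

        final class HeapNormsDirectory {
          /** Norms files go to a heap-resident RAMDirectory; everything else stays memory-mapped. */
          static Directory open(Path indexPath) throws IOException {
            Directory normsOnHeap = new RAMDirectory();
            Directory everythingElse = new MMapDirectory(indexPath);
            // Files whose extension is in the primary set are routed to the first directory.
            return new FileSwitchDirectory(
                new HashSet<>(Arrays.asList("nvd", "nvm")), normsOnHeap, everythingElse, true);
          }
        }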

        Does this patch also bring in accurate norms, or is something else required to actually utilize that?

        You have 1-byte norms because your chosen similarity squashes to that, but the interface between the similarity and IndexWriter has been "long" since Lucene 4, and all codecs test and support that.

        Robert Muir added a comment -

        There are some unrelated changes in ByteBufferIndexInput: you removed the offset=0 implementation. Did you find that this offset=0 optimization brings no additional performance benefit?

        It's not unrelated; I did extensive benchmarking. That optimization is a trap, allowing too many low-level implementations (3) once the index gets large.

        Uwe Schindler added a comment - edited

        It's not unrelated; I did extensive benchmarking. That optimization is a trap, allowing too many low-level implementations (3) once the index gets large.

        OK. That's fine. I never liked that offset=0 specialization; I think back at that time we felt it might be a good idea. I don't know the issue number anymore.

        You also removed the negative check in the new Multi implementation; I think this was done for performance, too, right? I am not happy with that, because it no longer throws an exception if you try to access stuff off-slice. Maybe we can add an "assert" instead?

        Uwe Schindler added a comment -

        Sorry, the assert is there. Cancel my comment!

        Adrien Grand added a comment -

        +1

        ASF subversion and git services added a comment -

        Commit 1685007 from Robert Muir in branch 'dev/trunk'
        [ https://svn.apache.org/r1685007 ]

        LUCENE-6504: implement norms with random access API

        ASF subversion and git services added a comment -

        Commit 1685011 from Robert Muir in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1685011 ]

        LUCENE-6504: implement norms with random access API

        Shalin Shekhar Mangar added a comment -

        Bulk close for 5.3.0 release


          People

          • Assignee: Unassigned
          • Reporter: Robert Muir
          • Votes: 0
          • Watchers: 5

            Dates

            • Created:
              Updated:
              Resolved:
