
LUCENE-1604: Stop creating huge arrays to represent the absence of field norms

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.9
    • Fix Version/s: 2.9
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Creating and keeping around huge arrays that hold a constant value is very inefficient, both in heap usage and in locality of reference. It would be much more efficient to use null to represent a missing norms table.
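
      A minimal sketch of the contrast (illustrative only, not Lucene's actual code; DEFAULT_NORM is a made-up placeholder for the encoded default norm byte):

          class FakeNormsDemo {
              static final byte DEFAULT_NORM = 0;  // hypothetical placeholder value

              // Old approach: a constant-filled "fake norms" table, one byte per
              // document (e.g. ~10 MB for a 10M-doc index), just to mean "no norms".
              static byte[] oldStyle(int maxDoc) {
                  byte[] fake = new byte[maxDoc];
                  java.util.Arrays.fill(fake, DEFAULT_NORM);
                  return fake;
              }

              // Proposed approach: null is the missing norms table; no allocation.
              static byte[] newStyle() {
                  return null;
              }
          }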

      Attachments

      1. LUCENE-1604.patch
        26 kB
        Michael McCandless
      2. LUCENE-1604.patch
        17 kB
        Shon Vella
      3. LUCENE-1604.patch
        14 kB
        Shon Vella
      4. LUCENE-1604.patch
        13 kB
        Shon Vella

        Activity

        Shon Vella added a comment -

        This is similar to LUCENE-505, but applies to all readers, not just MultiReader, and proposes a different, less complex solution. Attaching a preliminary patch.

        Shon Vella added a comment -

        Preliminary patch

        Michael McCandless added a comment -

        I completely agree it's silly to make huge arrays instead of accepting null. One turns off norms precisely to avoid allocating huge arrays in the first place.

        Unfortunately, this is a break in back compatibility, though I think in 3.0 this change would be OK. Perhaps we could commit it today, w/ a deprecated method exposed in IndexReader to "allow null to be returned by getNorms()", which defaults to off (ie the current trunk behavior). Then, in 3.0, we remove that method and hardwire it to true.

        There is also a [presumably smallish] performance hit from adding the "norms != null" check inside TermScorer for every hit, but I think that's an OK tradeoff.
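
        A rough sketch of the per-hit branch under discussion (illustrative only, not TermScorer's actual internals; the method and parameter names are made up):

            import org.apache.lucene.search.Similarity;

            class NullNormScoring {
                // When norms == null the field has no norms table; fall back to a
                // neutral boost of 1.0 instead of decoding a per-document byte.
                static float score(int doc, float rawScore, byte[] norms) {
                    float norm = (norms != null) ? Similarity.decodeNorm(norms[doc]) : 1.0f;
                    return rawScore * norm;
                }
            }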

        Shon Vella added a comment -

        I'll look at adding the option to turn this on as time permits. It shouldn't be too hard, except as it affects the unit tests - ferreting out all the tests that need to be duplicated with it on and with it off could get very tedious.

        Earwin Burrfoot added a comment -

        There is also a [presumably smallish] performance hit from adding the "norms != null" check inside TermScorer for every hit, but I think that's an OK tradeoff.

        If the HotSpot compiler guys are doing their homework, checking something != null before accessing it should come almost for free, because each access checks for null anyway.

        Michael McCandless added a comment -

        If the HotSpot compiler guys are doing their homework, checking something != null before accessing it should come almost for free, because each access checks for null anyway.

        The problem is, depending on the result of the null check, we do very different things. I think we are relying on the CPU to correctly predict the branch, which I even wonder about, since multiple threads could be running that code with the if going different ways. And even with a correct prediction there's still a [small] added cost...

        Michael McCandless added a comment -

        ferreting out all the tests that need to be duplicated with it on and with it off could get very tedious.

        I don't think you need to dup all tests w/ on & off. If you can get Lucene's tests to pass when [temporarily] hardwiring it to on (ie, allowed to return null), I think that's fine? Then, for the duration up until 2.9, we're testing the "off" case, and when we flip it in 3.0, we're testing the "on" case.

        Shon Vella added a comment -

        The check isn't free, because the HotSpot compiler doesn't check every reference for null; it just traps the segmentation fault that dereferencing a null pointer causes. We could leverage this by not checking for null and instead catching the NullPointerException, though I believe the extra overhead of a try/catch block would be more expensive than just checking.

        The performance hit likely applies only in the case where norms isn't null - in the null case you save a multiply plus a memory reference that is likely to cause many CPU cache misses over the course of a search long enough to matter, which together probably add up to more than the cost of the check.

        An alternate approach that would eliminate this overhead is to subclass the scorers that use norms and create an appropriate scorer that doesn't require the check. The drawback is that it would be harder to maintain.
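
        A hypothetical illustration of that alternate approach (class names made up, not the patch itself): pick a scorer variant once per segment, so the hot loop carries no null check at all.

            import org.apache.lucene.search.Similarity;

            abstract class DocScorer {
                abstract float score(int doc, float rawScore);
            }

            final class NormScorer extends DocScorer {
                private final byte[] norms;
                NormScorer(byte[] norms) { this.norms = norms; }
                float score(int doc, float rawScore) {
                    return rawScore * Similarity.decodeNorm(norms[doc]);
                }
            }

            final class NoNormScorer extends DocScorer {
                float score(int doc, float rawScore) { return rawScore; } // no lookup, no branch
            }

        Usage, decided once outside the per-hit loop:

            DocScorer scorer = (norms != null) ? new NormScorer(norms) : new NoNormScorer();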

        Earwin Burrfoot added a comment -

        Yep, that was my blunder.

        An alternate approach that would eliminate this overhead is to subclass the scorers that use norms and create an appropriate scorer that doesn't require the check. The drawback is that it would be harder to maintain.

        Somebody recently raised the topic of a specialized Lucene, with classes generated from templates, hardwiring various choices. The guys at MG4J generate a custom IndexReader for each possible combination of index settings. So the idea of separate scorers might be valid, if we're going the optimization-freak way. (And we don't have to support them separately.)

        By the way, why is everything surrounding the norms map so heavily synchronized? I haven't found a single write to the map outside of initialize().

        Michael McCandless added a comment -

        By the way, why is everything surrounding the norms map so heavily synchronized? I haven't found a single write to the map outside of initialize().

        You mean the SegmentReader.Norm class? It's so only one thread attempts to load the norms at once, and because cloned SegmentReaders share norms (incRef/decRef). Also, the reader can change norms (setNorm), which requires copy-on-write if more than one clone is sharing the norms. We'll need similar care for CSFs once they accept live changes, too.
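
        A hypothetical sketch of the sharing scheme just described (not the real SegmentReader.Norm): clones share one refcounted array, loading happens at most once, and a writer must detach before modifying a shared array.

            class SharedNorms {
                private byte[] bytes;       // loaded lazily, shared across clones
                private int refCount = 1;

                synchronized byte[] bytes(int maxDoc) {
                    if (bytes == null) bytes = new byte[maxDoc]; // stand-in for the disk read
                    return bytes;
                }
                synchronized void incRef() { refCount++; }
                synchronized void decRef() { refCount--; }

                // Copy-on-write for setNorm(): if another clone still shares the
                // array, the writer detaches onto a private copy it alone owns.
                synchronized SharedNorms beforeWrite(int maxDoc) {
                    if (refCount == 1) return this;   // sole owner: write in place
                    refCount--;
                    SharedNorms priv = new SharedNorms();
                    priv.bytes = bytes(maxDoc).clone();
                    return priv;
                }
            }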

        Somebody recently raised a topic on specialized Lucene

        Right, that's LUCENE-1594. I'll fold the norms/nonorms code gen into there....

        Source code specialization gives sizable search performance gains w/ Lucene, but it'll be quite a while before it's committable. Hopefully as we pull some of the optimizations there back into core (eg using random-access API to filter: LUCENE-1536), that gap shrinks.

        Let's proceed for now with the null check in single-source scorer. I'll run some perf tests on it vs trunk, with the "norms present" case, to see where we stand.

        Shon Vella added a comment -

        Patch updated to add a method to enable/disable fake norms (the disableFakeNorms setting).

        Michael McCandless added a comment -

        OK, patch looks good. All tests pass, even if I temporarily default "disableFakeNorms" to true (but back-compat tests fail, which is expected and is why we won't flip the default until 3.0). Thanks Shon!

        I still need to test perf cost of this change...

        Shon Vella added a comment -

        Working on an update to the patch - MultiSegmentReader needs to pass disableFakeNorms transitively to its subReaders, as well as to new subReaders on reopen.

        Shon Vella added a comment - edited

        Setting disableFakeNorms transitively isn't really needed, because MultiSegmentReader doesn't make any calls to the subReaders that would cause it to create its own fake norms. We probably ought to preserve the flag on clone() and reopen() though, which is going to be a little messy, because IndexReader doesn't really implement either, so it would have to be handled at the root of each concrete class hierarchy that does. Any thoughts on whether we need this or not?

        Michael McCandless added a comment -

        Setting disableFakeNorms transitively isn't really needed, because MultiSegmentReader doesn't make any calls to the subReaders that would cause it to create its own fake norms

        But since we score per-segment, TermScorer would ask each SegmentReader (in the MultiSegmentReader) for its norms? So I think the subReaders need to know the setting.

        Any thoughts on whether we need this or not?

        I think we do need each class implementing clone() and reopen() to properly carry over this setting.
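
        Illustrative only - doReopen() is a made-up stand-in, not the actual patch - but this is the kind of carry-over each reopen()/clone() implementation needs:

            public IndexReader reopen() throws IOException {
                IndexReader newReader = doReopen(); // whatever reopening the concrete class does
                // Carry the caller's setting over to the new instance:
                newReader.setDisableFakeNorms(getDisableFakeNorms());
                return newReader;
            }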

        Shon Vella added a comment -

        What should the transitive behavior of MultiReader, FilterReader, and ParallelReader be? I'm inclined to say they shouldn't pass through to their subordinate readers because they don't really "own" them.

        Michael McCandless added a comment -

        I'm inclined to say they shouldn't pass through to their subordinate readers because they don't really "own" them.

        I agree.

        Shon Vella added a comment -

        Updated patch that preserves the disableFakeNorms flag across clone() and reopen(), and applies the flag transitively to MultiSegmentReader.

        Michael McCandless added a comment -

        I tested this change on a Wikipedia index, with query "1", on a field
        that has norms.

        On Linux, JDK 1.6.0_13, I can see no performance difference (both get
        7.2 qps, best of 10 runs).

        On Mac OS X 10.5.6, I see some difference (13.0 vs 12.3, best of 10
        runs), but given the quirkiness I've seen in OS X's results not
        matching other platforms, I think we can disregard this.

        Also, given the performance gain one sees when norms are disabled, I
        think this is net/net a good change.

        We'll leave the default as false (for back compat), but this setting
        is deprecated with a comment that in 3.0 it hardwires to true.

        Michael McCandless added a comment -

        New patch attached:

        • Fixed contrib/instantiated & contrib/misc to pass if I change the
          default for disableFakeNorms to true (which we will hardwire in
          3.0)
        • Tweaked javadocs
        • Removed unused imports
        • Added CHANGES.txt entry

        I still need to review the rest of the patch...

        With this patch, all tests pass with the default set to false
        (back-compat). If I temporarily set it to true, all tests now pass,
        except back-compat (which is expected & fine).

        I had started down the path of having contrib/instantiated "respect"
        the disableFakeNorms setting, but rapidly came to realize how little I
        understand contrib/instantiated's code. So I fell back to fixing the
        unit tests to accept null returns from the normal
        IndexReader.norms(...).

        Michael McCandless added a comment -

        Attached patch.

        I also added "assert !getDisableFakeNorms();" inside SegmentReader.fakeNorms().

        Michael McCandless added a comment -

        OK patch looks good! I plan to commit in a day or two. Thanks Shon!

        Michael McCandless added a comment -

        Thanks Shon!


          People

          • Assignee:
            Michael McCandless
            Reporter:
            Shon Vella
          • Votes:
            0
            Watchers:
            0
