Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9
    • Component/s: core/search
    • Labels: None
    • Lucene Fields: New, Patch Available

      Description

      FieldCache should expose an expert-level API for runtime introspection of the FieldCache, providing info about what is in the FieldCache at any given moment (a rough sketch of such an API follows the list below). We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...

      • entries for the same reader/field with different types/parsers
      • entries for the same field/type/parser in a reader and its subreader(s)
      • etc...
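
      For illustration, client code using such an introspection API might look roughly like this (a sketch only; the exact class and method names are worked out in the attached patches):

          import org.apache.lucene.search.FieldCache;
          import org.apache.lucene.search.FieldCache.CacheEntry;

          public class DumpFieldCache {
            public static void main(String[] args) {
              // After running searches/sorts that populate the cache, ask the
              // FieldCache what it is currently holding.
              CacheEntry[] entries = FieldCache.DEFAULT.getCacheEntries();
              for (int i = 0; i < entries.length; i++) {
                // Each entry identifies the reader key, field name, cache type,
                // parser, and cached value.
                System.out.println(entries[i]);
              }
            }
          }
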
      Attachments

      1. LUCENE-1749.patch
        102 kB
        Hoss Man
      2. LUCENE-1749.patch
        100 kB
        Hoss Man
      3. LUCENE-1749.patch
        100 kB
        Hoss Man
      4. LUCENE-1749.patch
        99 kB
        Hoss Man
      5. LUCENE-1749.patch
        98 kB
        Hoss Man
      6. LUCENE-1749.patch
        91 kB
        Mark Miller
      7. LUCENE-1749.patch
        103 kB
        Mark Miller
      8. LUCENE-1749.patch
        106 kB
        Mark Miller
      9. LUCENE-1749.patch
        107 kB
        Mark Miller
      10. LUCENE-1749-hossfork.patch
        94 kB
        Hoss Man
      11. LUCENE-1749.patch
        87 kB
        Mark Miller
      12. LUCENE-1749.patch
        83 kB
        Mark Miller
      13. LUCENE-1749.patch
        60 kB
        Hoss Man
      14. LUCENE-1749.patch
        52 kB
        Hoss Man
      15. LUCENE-1749.patch
        50 kB
        Hoss Man
      16. LUCENE-1749.patch
        47 kB
        Hoss Man
      17. LUCENE-1749.patch
        31 kB
        Mark Miller
      18. fieldcache-introspection.patch
        24 kB
        Hoss Man


          Activity

          Hoss Man added a comment -

          The motivation for this issue is all of the changes coming in 2.9 in how Lucene internally uses the FieldCache API – the biggest change being per-segment sorting, but there may be others that are not immediately obvious.

          While these changes are backwards compatible from an API and functionality perspective, they could have some pretty serious performance impacts for existing apps that also use the FieldCache directly: after upgrading, those apps may suddenly seem slower to start (because of redundant FieldCache initialization) and require 2x as much RAM as they did before. This could lead people to assume Lucene has suddenly become a major memory hog. SOLR-1111 and SOLR-1247 are some quick examples of the types of problems that apps could encounter.

          Currently the only way for a user to even notice the problem is to do memory profiling, and the FieldCache data structure isn't the easiest to understand. It would be a lot nicer to have some methods for doing this inspection programmatically, so users could write automated tests for incorrect/redundant usage.
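
          Such an automated test might look roughly like this; a minimal sketch using the checkSanity/Insanity API this issue ends up adding (purgeAllCaches is assumed as the reset hook between tests):

              import junit.framework.TestCase;
              import org.apache.lucene.search.FieldCache;
              import org.apache.lucene.util.FieldCacheSanityChecker;
              import org.apache.lucene.util.FieldCacheSanityChecker.Insanity;

              public abstract class FieldCacheCheckingTestCase extends TestCase {
                protected void tearDown() throws Exception {
                  // Fail the test if the FieldCache accumulated redundant entries.
                  Insanity[] insanity = FieldCacheSanityChecker.checkSanity(FieldCache.DEFAULT);
                  assertEquals(java.util.Arrays.asList(insanity).toString(),
                               0, insanity.length);
                  FieldCache.DEFAULT.purgeAllCaches(); // assumption: clears state between tests
                  super.tearDown();
                }
              }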

          Hoss Man added a comment -

          Here's the start of a patch to provide this functionality – it just provides a new method/datastructure for inspecting the cache; the sanity checking utility methods should be straightforward assuming people think this is a good idea.

          The new method itself is fairly simple, but quite a bit of refactoring of how the caches are managed was necessary to make it possible to implement the method sanely. These changes to the FieldCache internals seem like a good idea from a maintenance standpoint even if people don't like the new method.

          Hoss Man added a comment -

          Technically this isn't a bug, so I probably shouldn't add it to the 2.9 blocker list, but I really think it would be a good idea to have something like this in the 2.9 release.

          At the very least: I'd like to put it on the list until/unless there is consensus that it's not needed.

          Mark Miller added a comment -

          Nice - it would be great if it could estimate RAM usage as well.

          Michael McCandless added a comment -

          +1 – this'd be great to get into 2.9.

          Uwe Schindler added a comment -

          Looks good as a start; one question about a comment:

          What do you mean with:

          • :TODO: is the "int" sort type still needed? ... doesn't seem to be used anywhere, code just tests "custom" for SortComparator vs Parser.

          I do not understand: do you want to remove the IntCache? What is different about it compared with the other ones?

          Uwe

          Hoss Man added a comment -

          :TODO: is the "int" sort type still needed? ... doesn't seem to be used anywhere, code just tests "custom" for SortComparator vs Parser.

          Sorry ... badly placed quotes ... that was in reference to Entry.type.

          Until I changed getStrings, getStringIndex, and getAuto to construct Entry objects as part of my refactoring, the "type" attribute (and the constructor that takes a type argument) didn't seem to be used anywhere (as far as I could tell).

          My guess: maybe some previous change refactored the logic that switched on type up into the SortFields, so the FieldCache no longer needs to care about it?

          Mark Miller added a comment -

          Here is a start towards guessing the FieldCache RAM usage.

          It probably works fairly well, though it will be limited by stack space on a very heavily nested object graph.

          I've added the size guess for getValue in the introspection output.

          It's a start, anyway.
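
          On the stack-space caveat: an explicit work queue avoids the recursion-depth limit. A purely illustrative sketch (it counts reachable objects rather than bytes, and is not the estimator in the patch):

              import java.lang.reflect.Array;
              import java.lang.reflect.Field;
              import java.lang.reflect.Modifier;
              import java.util.ArrayDeque;
              import java.util.Deque;
              import java.util.IdentityHashMap;
              import java.util.Map;

              public final class IterativeGraphWalker {
                // Walk an object graph iteratively so deep nesting can't overflow
                // the call stack; a real estimator would sum per-object byte
                // sizes here instead of just counting objects.
                public static int countObjects(Object root) {
                  if (root == null) return 0;
                  Map<Object, Boolean> seen = new IdentityHashMap<Object, Boolean>();
                  Deque<Object> work = new ArrayDeque<Object>();
                  work.push(root);
                  int n = 0;
                  while (!work.isEmpty()) {
                    Object o = work.pop();
                    if (seen.put(o, Boolean.TRUE) != null) continue; // already visited
                    n++;
                    Class<?> c = o.getClass();
                    if (c.isArray()) {
                      if (!c.getComponentType().isPrimitive()) {
                        for (int i = 0; i < Array.getLength(o); i++) {
                          Object e = Array.get(o, i);
                          if (e != null) work.push(e);
                        }
                      }
                      continue;
                    }
                    for (; c != null; c = c.getSuperclass()) {
                      Field[] fields = c.getDeclaredFields();
                      for (int i = 0; i < fields.length; i++) {
                        Field f = fields[i];
                        if (Modifier.isStatic(f.getModifiers()) || f.getType().isPrimitive()) continue;
                        f.setAccessible(true);
                        try {
                          Object v = f.get(o);
                          if (v != null) work.push(v);
                        } catch (IllegalAccessException ignored) {}
                      }
                    }
                  }
                  return n;
                }
              }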

          Mark Miller added a comment (edited) -

          We probably want to provide an alternate toString that includes the RAM guess and a default that skips it - I haven't tested performance, but it might take a while to check a gigantic String array.

          Also, JavaImpl should probably actually be JavaMemoryModel or MemoryModel.

          Hoss Man added a comment -

          More progress building on Mark's patch.

          Added some sanity checking that reader/fieldname combos aren't reused in odd ways – I made it ignore cases where different parsers ultimately resolve to identical cache objects (i.e. null vs DEFAULT_LONG_PARSER), and it ignores any CreationPlaceholder objects (not sure about that one).

          Some tests were modified to make their pathological behavior more "sane", and hooks were added so that future tests can bypass the sanity testing in the tearDown method if they really need to.

          Still need sanity testing of the reader/subreader variety, plus lots of docs and code cleanup.

          BTW: I was focused on test-core ... still waiting on test-contrib to finish running, so I'm not yet sure if I broke anything there.

          Hoss Man added a comment -

          Note to self: of the contribs, TestRemoteSort had two failed tests (not horribly surprising) and PatternAnalyzerTest generated an Error (?!?!) ...

          java.lang.IllegalStateException: termText
              at org.apache.lucene.index.memory.PatternAnalyzerTest.assertEquals(PatternAnalyzerTest.java:213)
              at org.apache.lucene.index.memory.PatternAnalyzerTest.run(PatternAnalyzerTest.java:148)
              at org.apache.lucene.index.memory.PatternAnalyzerTest.testMany(PatternAnalyzerTest.java:87)
          
          Hoss Man added a comment -

          Minor checkpoint: improved assert messages, and massaged TestRemoteSort so that it appears more sane.

          The problem with PatternAnalyzerTest was unrelated (see LUCENE-1756).

          Hoss Man added a comment -

          Hmmm... somehow I overlooked the fact that even after I "fixed" TestRemoteSort in the last patch, it's still failing. Here's the assertion failure...

          junit.framework.AssertionFailedError: testRemoteCustomSort Comparator: multi FieldCaches for same reader/fieldname with diff types
             at org.apache.lucene.util.LuceneTestCase.assertSaneFieldCaches(LuceneTestCase.java:110)
             at org.apache.lucene.search.TestRemoteSort.testRemoteCustomSort(TestRemoteSort.java:261)
             at org.apache.lucene.util.LuceneTestCase.runTest(LuceneTestCase.java:265)
          

          ...and here's the debugging dump of the FieldCache...

          *** BEGIN testRemoteCustomSort Comparator: FieldCache Losers ***
          'org.apache.lucene.index.DirectoryReader@1108727'=>'custom',interface java.lang.Comparable,9,org.apache.lucene.search.SampleComparable$2@651e95,null=>[Ljava.lang.Comparable;#22056753 size guess:2 KB
          'org.apache.lucene.index.DirectoryReader@1108727'=>'custom',interface java.lang.Comparable,9,org.apache.lucene.search.SampleComparable$2@5b78cf,null=>[Ljava.lang.Comparable;#32045680 size guess:2 KB
          *** END testRemoteCustomSort Comparator: FieldCache Losers ***
          *** BEGIN org.apache.lucene.search.TestRemoteSort.testRemoteCustomSort: FieldCache Losers ***
          'org.apache.lucene.index.DirectoryReader@1108727'=>'custom',interface java.lang.Comparable,9,org.apache.lucene.search.SampleComparable$2@651e95,null=>[Ljava.lang.Comparable;#22056753 size guess:2 KB
          'org.apache.lucene.index.DirectoryReader@1108727'=>'custom',interface java.lang.Comparable,9,org.apache.lucene.search.SampleComparable$2@5b78cf,null=>[Ljava.lang.Comparable;#32045680 size guess:2 KB
          *** END org.apache.lucene.search.TestRemoteSort.testRemoteCustomSort: FieldCache Losers ***
          

          What really confuses me about this is that the same SampleComparable instance is being used with two different queries – once with reverse=true and once with reverse=false – yet two different SampleComparable instances are showing up in cache keys. This probably only happens when SampleComparable is used to get a SortComparator – not when it uses a ComparatorSource earlier in the test.

          Is this a real bug in remote sorting?

          Mark Miller added a comment -

          Is this a real bug in remote sorting?

          I think so.

          SortField is constructed on the client side and passed as a param to the remote Searchable. It seems you get a new factory every time this happens, not the client's factory on the SortField (of course, because it's constructed on the other side of the wire).

          So every search call adds a new factory to the mix. If you just make the call once, the test will pass.

          It's a nice find.

          Mark Miller added a comment -

          Here is a patch that just updates the RAM usage estimator code.

          Paul Smith added a comment -

          You know what would be absolute icing on the cake here: some way, during introspection, for code to look for large sort fields that can be discarded/unloaded as needed (programmatically).

          What I'm thinking of here is a use case we've run into where we have had to sort by subject. The number of unique subjects gets pretty large, and while we still need to support the use case, it'd be nice to be able to periodically 'toss' sort fields like this so they don't hog memory permanently while the IndexReader is still in memory. (Sorting by subject is used, just not often, so it's a good candidate for tossing.)

          Because we have multiple large IndexReaders open concurrently, it'd be nice to be able to scan periodically and kick out any unneeded ones.

          It's nice to be able to inspect and print out these, but even better if one can make changes based on what one finds.

          Hoss Man added a comment -

          Mark: I have a little time to work on this today ... do you have any updates that you've been working on locally? (I noticed some patch add/retract from you in the history.)

          Paul: over in LUCENE-831 there was a lot of discussion and work done towards making the entire FieldCache internals pluggable so you could customize the cache behavior all sorts of ways ... I feel out of the loop on that issue, but my understanding is that it was pushed back to 3.1 at the earliest because it wasn't clear how the APIs should be set up given the work being done with reopen and with moving FieldCache usage down to the subreaders.

          For now my goal with this issue (LUCENE-1749) is purely to provide an experimental (i.e. no back-compat expectations) API for app developers to use to sanity check that the changes in 2.9 haven't blown their RAM usage sky high.

          Mark Miller added a comment -

          I do - I removed that last patch because I just realized that it was missing everything but one class.

          Go ahead though - I'll merge with what you have.

          Hoss Man added a comment -

          Uh ... ok ... what kind of updates do you have locally? (No point in merging later if I'm just going to write stuff you've already written.)

          Mark Miller added a comment -

          No worries, the updates are to the RAM estimator and other minor things (e.g. if something fails the sanity check, the error output comes out twice because of the double check in tearDown) - nothing feature-wise at the moment. I'll see what you add first.

          Hoss Man added a comment -

          Checkpoint: really hacky implementation of checkFieldCacheSubReaderSanity that tells you when a FieldCache contains entries for the same field in both wrapper and inner readers ... but it doesn't tell you which entries (and unlike checkFieldCacheTypeSanity, it's not obvious just from looking at the toString()).

          This patch causes an error in TestStressSort and LOTS of errors in TestCustomScoreQuery, TestFieldScoreQuery, and TestOrdValues.

          I'd like to think these errors are just from tests doing abnormal things, and in the case of TestStressSort that may be true (it looks like it has some hinky reuse of readers in a MultiReader), but in the other test cases, where it's a little easier to see at a glance what's going on, there's really nothing odd – simple use of a single IndexSearcher to execute a CustomScoreQuery, ValueSourceQuery, etc. These are each causing multiple FieldCache instances to pop up for a single field (one keyed on the DirectoryReader and another keyed on a CompoundFileReader$CSIndexInput).

          I'm trying to refactor the sanity checking methods so they are easier to use (and write some tests for them to prove they work as expected on both sane and insane input), but it would be helpful if someone who understands more about how FieldCaches should look (pushed into the subreaders) could try out this patch and let me know if these failures look legit.

          Hoss Man added a comment -

          Checkpoint: refactored the sanity checking code into a utility class and wrote tests specifically for it to prove it finds insane stuff.

          TODO:

          • clean up the API, make it less clunky (and not static)
            • return structured data showing exactly which combinations in the FieldCache are insane
          • javadocs
          • figure out why previously mentioned tests are breaking (need help with this one ... don't know enough about the code these tests exercise)
          Mark Miller added a comment -

          figure out why previously mentioned tests are breaking (need help with this one ... don't know enough about the code these tests exercise)

          Eh - it's yucky. There are parts where the tests are passing the top-level reader (say, to a collector) when they should be using the sub-readers. I fixed one, but then there is more - I looked at a couple of more difficult ones that also pass the top-level reader for the test.

          And then there is explain - IndexSearcher passes the top-level reader to the weight's explain, and ValueSourceQuery will get a FieldCache based on that reader. I guess that one is a bug.

          And there are probably a few other similar things...

          Mark Miller added a comment -

          And then there is explain - IndexSearcher passes the top-level reader to the weight's explain, and ValueSourceQuery will get a FieldCache based on that reader. I guess that one is a bug.

          I don't even know what to do about this one. All I can think of is that you pump out an explain for each sub-reader - but that's pretty unhelpful.

          Perhaps the best we can do is javadoc the extra requirements that may be needed when you use explain?

          Mark Miller added a comment -

          Updates:

          • merged in updated RAM usage estimator code
          • updated most failing tests to work without creating top-level FieldCaches
          • removed offending calls to explain - I left nocommit comments here; depending on what we decide, we could turn off the subreader check for these
          • turned off the subreader check for the stress sort test - it sorts in back-compat mode and compares to the new mode, so it loads both on purpose
          • I don't remember if I touched anything else

          Tests pass now.

          Hoss Man added a comment -

          Mark: thanks for looking into the tests.

          If the CustomScoreQuery class(es) push the FieldCache usage into the subreaders during scoring, then shouldn't the explain methods do the same thing? It definitely seems like a bug if getting a score explanation from a query causes your memory footprint to double.

          Last night I thought over what a more useful API for the sanity checker would look like ...
          My power is getting turned off for a few hours this afternoon, so I'll work on it then and should have a much cleaner-looking patch to post this evening.

          (BTW: random thought that occurred to me last night: wouldn't the simplest way to implement the RamEstimator just be to use vanilla Java serialization to a custom OutputStream that counts the bytes and sends them to /dev/null?)
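
          A minimal sketch of that byte-counting idea (illustrative names; not code from any attached patch):

              import java.io.IOException;
              import java.io.ObjectOutputStream;
              import java.io.OutputStream;
              import java.io.Serializable;

              public final class SerializedSizeEstimator {
                // Serialize to a stream that counts bytes and discards them.
                public static long estimate(Serializable obj) throws IOException {
                  final long[] count = new long[1];
                  OutputStream counter = new OutputStream() {
                    public void write(int b) { count[0]++; }
                    public void write(byte[] b, int off, int len) { count[0] += len; }
                  };
                  ObjectOutputStream oos = new ObjectOutputStream(counter);
                  oos.writeObject(obj);
                  oos.flush();
                  // Caveats (see Mark's reply below): class/field metadata is
                  // counted, transient fields are not.
                  return count[0];
                }
              }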

          Mark Miller added a comment -

          (BTW: random thought that occurred to me last night: wouldn't the simplest way to implement the RamEstimator just be to use vanilla java serialization to a custom OutputStream that just counted the bytes and sent them to /dev/null) ?

          That's one way to go. It's got its own little issues, though - some bookkeeping stuff is not serialized, and extra info about classes and fields is serialized. Transient fields (a niche issue, for sure) would also not be serialized. It's definitely another way to get an estimate. I chose a different route after considering both (googled the topic for a bit and looked at some examples before choosing). I'd be open to another route, but I thought this method was fairly fast, accurate, and generic.

          Mark Miller added a comment -

          If the CustomScoreQuery class(es) push the FieldCache sage into the subReaders during scoring, then shouldn't the explain methods do the same thing? it definitely seems like a bug if getting score explanation from a query causes your memory footprint to double.

          It should do the same thing - but that's sticky. If you push explain to the sub-readers, you will get why it scored as it did for each subreader - not one top-level explain. I won't deny it's kind of a bug - but I'm not sure at the moment what the best way to address it is. I'll look into the possibility of pushing the FieldCache access to the subreaders while leaving everything else at the top reader - I have no thoughts about the feasibility of that at the moment, though. I guess it might be doable.

          Mark Miller added a comment -

          Here is a rough draft for an explain fix.

          Explain for custom and value source queries now drops to per-segment to retrieve FieldCache values. Resolving this issue will also resolve LUCENE-1771.

          Mark Miller added a comment -

          This issue was a fantastic idea by the way!

          Hoss Man added a comment -

          This is a complete overhaul of the internals of FieldCacheSanityChecker and its API, so that it's a lot cleaner and easier to use programmatically.

          This also makes it easier for tests to run an analysis and then ignore the types of errors they "expect" without ignoring whole cache entries (so a test that expects to have subreader problems can ignore those, even if one of those cache entries also fails a sanity check against another entry for a different reason, i.e. a different parser on the same reader).

          And in the long run this should make it easier for us to add new types of sanity checks (which I'm guessing we'll think of when the internals get overhauled) without changing the API too much.

          NOTE: This is based on Miller's attachment #12414960 (29/Jul/09) ... I haven't merged in or looked at any of the changes he made after that. I suspect the only overlap (if any) is how CacheEntry uses the RAM estimation code ... I switched to having estimateSize(RamUsageEstimator) cache the value, and then toString uses it if it's there ... the FieldCacheSanityChecker takes care of calling it if the client code asks for it.

          Mark: Viv's got all weekend off, so I'm probably not going to have time to look at this again for 4 or 5 days; if you want to take a stab at merging the patches that would be seriously awesome.
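
          To make "easier to use programmatically" concrete, here is a sketch of a test that tolerates one expected insanity type while failing on everything else (the getType()/InsanityType names are assumed to match this patch):

              import org.apache.lucene.search.FieldCache;
              import org.apache.lucene.util.FieldCacheSanityChecker;
              import org.apache.lucene.util.FieldCacheSanityChecker.Insanity;
              import org.apache.lucene.util.FieldCacheSanityChecker.InsanityType;

              public class ExpectedSubReaderCheck {
                public static void main(String[] args) {
                  Insanity[] all = FieldCacheSanityChecker.checkSanity(FieldCache.DEFAULT);
                  int unexpected = 0;
                  for (int i = 0; i < all.length; i++) {
                    // Skip only the subreader duplication this test expects.
                    if (all[i].getType() != InsanityType.SUBREADER) {
                      unexpected++;
                      System.err.println(all[i]);
                    }
                  }
                  if (unexpected != 0)
                    throw new IllegalStateException(unexpected + " unexpected FieldCache problems");
                }
              }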

          Hoss Man added a comment -

          Quick responses to some other comments...

          I chose a different route after considering both

          I trust you to make the right call; just thought I'd point it out in case you hadn't thought of it.

          If you push explain to the sub readers, you will get why it scored as it did for each subreader - not one top level explain

          I don't really follow you on this (I need to take a look at your proposed fix) ... I'm not suggesting we push the whole explain down to the subreader, just that when the explain method wants to get the FieldCache value for a doc, it should fetch the FieldCache for the SegmentReader the doc is in – so it gets the exact same value (and same FieldCache entry) as the scoring code did when it scored the document. (Or maybe I'm completely misunderstanding how these classes were reimplemented to use segment-based field caches.)

          This issue was a fantastic idea by the way!

          Yeah ... I was pretty out of the loop on all the "push sorting down into the segment" discussion, but when I noticed Yonik pointing out all the ways Solr's FieldCache usage was going to explode if we didn't change it, it occurred to me that this would probably be a big problem for anyone doing non-trivial stuff with Lucene, so it would be nice to have a way to troubleshoot it. (I also had very little faith in Lucene-Java's test coverage, since we only have unit tests that verify "correct" behavior when we make changes – but nothing sanity-checks how that behavior happened. At least: not until now.)

          Mark Miller added a comment -

          I don't really follow you on this (I need to take a look at your proposed fix) ... I'm not suggesting we push the whole explain down to the subreader, just that when the explain method wants to get the FieldCache value for a doc, it should fetch the FieldCache for the SegmentReader the doc is in - so it gets the exact same value (and same FieldCache entry) as the scoring code did when it scored the document. (Or maybe I'm completely misunderstanding how these classes were reimplemented to use segment-based field caches.)

          The way the per-segment stuff went, we don't push down to the sub-readers for the FieldCache per se - we just search each sub-reader separately - so the per-reader FieldCache is just kind of a side effect. Then the top-level reader is still used for things like stats and explain.

          I switched the explain for the offending stuff (custom/valuesource) to use a DocValues class that does push down to each subreader for the FieldCache, though (while everything else still uses the top reader) - it's in the scorer, so I added a switch for whether to push down to subreaders for FieldCache access or not - only explain pushes down, while regular scoring doesn't (regular scoring will be working per sub-reader anyway, because they are searched one at a time).

          I can merge up the patches.

          Mark Miller added a comment -

          the explain method wants to get the FieldCache value for a doc, it should fetch the FieldCache for the SegmentReader the doc is in

          One more note to try and be a bit more clear:

          First, I wasn't sure how easy this was to do, because I don't know the explain code or the function package very well at all (e.g. I've never used it). And the explain method itself did not grab values from the FieldCache; it loaded up a scorer that did so. So I just wasn't sure how doable this fix was. That's why I was saying pushing explain down to the subreader wasn't great, but I wasn't sure what else you could do. The fix turned out to be fairly easy, though - the scorer for valuesource just needed two modes - one for normal scoring and one for explain (that breaks up the requests for a FieldCache value per sub-reader). The explain method would work either way, but there's no reason to try to break down per reader when it's going to score per reader anyway, so I have both. The standard scorer for valuesource works as it did, and explain trips a setting to break out subreaders and distribute FieldCache requests. And then the custom query needed a tweak to work right (flip that setting) with its underlying valuesource queries.

          Yonik Seeley added a comment -

          I believe that ConstantScoreQuery will need its explain() fixed too?

          Mark Miller added a comment (edited) -

          In the case that it is a caching filter? I hadn't actually looked to see if there are any other FieldCache ones either - just what tripped the tests.

          I guess it could be dealt with the same way? A DocIdSet that distributes to sub readers ...

          Mark Miller added a comment -
          • merged patches (and little tweaks to the explain fix code)
          • added a MultiDocIdSet for handling ConstantScoreQuery explain - first draft - needs some thought. I just banged it out. It's got a couple of simple tests.
          Mark Miller added a comment -

          Already finding some corner cases with the MultiDocIdSet stuff - I'll keep working along those lines a bit, then maybe look at some of the code you have been working on and post another patch later this weekend.

          Mark Miller added a comment -

          In the insanity check, when you drop into the sequential subreaders - I think it's got to be recursive - you might have a multi at the top with other subs, or any combo thereof. I can add it to the next patch.
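
          A sketch of that recursive descent (getSequentialSubReaders() is the 2.9 IndexReader API and returns null for leaf readers):

              import java.util.List;
              import org.apache.lucene.index.IndexReader;

              public class ReaderWalk {
                // Flatten arbitrarily nested composite readers to their leaves.
                public static void gatherSubReaders(List<IndexReader> out, IndexReader r) {
                  IndexReader[] subs = r.getSequentialSubReaders();
                  if (subs == null) {
                    out.add(r); // a leaf, e.g. a SegmentReader
                  } else {
                    for (int i = 0; i < subs.length; i++) {
                      // Recurse: a sub may itself be a MultiReader wrapping more subs.
                      gatherSubReaders(out, subs[i]);
                    }
                  }
                }
              }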

          Michael McCandless added a comment -

          This was an excellent idea, and it's great that it uncovered some
          dangerous and very unexpected places where we are passing top-level
          reader to the FieldCache (eg that explain() could suddenly populate
          the FieldCache w/ top-level values is quite shocking!).

          ReaderUtil.subSearcher is doing the same thing as
          DirectoryReader.readerIndex.

          I love the RAMUsageEstimator... we have other places that estimate RAM
          (eg IndexWriter does so for added & deleted docs) that we should
          eventually cutover to this new API.

          I particularly love the new class named Insanity:

            public static Insanity[] checkSanity(FieldCache cache)
          

          MultiDocIdSet/Iterator makes me a bit nervous, because it's further
          "propogating" a non-segment-based iterator deeper into Lucene than I
          think we want to. It's similar to eg using
          DirectoryReader.MultiTermDocs (what Lucene used to do), instead of
          stepping through the segments yourself.

          Also, shouldn't explain most closely match what was done during
          searching (ie, run "per segment")? So simply pushing explain down to
          the sub-reader that has the doc seems appropriate? Ie we want it to
          share as much of the code path as possible with how searching was in
          fact done?

          EG for ConstantScoreQuery.explain, it seems like we should 1) locate
          the sub-reader that this doc falls in, and 2) get a scorer against
          that reader, then 3) build up the explanation from that? Likewise for
          CustomScoreQuery?

          In fact.... maybe we should simply fix IndexSearcher.explain to do
          this for all queries? Ie, get the top-level weight, locate sub-reader
          that has the doc, un-base the doc, and then invoke QueryWeight.explain
          with that sub-reader and un-based doc? Then we don't have to do
          anything special for each query. I think QueryWeight.scorer()
          shouldn't be expected to handle a "top level reader" being passed in.
          Ie, higher up in Lucene we should do that switch, so that we don't
          have to do it (this "valuesFromSubReaders" arg) for every scorer.
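
          A sketch of that idea inside IndexSearcher (subReaders/docStarts are
          assumed fields holding the sub-readers and their doc offsets;
          ReaderUtil.subIndex is assumed to map a doc id to its segment slot,
          per the ReaderUtil/readerIndex note above; Weight stands in for the
          thread's QueryWeight):

              import java.io.IOException;
              import org.apache.lucene.index.IndexReader;
              import org.apache.lucene.search.Explanation;
              import org.apache.lucene.search.Weight;
              import org.apache.lucene.util.ReaderUtil;

              public class PerSegmentExplain {
                // Locate the sub-reader holding doc, un-base the doc into that
                // segment's local doc space, and explain there; the same path
                // searching itself takes.
                public static Explanation explain(Weight weight, IndexReader[] subReaders,
                                                  int[] docStarts, int doc) throws IOException {
                  int i = ReaderUtil.subIndex(doc, docStarts);
                  return weight.explain(subReaders[i], doc - docStarts[i]);
                }
              }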

          Hmm: why do we even have explain at both the QueryWeight and Scorer
          "levels"? It seems like we should pick one level and do it there,
          consistently. Most queries seem to only implement the QueryWeight one
          and often simply throw UOE in the Scorer's explain, but eg PhraseQuery
          implements in both places.

          (BTW: I'll be offline for approx the next 36 hours or so!)

          Yonik Seeley added a comment -

          In fact.... maybe we should simply fix IndexSearcher.explain to do this for all queries?

          That was my first thought... but it would probably break more than it helped right now (by exposing more limitations) - for example idf in TermWeight.explain()

          Mark Miller added a comment -

          Ie we want it to share as much of the code path as possible with how searching was in fact done

          Well of course I was a bit hazy on explain, so for some reason I had it in my head that you would have to combine the explanations from multiple subreaders - but of course it's a doc at a time, so the doc will only come from one subreader, and the sim/weight will be top level. So easy peasy fix. That boolean valuesFromSubReaders def had some code smell - just didn't have an alternative at the moment - fix then improve!

          I'll leave the 'explain at multiple levels' for another issue - I haven't even started thinking about this issue yet - I prefer to code. Which is kind of an oxymoron.

          i don't have the code in front of me, but i thought i was adding the sub
          readers to the list it's iterating over, so it will eventually recurse all
          the way to the bottom.

          Ah right, sorry about the false alarm. One of the few times I've seen .size() in a for loop where it's actually needed
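          For reference, a minimal sketch of the pattern in question (illustrative
          names, not the actual patch code): the list being iterated grows while it
          is walked, so .size() must be re-evaluated on each pass and the loop
          naturally recurses down to the leaf segment readers.

          // gather every reader, including nested sub-readers
          List readers = new ArrayList();
          readers.add(topReader);
          for (int i = 0; i < readers.size(); i++) { // size() grows as we go
            IndexReader r = (IndexReader) readers.get(i);
            IndexReader[] subs = r.getSequentialSubReaders();
            if (subs != null) {                      // null for leaf readers
              for (int j = 0; j < subs.length; j++) {
                readers.add(subs[j]);                // extends this iteration
              }
            }
            // ... inspect r's FieldCache entries here ...
          }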

          Mark Miller added a comment -

          changes to just go per reader for each doc - and a couple other unrelated tiny tweaks.

          Mark Miller added a comment -

          Patch cleanup - more suitable for browsing.

          Mark Miller added a comment -

          I was a bit hazy on explain, so for some reason I had it in my head that you would have to combine the explanations from multiple subreaders

          but it would probably break more than it helped right now (by exposing more limitations) - for example idf in TermWeight.explain()

          To be a little more clear - this was originally why I went the direction I did: I assumed the reader was being used for stats that needed to come from the top-level reader. That was my gut reaction on seeing it go into the scorer. I hadn't really checked, and at least for these queries that wasn't the case - they just use it for the filter/FieldCache.

          Michael McCandless added a comment -

          That was my first thought... but it would probably break more than it helped right now (by exposing more limitations) - for example idf in TermWeight.explain()

          Ugh, you're right.

          I think it shouldn't be doing that? Ie, a Weight instance should
          "capture" all stats needed from the top-level searcher, on creation,
          and then when we ask for a scorer or an explain (or other future
          things that take an IndexReader) we should always pass in a single
          segment reader. This way we don't have to duplicate the "go find the
          right sub-reader" in many places.
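          In code form the idea would look roughly like this (a hedged sketch
          only - subReaders and docStarts are assumed fields, and this is not
          the actual patch):

          static Explanation explainPerSegment(QueryWeight weight,
                                               IndexReader[] subReaders,
                                               int[] docStarts,
                                               int doc) throws IOException {
            // locate the sub-reader that contains this top-level doc id
            int i = subReaders.length - 1;
            while (i > 0 && doc < docStarts[i]) {
              i--;
            }
            // "un-base" the doc and explain against the single segment
            return weight.explain(subReaders[i], doc - docStarts[i]);
          }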

          It's interesting that we didn't (I think?) have a similar problem w/
          scorer when we switched to passing it the sub-reader.

          I'll leave the 'explain at multiple levels' for another issue

          It looks like it's up to each query, which level does what.
          IndexSearcher's explain always calls Weight.explain, but then some
          Query impls (eg BooleanQuery) do everything in Weight.explain, while
          others (eg TermQuery, PhraseQuery) do some work in Weight.explain and
          some in the scorer.

          I guess this makes sense: "atomic" Queries (TermQuery, PhraseQuery)
          will need to fire up a scorer since there's real work to be done to
          see the specifics of how that doc was matched. Whereas BooleanQuery simply
          "glues" together other queries so it doesn't need to forward to its
          [many] scorers.

          So the only odd thing is why explain is part of Scorer base
          class... seems like the method could/should live "privately" to only
          those queries that need it.

          But I agree let's leave this be for now...

          Yonik Seeley added a comment -

          It's interesting that we didn't (I think?) have a similar problem w/ scorer when we switched to passing it the sub-reader.

          Right - that code was well tested and exercised via MultiSearcher in the past (all idf values had to come from Weight to avoid getting idfs per sub-searcher).
          One thing that's missing for explain() is that there is no way to get "df" as opposed to "idf" from the Weight.

          So the only odd thing is why explain is part of Scorer base class

          Right.. it doesn't belong there. Perhaps deprecate and remove from the Scorer base in 3.0? (since one can't reliably call it now anyway).

          Mark Miller added a comment -

          I've got one more draft here with the smallest of tweaks - javadoc spelling errors, and perhaps one or two other tiny things - stuff I would just toss out rather than merge - but are you doing anything here right now Hoss? I think not at the moment, so if that's the case I'll put up one more patch before you grab the conch back. Otherwise I'll hold off on anything till you put something up.

          Chris Hostetter added a comment -

          : I've got one more draft here with the smallest of tweaks - javadoc
          : spelling errors, and one perhaps one or two other tiny things - stuff I
          : just would toss out rather than merge - but are you doing anything here
          : right now Hoss? I think not at the moment, so if thats the case I'll put
          : up one more patch before you grab the conch back. Otherwise I'll hold
          : off on anything till you put something up.

          you have the conch ... i haven't worked on anything related to this issue
          since my last patch.

          i'll try to look at it again tomorrow.

          -Hoss

          Michael McCandless added a comment -

          Right - that code was well tested and exercised via MultiSearcher in the past (all idf values had to come from Weight to avoid getting idfs per sub-searcher).

          Ahh right.

          One thing that's missing for explain() is that there is no way to get "df" as opposed to "idf" from the Weight.

          But this only affects the "atomic" queries, right? So eg TermWeight
          could simply hold onto this value and then use it during explain.
          Hmm... though TermQuery's ctor doesn't get the df directly, because it
          calls similarity.idf(term, searcher). I don't really like making a
          separate additional call to docFreq.

          How about, for queries that need to go and look up docFreq, their
          QueryWeight impls simply hold onto the [top-level] IndexSearcher that
          had been passed to their ctor, and then do the docFreq call against
          that, if explain is invoked?
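          Roughly (a hedged sketch with made-up names, not TermQuery's actual
          TermWeight):

          class SketchTermWeight {
            private final Searcher topSearcher; // captured at creation
            private final Term term;
            private final float idf;

            SketchTermWeight(Searcher searcher, Term term, Similarity sim)
                throws IOException {
              this.topSearcher = searcher;
              this.term = term;
              this.idf = sim.idf(term, searcher); // stats from the top level
            }

            int dfForExplain() throws IOException {
              // the extra docFreq call is only paid if explain is invoked
              return topSearcher.docFreq(term);
            }
          }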

          Right.. it doesn't belong there. Perhaps deprecate and remove from the Scorer base in 3.0? (since one can't reliably call it now anyway).

          +1

          Hoss/Mark do you want to fold it in to the patch, here? Or I can open
          a new issue?

          Yonik Seeley added a comment -

          How about, for queries that need to go and look up docFreq, their QueryWeight impls simply hold onto the [top-level] IndexSearcher that
          had been passed to their ctor, and then do the docFreq call against
          that, if explain is invoked?

          Asking the searcher for the docFreq is the right thing to do... but people who rely on Weight being serializable might be in for a nasty surprise.
          Of course... one might wonder if we should bother supporting serializable in Lucene longer term at all - anyone dealing with distributed systems has found it to have too many shortcomings anyway.

          Hoss Man added a comment -

          General Comments on mark's latest patch...

          • the changes that i understand all seem good ... some of the details in reader/searcher/query internals elude me but it sounds like Yonik & McCandless have their eyes on them so i trust the three of you have it covered.
          • we still need to fill in some empty/sparse javadocs, but that can be done after an initial commit.
          • is it a bug that AverageGuessMemoryModel.getSize() will NPE on a non-primitive class ... or should/will the docs for that API say it only works on primitives?

          Big Questions I Still Have....

          • does anyone have any reservations about the new APIs introduced? (a usage sketch follows this list)
          • FieldCache.CreationPlaceholder (promoted from FieldCacheImpl)
          • FieldCache.CacheEntry
          • FieldCache.getCacheEntries()
          • FieldCache.purgeAllCaches()
          • FieldCacheSanityChecker
          • RamUsageEstimator
          • does anyone have any reservations about the refactoring done in FieldCacheImpl to make this new API possible? (ie: did i break the thread safety in a way i'm not noticing?)
          • is the FieldCacheImpl.Entry.type (the "SortField" int type) still needed by FieldCacheImpl.Entry? ... nothing seems to use it so it would be nice to eliminate it and simplify the CacheEntry API as well. (i suspect it got refactored into obsolescence when the Sorting got moved into the subreaders)
          • The sanity checking ignores CreationPlaceholder – largely because of the way the numeric caches first try one parser, and then if they get an NFE try a different parser – but this leaves the CreationPlaceholder in the cache. It's not a big object, so i assume it was implemented this way on purpose and the sanity checker is doing the correct thing by ignoring it, but i wanted to make sure people are aware/ok with this behavior.
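          A usage sketch for the proposed API (hedged - written against the patch
          as described in this issue, so exact class placement and signatures may
          differ):

          static void assertSaneFieldCaches(String msg) {
            FieldCacheSanityChecker.Insanity[] insanity =
                FieldCacheSanityChecker.checkSanity(FieldCache.DEFAULT);
            try {
              if (insanity.length != 0) {
                for (int i = 0; i < insanity.length; i++) {
                  System.err.println(msg + ": " + insanity[i]); // each bad entry
                }
                throw new AssertionError(msg + ": " + insanity.length
                                         + " insane FieldCache usage(s)");
              }
            } finally {
              FieldCache.DEFAULT.purgeAllCaches(); // keep later tests independent
            }
          }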

          Lastly: This patch feels unnecessarily large at this point. Several of the bugs/improvements we've uncovered here don't seem like they belong in this patch, and should be tracked in separate Jira issues, which can be committed independently and enumerated in CHANGES.txt....

          • new ReaderUtil and the usage in DirectoryReader, MultiReader, MultiSearcher & IndexSearcher
          • explain subreader bug fixes in ConstantScoreQuery, QueryWeight, ValueSourceQuery, CustomScoreQuery, etc...
            ...i think this issue (and this patch) should be reduced to just the new sanity checking API, and tests that have been changed to be more sane (where the underlying code was already fine)

          Mark: would you mind splitting up the latest patch you have (you mentioned some additional minor tweaks) and opening new issues for these peripheral changes and then attaching back what's left for this patch. Then I'll take the conch back and work on the missing javadocs.

          (I'll happily commit once i get at least one thumbs up from someone on the "Big Questions" above ... we can always tweak the javadocs further in subsequent commits)

          Hoss Man added a comment -

          Hoss/Mark do you want to fold it in to the patch, here? Or I can open a new issue?

          as i alluded to above, i'm in favor of individual issues for each "bug" uncovered by this issue so they can be tracked separately.

          Mark Miller added a comment -

          Mark: would you mind splitting up the latest patch you have (you mentioned some additional minor tweaks) and opening new issues for these peripheral changes and then attaching back what's left for this patch.

          I've already got separate issues and patches up where it makes sense (not the last one Mike mentions) - I wanted to keep them here too though until the insanity tests were complete - the tests showing the fixes are somewhat correct are in this patch, and I don't like to manage layers of patches. If we don't plan on doing any more with the insanity tests here though, I'll spin them out of this patch now.

          I'll put up one more version and then you can have it back.

          Yonik Seeley added a comment -

          Asking the searcher for the docFreq is the right thing to do... but people who rely on Weight being serializable might be in for a nasty surprise.

          If we decide not to ask weight to hang onto its searcher, then the other way to do it right is to change explain() to accept a Searcher as well as an IndexReader.

          Mark Miller added a comment -

          is it a bug that AverageGuessMemoryModel.getSize() will NPE on a non primitive class ... or should/will the docs for that API say it only works on primitives?

          It's only meant to work with primitives. I'll change the name to getPrimitiveSize - on my last pass through, I'll also review the javadoc for the classes I added.
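          (For reference, a hedged sketch of why a primitive-only size table NPEs
          on other classes - illustrative code, not the actual
          AverageGuessMemoryModel internals:)

          static final Map SIZES = new IdentityHashMap();
          static {
            SIZES.put(int.class, new Integer(4));
            SIZES.put(long.class, new Integer(8));
            // ... other primitives ...
          }
          static int getSize(Class clazz) {
            // get() returns null for any non-primitive (e.g. String.class),
            // and unboxing that null is the NPE in question
            return ((Integer) SIZES.get(clazz)).intValue();
          }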

          Michael McCandless added a comment -

          Asking the searcher for the docFreq is the right thing to do... but people who rely on Weight being serializable might be in for a nasty surprise.

          Argh, right.

          If we decide not to ask weight to hang onto it's searcher, then the other way to do it right is to change explain() to accept a Searcher as well as a IndexReader.

          +1

          Of course... one might wonder if we should bother supporting serializable in Lucene longer term at all - anyone dealing with distributed systems has found it to have too many shortcomings anyway.

          Yeah this was never really "settled" in LUCENE-1473. Lucene currently
          supports live serialization, but not cross-version
          serialization... and we have moved RemoteSearchable to contrib and
          removed RMI from Searchable.

          Does Solr ever rely on Lucene's "implements Serializable"?

          Michael McCandless added a comment -

          Hoss/Mark do you want to fold it in to the patch, here? Or I can open a new issue?

          as i alluded to above, i'm in favor of individual issues for each "bug" uncovered by this issue so they can be tracked separately.

          OK I'll open a new issue for this one (deprecate Scorer.explain).

          Yonik Seeley added a comment -

          Does Solr ever rely on Lucene's "implements Serializable"?

          Nope - intra-node communications use the same mechanism as clients... generic data structures (Map, List, Document, etc.) with custom serialization to XML, JSON, Python, Ruby or Binary (and binary is now used by default).

          Mark Miller added a comment -

          patch is coming soon - I've merged to trunk and pulled the separate issues - just want to look it over a bit more later. Would have had this sooner, but Eclipse decided to start crashing every 5 minutes this morning because firefox brought in a new xulrunner and ... ugg - at least it's not Windows ... coming though.

          Mark Miller added a comment -

          I still haven't looked at this in the detail that I want to, but time is my enemy at the moment, so take it back and at least you can finish up what you have planned.

          Hopefully it's all in good working order after all the extracting and merging to trunk - let me know if you see anything off and I'll spin another.

          Mark Miller added a comment -

          P.S. I'm not sure we want to go with the way I have changed the tests here.

          I switched things to go per subreader rather than use the overall reader - this is how things happen in IndexSearcher now. But we lose the top level reader test. We might want to do it both ways, and when doing it by top reader, ignore the triggered insanity check?

          Hoss Man added a comment -

          Mark: I'll start working on improving the docs (and other things from my previous todo list)

          P.S. I'm not sure we want to go with the way I have changed the tests here.

          Are you talking about TestOrdValues and TestFieldScoreQuery?

          if we expect OrdValues and FieldScoreQuery to use subReader based field caches, then the test seems to be doing the correct thing (in your patch) .. inspecting the fieldcaches per subreader. Is there a code path where we expect those methods to get called on a MultiReader?

          (Actually: that seems like a worthwhile improvement to make to these classes: a MultiDocValues impl that all of the getValues(IndexReader) methods use when passed a MultiReader ... it uses getSequentialSubReaders to construct DocValue instances for each so you don't get FieldCache explosions if code inadvertently passes the wrong reader to getValues. What do you think? ... new issue?)
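          (A hedged sketch of that idea - getSegmentValues and MultiDocValues
          below are hypothetical names, not part of any patch:)

          public DocValues getValues(IndexReader reader) throws IOException {
            IndexReader[] subs = reader.getSequentialSubReaders();
            if (subs == null) {
              return getSegmentValues(reader);   // leaf reader: normal path
            }
            DocValues[] perSeg = new DocValues[subs.length];
            for (int i = 0; i < subs.length; i++) {
              perSeg[i] = getValues(subs[i]);    // recurse per segment
            }
            return new MultiDocValues(perSeg);   // hypothetical composite
          }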

          Hoss Man added a comment -

          Hmmmm...

          actually mark, testing out your latest patch against the trunk i'm seeing (FieldCache sanity) failures from TestCustomScoreQuery, TestFieldScoreQuery, TestOrdValues, and TestSort ... have you seen these? did some other recent change on the trunk trigger these?

          Mark Miller added a comment -

          I think that TestCustomScoreQuery, TestFieldScoreQuery, and TestOrdValues all fail because the fix for them is now in another issue.

          TestSort I didn't notice. It looks like it's considering String[] and StringIndex the same for the two multi and parallel sort tests - merged to trunk, so perhaps something has gone awry there? I've looked over the patch and I don't see any obvious mistake - I don't know that I have time to dig more now, but since you are more familiar with that code anyway, perhaps you can tell me why it's now considering them the same? Otherwise I will look more before too long.

          Mark Miller added a comment -

          Here is the output - it appears to think String[] and StringIndex are both string:

          VALUEMISMATCH: Multiple distinct value objects for org.apache.lucene.index.CompoundFileReader$CSIndexInput@56d73c7a+string
          'org.apache.lucene.index.CompoundFileReader$CSIndexInput@56d73c7a'=>'string',class org.apache.lucene.search.FieldCache$StringIndex,3,null,null=>org.apache.lucene.search.FieldCache$StringIndex#279807577 (size =~ 152 bytes)
          'org.apache.lucene.index.CompoundFileReader$CSIndexInput@56d73c7a'=>'string',class java.lang.String,11,null,null=>[Ljava.lang.String;#647057258 (size =~ 108 bytes)
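          (For context, a hedged sketch of the access pattern that produces a
          report like the one above - the same reader+field cached once as a
          StringIndex and once as a String[]:)

          // sorting by natural order caches a StringIndex ...
          FieldCache.StringIndex byOrd =
              FieldCache.DEFAULT.getStringIndex(reader, "string");
          // ... while sorting with a Locale caches a String[] for the same
          // reader+field, which checkSanity flags as VALUEMISMATCH
          String[] byLocale = FieldCache.DEFAULT.getStrings(reader, "string");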

          Mark Miller added a comment -

          (Actually: that seems like a worthwhile improvement to make to these classes: a MultiDocValues impl that all of the getValues(IndexReader) methods use when passed a MultiReader ... it uses getSequentialSubReaders to construct DocValue instances for each so you don't get FieldCache explosions if code inadvertently passes the wrong reader to getValues. What do you think? ... new issue?)

          Very interesting idea - def a new issue I think. Not sure it's worth it if you can't protect general fieldcache access as well though ...

          Hoss Man added a comment -

          checkpoint: no functional change from mark's previous patch, just improved all the javadocs, including an explanation of the SanityChecker's purpose and experimental/expert warnings where appropriate.

          Mark Miller added a comment -

          Okay, sorry - I messed up when merging with trunk.

          In TestSort you had moved the Locale sorting to the bottom in the multi sort test - I kept that, but I also kept them higher up. So that's the fail - they just have to be removed.

          Lines 953-957 it looks like - sorry about that - just didn't notice it fail with the other 3.

          Hoss Man added a comment -

          I think that TestCustomScoreQuery, TestFieldScoreQuery, and TestOrdValues all fail because the fix for them is now in another issue.

          ah ... are you talking about LUCENE-1771? (the jira dependency seems backwards in that case)

          In TestSort you had moved the Locale sorting to the bottom in the multi sort test - I kept that, but I also kept them higher up. So that's the fail - they just have to be removed.

          yeah .. i just caught that and was starting to reply ... the interesting thing is that the CacheEntry.toString() doesn't show that Locale.US was used when getting the String[] FieldCache ... i'm currently trying to figure out why (because that could confuse people as well)

          Mark Miller added a comment -

          Right - I set that up when the code was in this issue - reversed now!

          Hoss Man added a comment -

          the interesting thing is that the CacheEntry.toString() doesn't show that Locale.US was used when getting the String[] FieldCache

          I'm an idiot ... the Locale isn't used like a FieldCache Parser ... the same String[] is used regardless of the Locale, so it's never part of the CacheKey. the output is correct.

          revised patch fixes TestSort as mark pointed out, and updates some javadocs where i misleadingly suggested different Locales might trigger InsanityType.VALUEMISMATCH

          Hoss Man added a comment -

          confirmed that patch in LUCENE-1771 fixes the remaining broken tests in this issue.

          Michael McCandless added a comment -

          Maybe we should simply print a warning, eg to System.err, on detecting that 2X RAM usage has occurred, pointing people to the sanity checker? We could eg do it once only so we don't spam the stderr logs...

          Hoss Man added a comment -

          Maybe we should simply print a warning, eg to System.err, on detecting that 2X RAM usage has occurred, pointing people to the sanity checker? We could eg do it once only so we don't spam the stderr logs

          I'm not really comfortable dumping anything to System.err without the user requesting it ... but this is a really interesting idea. (I suppose we could add an infoStream type idea to FieldCache to expose this)

          FieldCacheImpl.Cache.get could use the FieldCacheSanityChecker to inspect itself immediately after calling createValue, and could even test if any of the Insanity instances returned are related to the current call (by comparing the CacheEntry with the Entry it's using) ... it could even log a useful stack trace since the sanity check would be happening in the same call stack as at least one of the CacheEntries in the Insanity object.

          I've opened LUCENE-1798 to track implementing something like this once the FieldCacheSanityChecker gets committed.
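          (Sketching the idea for LUCENE-1798 - infoStream and the surrounding
          method shape are assumptions about a future patch, not committed API:)

          Object value = createValue(reader, key);    // the expensive step
          if (infoStream != null) {                   // opt-in, not System.err
            FieldCacheSanityChecker.Insanity[] insanity =
                FieldCacheSanityChecker.checkSanity(FieldCache.DEFAULT);
            for (int i = 0; i < insanity.length; i++) {
              // logged in the same call stack that created the entry, so a
              // stack trace here would point at the offending caller
              infoStream.println("WARNING: " + insanity[i]);
            }
          }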

          Hoss Man added a comment -

          slightly revised patch based on java-dev@lucene discussion...

          the sortFieldType and Locale portions of Cache.Entry are never used by FieldCache – just a deprecated class that abuses the Entry api out of laziness... so the CacheEntry debugging abstraction shouldn't expose them (but i left in code to manifest them in the toString() if they are atypical just in case). Also added some deprecation notices so we remember to remove them once they are no longer needed.

          Hoss Man added a comment -

          updated patch to trunk (QueryWeight->Weight) and tweaked some FieldCacheImpl methods to use the non-deprecated Entry constructors (forgot that part before)

          I'll commit as soon as my test run is finished.

          Hoss Man added a comment -

          one last update: the Locale.US asserts in TestRemoteSort had the same problem as TestSort, they were supposed to be moved, but instead they were just copied (not sure how i missed that before)

          Hoss Man added a comment -

          Committed revision 803676.


            People

            • Assignee:
              Hoss Man
            • Reporter:
              Hoss Man
            • Votes:
              1
            • Watchers:
              2
