Solr
SOLR-5122

spellcheck.collateMaxCollectDocs estimates seem to be meaningless -- can lead to "ArithmeticException: / by zero"

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.4
    • Fix Version/s: 4.5, 6.0
    • Component/s: None
    • Labels:
      None

      Description

      As part of SOLR-4952, SpellCheckCollatorTest started using RandomMergePolicy, and this (apparently) led to a failure in testEstimatedHitCounts.

      As far as I can tell, the test assumes that specific values will be returned as the estimated "hits" for a collation; it appears that the change in MergePolicy resulted in different segments with different term stats, causing the estimation code to produce different values than expected.

      I made a quick attempt to improve the test to:

      • expect explicit exact values only when spellcheck.collateMaxCollectDocs is set such that the "estimate" should actually be exact (i.e. collateMaxCollectDocs == 0, or collateMaxCollectDocs greater than the number of docs in the index)
      • randomize the values used for collateMaxCollectDocs and confirm that the estimates are never more than the number of docs in the index

      This led to an odd "ArithmeticException: / by zero" error in the test, which seems to suggest that there is a genuine bug in the code for estimating the hits that only gets tickled by certain mergepolicy/segment/collateMaxCollectDocs combinations.

      Update: This appears to be a general problem with collecting docs out of order and the estimation of hits. I believe that even when there is no divide-by-zero error, the estimates are largely meaningless, since the docs are collected out of order.

      1. SOLR-5122.patch
        17 kB
        Hoss Man
      2. SOLR-5122.patch
        6 kB
        Hoss Man
      3. SOLR-5122.patch
        6 kB
        Hoss Man

        Issue Links

          Activity

          Hoss Man added a comment -

          Attached patch applied to trunk r1511141 produces the following error...

             [junit4]   2> 6500 T10 C1 oasc.SolrCore.execute [collection1] webapp=null path=null params={spellcheck=true&spellcheck.dictionary=direct&spellcheck.count=1&spellcheck.collate=true&spellcheck.maxCollationTries=1&spellcheck.maxCollations=1&spellcheck.collateExtendedResults=true&qt=spellCheckCompRH&q=teststop%3Ametnoia&spellcheck.collateMaxCollectDocs=5} hits=0 status=500 QTime=25 
             [junit4]   2> 6501 T10 oasc.SolrException.log ERROR REQUEST FAILED: spellcheck=true&spellcheck.dictionary=direct&spellcheck.count=1&spellcheck.collate=true&spellcheck.maxCollationTries=1&spellcheck.maxCollations=1&spellcheck.collateExtendedResults=true&qt=spellCheckCompRH&q=teststop%3Ametnoia&spellcheck.collateMaxCollectDocs=5:java.lang.ArithmeticException: / by zero
             [junit4]   2> 		at org.apache.solr.spelling.SpellCheckCollator.collate(SpellCheckCollator.java:153)
             [junit4]   2> 		at org.apache.solr.handler.component.SpellCheckComponent.addCollationsToResponse(SpellCheckComponent.java:229)
             [junit4]   2> 		at org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:196)
             [junit4]   2> 		at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
             [junit4]   2> 		at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
             [junit4]   2> 		at org.apache.solr.core.SolrCore.execute(SolrCore.java:1845)
             [junit4]   2> 		at org.apache.solr.util.TestHarness.query(TestHarness.java:292)
             [junit4]   2> 		at org.apache.solr.util.TestHarness.query(TestHarness.java:274)
             [junit4]   2> 		at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:609)
             [junit4]   2> 		at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:602)
             [junit4]   2> 		at org.apache.solr.spelling.SpellCheckCollatorTest.testEstimatedHitCounts(SpellCheckCollatorTest.java:475)
          ...
             [junit4]   2> 6501 T10 oas.SolrTestCaseJ4.tearDown ###Ending testEstimatedHitCounts
             [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=SpellCheckCollatorTest -Dtests.method=testEstimatedHitCounts -Dtests.seed=16B4D8F74E59EE10 -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.locale=nl -Dtests.timezone=America/Dawson -Dtests.file.encoding=US-ASCII
             [junit4] ERROR   0.35s | SpellCheckCollatorTest.testEstimatedHitCounts <<<
             [junit4]    > Throwable #1: java.lang.RuntimeException: Exception during query
             [junit4]    > 	at __randomizedtesting.SeedInfo.seed([16B4D8F74E59EE10:270F66C2EB66FEC0]:0)
             [junit4]    > 	at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:635)
             [junit4]    > 	at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:602)
             [junit4]    > 	at org.apache.solr.spelling.SpellCheckCollatorTest.testEstimatedHitCounts(SpellCheckCollatorTest.java:475)
             [junit4]    > 	at java.lang.Thread.run(Thread.java:724)
             [junit4]    > Caused by: java.lang.ArithmeticException: / by zero
             [junit4]    > 	at org.apache.solr.spelling.SpellCheckCollator.collate(SpellCheckCollator.java:153)
             [junit4]    > 	at org.apache.solr.handler.component.SpellCheckComponent.addCollationsToResponse(SpellCheckComponent.java:229)
             [junit4]    > 	at org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:196)
             [junit4]    > 	at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
             [junit4]    > 	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
             [junit4]    > 	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1845)
             [junit4]    > 	at org.apache.solr.util.TestHarness.query(TestHarness.java:292)
             [junit4]    > 	at org.apache.solr.util.TestHarness.query(TestHarness.java:274)
             [junit4]    > 	at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:609)
             [junit4]    > 	... 42 more
          
          Hoss Man added a comment -

          The problematic line is where the EarlyTerminatingCollectorException is caught and the estimate is computed from the last doc id collected...

          hits = maxDocId / ((etce.getLastDocId() + 1) / docCollectionLimit);
          

          Unless I'm missing something, the problem comes up when (etce.getLastDocId() + 1) < docCollectionLimit, because then the integer division results in 0, which then becomes the denominator under maxDocId.
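          The failure mode described above can be sketched as a standalone snippet. This is a hypothetical illustration, not the Solr source; the method name and parameters mirror the quoted line from SpellCheckCollator but are assumptions for demonstration.

          ```java
          public class EstimateSketch {
              // Mirrors the estimate as quoted from SpellCheckCollator.collate():
              //   hits = maxDocId / ((etce.getLastDocId() + 1) / docCollectionLimit);
              // If (lastDocId + 1) < docCollectionLimit, the inner integer division
              // truncates to 0 and the outer division throws ArithmeticException.
              static int estimate(int maxDocId, int lastDocId, int docCollectionLimit) {
                  return maxDocId / ((lastDocId + 1) / docCollectionLimit);
              }

              public static void main(String[] args) {
                  // Safe case: (9 + 1) / 5 = 2, so 100 / 2 = 50
                  System.out.println(estimate(100, 9, 5));
                  // Failing case: (2 + 1) / 5 truncates to 0, then 100 / 0 throws
                  try {
                      estimate(100, 2, 5);
                  } catch (ArithmeticException e) {
                      System.out.println("ArithmeticException: " + e.getMessage());
                  }
              }
          }
          ```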

          It would be trivial to toss another "1 +" in there to eliminate the divide by zero, but I'm confused about the basic assumption taking place here; it smells fishy. Making any estimation based on getLastDocId() seems useful only if we know docs are being collected in order, and when the collateMaxCollectDocs option was added in r1479638, it did force in-order collection when using the early termination...

          https://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/java/org/apache/solr/spelling/SpellCheckCollator.java?r1=1479638&r2=1479637&pathrev=1479638

          ...but in r1479645 that use of FORCE_INORDER_COLLECTION was eliminated with the message "removing dead code"...

          https://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/java/org/apache/solr/spelling/SpellCheckCollator.java?r1=1479645&r2=1479644&pathrev=1479645

          But without FORCE_INORDER_COLLECTION, I don't see how any estimation based on the lastDocId can ever be meaningful.

          James Dyer can you take a look at this?

          Hoss Man added a comment -

          Reviewing the comments in SOLR-3240, I think I just figured out the "remove dead code" comment...

          I'm also thinking I can safely get rid of the "forceInorderCollection" flag because requesting docs sorted by doc-id would enforce the same thing, right?

          ...I don't think this assumption is valid. I don't think using the docid sort option affects the order in which collectors receive docs; it's just used to register a SortField using SortField.Type.DOC, which isn't used until after the collector collects "all" of the docs.

          So I think we need to add FORCE_INORDER_COLLECTION back in.

          Hoss Man added a comment -

          FYI: I attempted to do a simple revert of r1479645 and the test still fails. Reviewing the diff, I think that's because nothing seems to pay attention to the FORCE_INORDER_COLLECTION flag at collection time, so it's effectively useless.

          I'm at a loss to really understand what the correct fix should be at this point.

          ASF subversion and git services added a comment -

          Commit 1511539 from hossman@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1511539 ]

          SOLR-5122: disble testEstimatedHitCounts until issue with inordered collection can be dealt with

          ASF subversion and git services added a comment -

          Commit 1511540 from hossman@apache.org in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1511540 ]

          SOLR-5122: disble testEstimatedHitCounts until issue with inordered collection can be dealt with (merge r1511539)

          James Dyer added a comment -

          Hoss,

          I appreciate your reporting this & taking care of it as much as possible. Do you know a failing seed for this test offhand? (I've been away for a while and might not have the jenkins log easily available.) I will look at this. Likely I need to require docs to be collected in order and mistakenly thought this was unnecessary.

          Hoss Man added a comment -

          The initial jenkins failure i saw was "At revision 1511278"...

          https://builds.apache.org/job/Lucene-Solr-NightlyTests-trunk/343/
          https://mail-archives.apache.org/mod_mbox/lucene-dev/201308.mbox/%3Calpine.DEB.2.02.1308070919170.13959@frisbee%3E

          I can reproduce this – it's probably related to the MP randomization i
          put in ... looks like it's doing exact numeric comparisons based on term
          stats. I'll take a look later today...

          ant test -Dtestcase=SpellCheckCollatorTest -Dtests.method=testEstimatedHitCounts -Dtests.seed=16B4D8F74E59EE10 -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.locale=nl -Dtests.timezone=America/Dawson -Dtests.file.encoding=US-ASCII

          ...regardless of the initial failure though, if you try out the patch I attached to try to improve the test coverage, the "reproduce" line from the failure I posted along with that patch still reproduces on trunk (but you do have to manually uncomment the @Ignore)...

          ant test  -Dtestcase=SpellCheckCollatorTest -Dtests.method=testEstimatedHitCounts -Dtests.seed=16B4D8F74E59EE10 -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.locale=nl -Dtests.timezone=America/Dawson -Dtests.file.encoding=US-ASCII
          
          Hoss Man added a comment -

          Updated the patch to trunk and included the commenting out of the @Ignore, so all you need to do is apply this patch to reproduce with the previously mentioned seed.

          James Dyer added a comment -

          The scenarios tested in testEstimatedHitCounts() seem to always pick a collector that does not accept docs out of order ("TopFieldCollector$OneComparatorNonScoringCollector"). The problem is that when a new segment/scorer is set, we get a new set of doc ids. So prior to random merges, the test naively assumed everything was on 1 segment. Now, with multiple segments, all bets are off and I don't think we can estimate hits.

          I think the best fix is to dial back the functionality here and not offer hit estimates at all. The functionality would still be beneficial in cases where the user does not require hit counts to be returned at all (for instance, ~rmuir mentioned using this feature with suggesters).

          Another option is to add together the doc ids for the various scorers that are looked at and pretend this is your max doc id. I'm torn here, because I'd hate to remove functionality that has been released, but on the other hand, if it is always going to give lousy estimates, then why fool people?

          Thoughts?

          Hoss Man added a comment -

          So prior to random merges, the test naively assummed everything was on 1 segment. Now with multiple, all bets are off and I don't think we can be estimating hits.

          I'm not following you here – why don't you think the basic approach to estimation can still work?

          the only missing pieces seem to be that when an estimation is requested:

          • docs must be collected in order – a property that forces this behavior from EarlyTerminatingCollector.acceptsDocsOutOfOrder regardless of what the delegate cares about should do the trick.
          • lastDocId needs to be absolute, not per-segment – which could be done by tracking the reader offsets in EarlyTerminatingCollector.setNextReader and using that offset when assigning lastDocId in EarlyTerminatingCollector.collect

          ...and that should make it work as you previously designed it ... right?
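          The two missing pieces above can be sketched as a standalone toy class. This is a hypothetical model of the fix being proposed, not the Solr patch: the class and method names mirror Lucene's 4.x Collector contract (setNextReader, acceptsDocsOutOfOrder, collect), but the implementation here is simplified for illustration.

          ```java
          public class EarlyTerminatingSketch {
              // Analogue of EarlyTerminatingCollectorException carrying the last doc id.
              static class EarlyTermination extends RuntimeException {
                  final int lastDocId;
                  EarlyTermination(int lastDocId) { this.lastDocId = lastDocId; }
              }

              private final int limit;   // collateMaxCollectDocs analogue
              private int collected = 0;
              private int docBase = 0;   // absolute offset of the current segment

              EarlyTerminatingSketch(int limit) { this.limit = limit; }

              // Analogue of setNextReader(AtomicReaderContext): record the reader offset.
              void setNextReader(int segmentDocBase) { this.docBase = segmentDocBase; }

              // Force in-order collection regardless of the delegate, so lastDocId
              // is monotonic within a segment.
              boolean acceptsDocsOutOfOrder() { return false; }

              // Analogue of collect(int doc): doc is segment-relative, so the
              // absolute id is docBase + doc.
              void collect(int doc) {
                  collected++;
                  if (collected >= limit) {
                      throw new EarlyTermination(docBase + doc);
                  }
              }

              public static void main(String[] args) {
                  EarlyTerminatingSketch c = new EarlyTerminatingSketch(3);
                  try {
                      c.setNextReader(0);   // first segment starts at absolute doc 0
                      c.collect(0);
                      c.collect(1);
                      c.setNextReader(10);  // second segment starts at absolute doc 10
                      c.collect(4);         // absolute id 10 + 4 = 14 triggers termination
                  } catch (EarlyTermination e) {
                      System.out.println("lastDocId=" + e.lastDocId);
                  }
              }
          }
          ```

          Without the docBase bookkeeping, the same run would report lastDocId=4, which is why a per-segment id makes the estimate meaningless across multiple segments.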

          Hoss Man added a comment -

          Here's a patch that improves EarlyTerminatingCollector to keep track of the size of each reader it collects against, so that it can derive some meaning from the docIds it collects. As part of this patch I eliminated the use of the "lastDocId" to discourage people from reading specific meaning into it; instead, the EarlyTerminatingCollectorException now just reports the number of docs "collected" out of the total number of docs "scanned". The result is that the collector doesn't really care which order it gets the AtomicReaderContexts in; however, it still has to force documents to be collected in order, so that they will be in order within a single reader and the stats for that reader can be meaningful.

          The patch includes the previous tests, plus a new test loop verifying that we get a reasonably accurate estimate from a term that is in every other doc in the index.

          James Dyer - does this look right to you? Does it address your concerns about keeping the estimation code in place?
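          The "collected out of scanned" estimate described above can be sketched as follows. This is a hypothetical reconstruction of the idea, not the committed code; the method name and exact formula are assumptions based on the comment's description.

          ```java
          public class HitEstimateSketch {
              // collected: docs that matched within the scanned prefix of the index
              // scanned:   total docs examined before early termination
              // maxDoc:    total docs in the index
              static int estimateHits(int collected, int scanned, int maxDoc) {
                  if (scanned >= maxDoc) {
                      return collected;  // we examined everything: the count is exact
                  }
                  // Extrapolate the observed hit density over the whole index.
                  // scanned is at least 1 once any doc has been examined, and long
                  // math avoids intermediate overflow, so no divide-by-zero risk.
                  return (int) (((long) collected * maxDoc) / scanned);
              }

              public static void main(String[] args) {
                  // A term in every other doc: 5 hits in the first 10 of 100 docs
                  System.out.println(estimateHits(5, 10, 100));   // prints 50
                  // Scanned the whole index: exact count, not an estimate
                  System.out.println(estimateHits(7, 100, 100));  // prints 7
              }
          }
          ```

          Note how this sidesteps both problems from the original formula: there is no nested integer division that can truncate to zero, and per-segment doc ids never enter the arithmetic.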

          James Dyer added a comment -

          Hoss,

          This looks reasonable to me. The test is more forgiving of variations caused by random merges, there's no more integer division, etc. I appreciate your working on this, as I wouldn't have much time until next week. I think your method of estimating the hits should be at least as good as what I attempted to do before.

          ASF subversion and git services added a comment -

          Commit 1514402 from hossman@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1514402 ]

          SOLR-5122: Fixed bug in spellcheck.collateMaxCollectDocs. Eliminates risk of divide by zero, and makes estimated hit counts meaningful in non-optimized indexes.

          ASF subversion and git services added a comment -

          Commit 1514408 from hossman@apache.org in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1514408 ]

          SOLR-5122: Fixed bug in spellcheck.collateMaxCollectDocs. Eliminates risk of divide by zero, and makes estimated hit counts meaningful in non-optimized indexes. (merge r1514402)

          Hoss Man added a comment -

          Committed revision 1514402.
          Committed revision 1514408.

          ASF subversion and git services added a comment -

          Commit 1514494 from hossman@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1514494 ]

          SOLR-5122: fix javadocs

          ASF subversion and git services added a comment -

          Commit 1514504 from hossman@apache.org in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1514504 ]

          SOLR-5122: fix javadocs (merge r1514494)

          Adrien Grand added a comment -

          4.5 release -> bulk close


            People

            • Assignee:
              Hoss Man
              Reporter:
              Hoss Man
            • Votes:
              0
              Watchers:
              3
