Uploaded image for project: 'Solr'
  1. Solr
  2. SOLR-11454

relax newSearcher based time checks in SoftAutoCommitTest

Details

    • Improvement
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • None
    • None

    Description

      The point of SoftAutoCommitTest is to ensure that auto-commits fire when expected.

      The timing based checks on those autocommits are a semi-neccessary evil to ensure that the test doesn't get false positives due to some other commits.
      The test also sanity checks that auto-commits result in newSearcher events if/when they should – but these also (currently) have timing checks ot ensure that they happen "fast enough" ... this seems unneccessary (given the purpose of hte test) and broken (there's no guarantee/expectation how fast a searcher will open, even though the test assumes it will be a number relative to the autocommit setting.

      we should relax these assertions, and just ensure that the searcher eventually opens in a non-absurd amount of time, not fail if it isn't some specific math function relative to other events.

      Attachments

        1. SoftAutoCommitTest.jenkins.178.txt
          9.76 MB
          Chris M. Hostetter

        Issue Links

          Activity

            Here's an example of a jenkins failure (the full logs are no longer available)...

            https://builds.apache.org/job/Lucene-Solr-Tests-7.x/163/

            FAILED:  org.apache.solr.update.SoftAutoCommitTest.testSoftAndHardCommitMaxTimeMixedAdds
            
            Error Message:
            hard529 was before searcher529: 6264322578200422 !<= 6264322016290268
            
            Stack Trace:
            java.lang.AssertionError: hard529 was before searcher529: 6264322578200422 !<= 6264322016290268
                    at __randomizedtesting.SeedInfo.seed([995B4A163C40F912:C88FB3968D33C9B5]:0)
                    at org.junit.Assert.fail(Assert.java:93)
                    at org.junit.Assert.assertTrue(Assert.java:43)
                    at org.apache.solr.update.SoftAutoCommitTest.testSoftAndHardCommitMaxTimeMixedAdds(SoftAutoCommitTest.java:230)
            
            

            The intent here is to "verify" that the searcher was open because of the soft commit – not the hard commit. But there is no garuntee that a newSearcher event triggered by the (auto)SoftCommit will finish before a subsequent hard commit.

            we should remove this (type of) check and instead either:

            • wait for an extended period after the hard commit to assert there isn't a second newSearcher
            • add a varient of these test methods where hardAutoCommit is disabled to ensure the searcher we eventually get is in fact because of the (auto)softCommit
              • it would probably be pretty easy to parameterize the softCommitWaitMillis and hardCommitWaitMillis in methods, so a variant could use hardCommitWaitMillis=-1, and then do the hardCommit assertions conditionally if and only if 0 < hardCommitWaitMillis
            hossman Chris M. Hostetter added a comment - Here's an example of a jenkins failure (the full logs are no longer available)... https://builds.apache.org/job/Lucene-Solr-Tests-7.x/163/ FAILED: org.apache.solr.update.SoftAutoCommitTest.testSoftAndHardCommitMaxTimeMixedAdds Error Message: hard529 was before searcher529: 6264322578200422 !<= 6264322016290268 Stack Trace: java.lang.AssertionError: hard529 was before searcher529: 6264322578200422 !<= 6264322016290268 at __randomizedtesting.SeedInfo.seed([995B4A163C40F912:C88FB3968D33C9B5]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.solr.update.SoftAutoCommitTest.testSoftAndHardCommitMaxTimeMixedAdds(SoftAutoCommitTest.java:230) The intent here is to "verify" that the searcher was open because of the soft commit – not the hard commit. But there is no garuntee that a newSearcher event triggered by the (auto)SoftCommit will finish before a subsequent hard commit. we should remove this (type of) check and instead either: wait for an extended period after the hard commit to assert there isn't a second newSearcher add a varient of these test methods where hardAutoCommit is disabled to ensure the searcher we eventually get is in fact because of the (auto)softCommit it would probably be pretty easy to parameterize the softCommitWaitMillis and hardCommitWaitMillis in methods, so a variant could use hardCommitWaitMillis=-1, and then do the hardCommit assertions conditionally if and only if 0 < hardCommitWaitMillis

            Here's another type of failure, full jenkins log attached as SoftAutoCommitTest.jenkins.178.txt

            https://builds.apache.org/job/Lucene-Solr-Tests-7.x/178/

            Error Message:
            searcher529 wasn't soon enough after soft529: Took 1143ms, >= acceptable 300ms (fudge)
            
            Stack Trace:
            java.lang.AssertionError: searcher529 wasn't soon enough after soft529: Took 1143ms, >= acceptable 300ms (fudge)
                    at __randomizedtesting.SeedInfo.seed([37A167B6A7C72C70:F0EDDF2BBC6FE1C0]:0)
                    at org.junit.Assert.fail(Assert.java:93)
                    at org.junit.Assert.assertTrue(Assert.java:43)
                    at org.apache.solr.update.SoftAutoCommitTest.testSoftAndHardCommitMaxTimeDelete(SoftAutoCommitTest.java:366)
            

            In this case we can see from the logs that the Opening [Searcher@6f2d6505[collection1] main] happend the exact same millisecond that the soft commit started & ended – but because the event listeners don't fire until the end of the event, this assertion considers the newSearcher too slow (relative to how long the softCommit itself took)

            we should remove this variable "fudge" and just assert that the searcher opened eventually.

            hossman Chris M. Hostetter added a comment - Here's another type of failure, full jenkins log attached as SoftAutoCommitTest.jenkins.178.txt https://builds.apache.org/job/Lucene-Solr-Tests-7.x/178/ Error Message: searcher529 wasn't soon enough after soft529: Took 1143ms, >= acceptable 300ms (fudge) Stack Trace: java.lang.AssertionError: searcher529 wasn't soon enough after soft529: Took 1143ms, >= acceptable 300ms (fudge) at __randomizedtesting.SeedInfo.seed([37A167B6A7C72C70:F0EDDF2BBC6FE1C0]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.solr.update.SoftAutoCommitTest.testSoftAndHardCommitMaxTimeDelete(SoftAutoCommitTest.java:366) In this case we can see from the logs that the Opening [Searcher@6f2d6505[collection1] main] happend the exact same millisecond that the soft commit started & ended – but because the event listeners don't fire until the end of the event, this assertion considers the newSearcher too slow (relative to how long the softCommit itself took) we should remove this variable "fudge" and just assert that the searcher opened eventually.

            NOTE: I initially filed this as a sub-task of SOLR-11053, but in hindsight i think that was a mistake. It's a general improvement that doesn't relate to if/when SoftAutoCommitTest is better then (Hard)AutoCommitTest – so i've converted to a top level Improvement task

            hossman Chris M. Hostetter added a comment - NOTE: I initially filed this as a sub-task of SOLR-11053 , but in hindsight i think that was a mistake. It's a general improvement that doesn't relate to if/when SoftAutoCommitTest is better then (Hard)AutoCommitTest – so i've converted to a top level Improvement task

            People

              Unassigned Unassigned
              hossman Chris M. Hostetter
              Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

                Created:
                Updated: