Solr
  1. Solr
  2. SOLR-7990

timeAllowed is returning wrong results on the same query submitted with different timeAllowed limits

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 5.2.1, 5.4, 6.0
    • Fix Version/s: 5.3.1
    • Component/s: None
    • Labels:
      None

      Description

      William Bell raised a question on the user's list. The scenario is
      > send a query that exceeds timeAllowed
      > send another identical query with larger timeAllowed that does NOT time out

      The results from the second query are not correct, they reflect the doc count from the first query.

      It apparently has to do with filter queries being inappropriately created and re-used. I've attached a test case that illustrates the problem.

      There are three tests here.
      testFilterSimpleCase shows the problem.

      testCacheAssumptions is my hack at what I think the states of the caches should be, but has a bunch of clutter so I'm Ignoring it for now. This should be un-ignored and testFilterSimpleCase removed when there's any fix proposed. The assumptions may not be correct though.

      testQueryResults shows what I think is a problem, the second call that does NOT exceed timeAllowed still reports partial results.

      1. SOLR-7990_filterFix.patch
        1 kB
        Yonik Seeley
      2. SOLR-7990.patch
        17 kB
        Yonik Seeley
      3. SOLR-7990.patch
        15 kB
        Yonik Seeley
      4. SOLR-7990.patch
        9 kB
        Yonik Seeley
      5. SOLR-7990.patch
        2 kB
        Yonik Seeley
      6. SOLR-7990.patch
        9 kB
        Erick Erickson
      7. SOLR-7990.patch
        11 kB
        Erick Erickson

        Activity

        Hide
        Erick Erickson added a comment -

        Failing test cases, no fixes though.

        Show
        Erick Erickson added a comment - Failing test cases, no fixes though.
        Hide
        Erick Erickson added a comment -

        testQueryResults was a bad test, didn't test for null so that test was inappropriately failing when looking for partial results after the second search

        Show
        Erick Erickson added a comment - testQueryResults was a bad test, didn't test for null so that test was inappropriately failing when looking for partial results after the second search
        Hide
        Yonik Seeley added a comment - - edited

        Here's a patch that should fix the filter issue.

        The bad code was in getDocSetNC (I have since moved that code to DocSetUtil) that caught the exception and logged a warning but did nothing else.

        Since we don't know the context that the DocSet will be used in, we can't catch the exception there. It may be cached later. It may even be negated - there are multiple places this can happen, such as a pure-negative filter query. In that case, terminating early could result in matching documents that would otherwise have never matched!

        Here's a simple match removing the exception swallow.

        edit: this isn't a complete patch - I haven't run tests, and I don't know if this breaks other tests.

        Show
        Yonik Seeley added a comment - - edited Here's a patch that should fix the filter issue. The bad code was in getDocSetNC (I have since moved that code to DocSetUtil) that caught the exception and logged a warning but did nothing else. Since we don't know the context that the DocSet will be used in, we can't catch the exception there. It may be cached later. It may even be negated - there are multiple places this can happen, such as a pure-negative filter query. In that case, terminating early could result in matching documents that would otherwise have never matched! Here's a simple match removing the exception swallow. edit: this isn't a complete patch - I haven't run tests, and I don't know if this breaks other tests.
        Hide
        Erick Erickson added a comment -

        Yonik Seeley Cool, I just assigned it to myself to not lose track it, wasn't going to be getting around to it for a while. Want me to run with it?

        Show
        Erick Erickson added a comment - Yonik Seeley Cool, I just assigned it to myself to not lose track it, wasn't going to be getting around to it for a while. Want me to run with it?
        Hide
        Yonik Seeley added a comment -

        Yeah sure, I was just running the full tests, and it does look like some fail now. Not sure if it's a test bug or if we're missing a high level catch of this exception...

        Show
        Yonik Seeley added a comment - Yeah sure, I was just running the full tests, and it does look like some fail now. Not sure if it's a test bug or if we're missing a high level catch of this exception...
        Hide
        Erick Erickson added a comment -

        I don't particularly have any investment in owning this, mostly volunteering to do the scutwork......

        Show
        Erick Erickson added a comment - I don't particularly have any investment in owning this, mostly volunteering to do the scutwork......
        Hide
        Erick Erickson added a comment -

        The test I added for this condition still fails though for a start.. testSimpleFilterCase...

        Show
        Erick Erickson added a comment - The test I added for this condition still fails though for a start.. testSimpleFilterCase...
        Hide
        Yonik Seeley added a comment -

        This patch makes all the tests pass and should fix the issue with a partial filter being cached.

        Still needs a test that fails w/o it though. I'll also try to look at why your test fails.

        Show
        Yonik Seeley added a comment - This patch makes all the tests pass and should fix the issue with a partial filter being cached. Still needs a test that fails w/o it though. I'll also try to look at why your test fails.
        Hide
        Yonik Seeley added a comment -

        Actually, I may just try to rewrite the ExitableDirectoryReaderTest, as that test uses no caches and hence can't test if it's caching any partial results.

        Show
        Yonik Seeley added a comment - Actually, I may just try to rewrite the ExitableDirectoryReaderTest, as that test uses no caches and hence can't test if it's caching any partial results.
        Hide
        Erick Erickson added a comment -

        Right, I ripped this test off ExitableDirectoryReaderTest just to get a fast-n-dirty isolation of the issue, see the nocommit comments .

        So yeah, piggy-backing this off ExitableDirectoryTest seems like A Good Thing.

        Show
        Erick Erickson added a comment - Right, I ripped this test off ExitableDirectoryReaderTest just to get a fast-n-dirty isolation of the issue, see the nocommit comments . So yeah, piggy-backing this off ExitableDirectoryTest seems like A Good Thing.
        Hide
        Yonik Seeley added a comment -

        Here's the latest patch with the rewritten ExitableDirectoryReaderTest. It fails without the patch and passes with it.

        Show
        Yonik Seeley added a comment - Here's the latest patch with the rewritten ExitableDirectoryReaderTest. It fails without the patch and passes with it.
        Hide
        Yonik Seeley added a comment -

        OK, I figured out why your test was failing... there were no name:a* docs because you overwrote them all with name:b* docs in createIndex

        Show
        Yonik Seeley added a comment - OK, I figured out why your test was failing... there were no name:a* docs because you overwrote them all with name:b* docs in createIndex
        Hide
        Erick Erickson added a comment - - edited

        Seems I'm the statue more often than the pigeon lately....

        assertU(adoc("id", Integer.toString(idx), "name", "b" + idx + NUM_DOCS));

        should have been

        assertU(adoc("id", Integer.toString(idx + NUM_DOCS), "name", "b" + idx));

        Oops.

        Show
        Erick Erickson added a comment - - edited Seems I'm the statue more often than the pigeon lately.... assertU(adoc("id", Integer.toString(idx), "name", "b" + idx + NUM_DOCS)); should have been assertU(adoc("id", Integer.toString(idx + NUM_DOCS), "name", "b" + idx)); Oops.
        Hide
        Yonik Seeley added a comment -

        I'm working on test fixes... my test changes were no good since the timeAllowed does not include query parsing time (which is where the time goes in the sleep function).

        Show
        Yonik Seeley added a comment - I'm working on test fixes... my test changes were no good since the timeAllowed does not include query parsing time (which is where the time goes in the sleep function).
        Hide
        Yonik Seeley added a comment -

        Here's the final patch.
        I also incorporated parts of Erick's test and modified the delay component to take a parameter that tells it how long to sleep.

        Show
        Yonik Seeley added a comment - Here's the final patch. I also incorporated parts of Erick's test and modified the delay component to take a parameter that tells it how long to sleep.
        Hide
        Yonik Seeley added a comment -

        Oops, forgot to take care of the cloud version of the test. Here's another patch.

        Show
        Yonik Seeley added a comment - Oops, forgot to take care of the cloud version of the test. Here's another patch.
        Hide
        ASF subversion and git services added a comment -

        Commit 1702015 from Yonik Seeley in branch 'dev/trunk'
        [ https://svn.apache.org/r1702015 ]

        SOLR-7990: don't swallow ExitingReaderException

        Show
        ASF subversion and git services added a comment - Commit 1702015 from Yonik Seeley in branch 'dev/trunk' [ https://svn.apache.org/r1702015 ] SOLR-7990 : don't swallow ExitingReaderException
        Hide
        ASF subversion and git services added a comment -

        Commit 1702016 from Yonik Seeley in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1702016 ]

        SOLR-7990: don't swallow ExitingReaderException

        Show
        ASF subversion and git services added a comment - Commit 1702016 from Yonik Seeley in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1702016 ] SOLR-7990 : don't swallow ExitingReaderException
        Hide
        Yonik Seeley added a comment -

        Merging back to 5.3.1 had conflicts, so needed some backport... running full tests now.

        Show
        Yonik Seeley added a comment - Merging back to 5.3.1 had conflicts, so needed some backport... running full tests now.
        Hide
        ASF subversion and git services added a comment -

        Commit 1702037 from Yonik Seeley in branch 'dev/branches/lucene_solr_5_3'
        [ https://svn.apache.org/r1702037 ]

        SOLR-7990: don't swallow ExitingReaderException

        Show
        ASF subversion and git services added a comment - Commit 1702037 from Yonik Seeley in branch 'dev/branches/lucene_solr_5_3' [ https://svn.apache.org/r1702037 ] SOLR-7990 : don't swallow ExitingReaderException

          People

          • Assignee:
            Yonik Seeley
            Reporter:
            Erick Erickson
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development