Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 7.0, 6.3.1
    • Fix Version/s: 7.0, 6.4
    • Component/s: None
    • Labels:
    • Lucene Fields:
      New, Patch Available

      Description

      Currently DrillSideways implementation is based on the single threaded IndexSearcher.search(Query query, Collector results).

      On large document set, the single threaded collection can be really slow.

      The ParallelDrillSideways implementation could:

      1. Use the CollectionManager based method IndexSearcher.search(Query query, CollectorManager collectorManager) to get the benefits of multithreading on index segments,
      2. Compute each DrillSideway subquery on a single thread.

      1. LUCENE-7588.patch
        63 kB
        Emmanuel Keller
      2. lucene-7588-sort-fix.patch
        3 kB
        Emmanuel Keller

        Activity

        Hide
        ekeller Emmanuel Keller added a comment -

        Hi, I was wondering what is the current process for this kind of patch proposal. I suppose there is a review process. Let me know how I can help. Thanks.

        Show
        ekeller Emmanuel Keller added a comment - Hi, I was wondering what is the current process for this kind of patch proposal. I suppose there is a review process. Let me know how I can help. Thanks.
        Hide
        erickerickson Erick Erickson added a comment -

        Well, for DrillSideways I'd ping Michael McCandless for the quickest read if he has the time.

        Basically you gently prompt the JIRA from time to time and see if you can get someone's attention.

        Perhaps a short description of the approach the patch takes would help orient someone who's looking at it.

        Show
        erickerickson Erick Erickson added a comment - Well, for DrillSideways I'd ping Michael McCandless for the quickest read if he has the time. Basically you gently prompt the JIRA from time to time and see if you can get someone's attention. Perhaps a short description of the approach the patch takes would help orient someone who's looking at it.
        Hide
        mikemccand Michael McCandless added a comment -

        Sorry, I have been meaning to have a look at this cool idea/patch, and what you've done (open issue, put patch up, gently nudge) is exactly the right process! Thank you Emmanuel Keller ... I'll have a look soon.

        Show
        mikemccand Michael McCandless added a comment - Sorry, I have been meaning to have a look at this cool idea/patch, and what you've done (open issue, put patch up, gently nudge) is exactly the right process! Thank you Emmanuel Keller ... I'll have a look soon.
        Hide
        ekeller Emmanuel Keller added a comment - - edited

        Thanks for your feedback guys, it's pretty clear. FYI, the patch includes unit tests derived from the already existing test on facets.

        Show
        ekeller Emmanuel Keller added a comment - - edited Thanks for your feedback guys, it's pretty clear. FYI, the patch includes unit tests derived from the already existing test on facets.
        Hide
        mikemccand Michael McCandless added a comment -

        Thanks Emmanuel Keller: this is an impressive change!

        Can you add a minimal javadocs to ParallelDrillSideways, and include @lucene.experimental?

        Can you fix the indent to 2 spaces, and change your IDE to not use
        wildcard imports? (Most of the new classes seem to do so, but at
        least one didn't). Or we can fix this up before pushing...

        Should CallableCollector be renamed to CallableCollectorManager?

        I assume you're using this for your QWAZR search server built on lucene (https://github.com/qwazr/QWAZR)? Thank you for giving back!

        There are quite a few new abstractions here,
        MultiCollectorManager, FacetsCollectorManager; must they be
        public? Can you explain what they do?

        It seems like this change opens up concurrency in 2 ways; the first
        way is it uses the IndexSearcher.search API that takes a
        CollectorManager such that if you had created that
        IndexSearcher with an executor, you get concurrency across the
        segments in the index. In general I'm not a huge fan of this
        concurrency since you are at the whim of how the segments are
        structured, and, confusingly, running forceMerge(1) on your index
        removes all concurrency. But it's better than nothing: progress not
        perfection!

        The second way is that the new ParallelDrillSideways takes its own
        executor and then runs the N DrillDown queries concurrently (to
        compute the sideways counts), which is very different from the current
        doc-at-a-time computation. Have you compared the performance, using a
        single thread? ... I'm curious how "doc at a time" vs "query at a
        time" (which is also Solr's approach) compare. But, still, the fact
        that this "query at a time" approach enables concurrency is a big win.

        I wonder if we could absorb ParallelDrillSideways under
        DrillSideways such that if you pass an executor it uses the
        concurrent implementation? It's really an implementation/execution
        detail I think? Similar to how IndexSearcher takes an optional
        executor.

        Show
        mikemccand Michael McCandless added a comment - Thanks Emmanuel Keller : this is an impressive change! Can you add a minimal javadocs to ParallelDrillSideways , and include @lucene.experimental ? Can you fix the indent to 2 spaces, and change your IDE to not use wildcard imports? (Most of the new classes seem to do so, but at least one didn't). Or we can fix this up before pushing... Should CallableCollector be renamed to CallableCollectorManager ? I assume you're using this for your QWAZR search server built on lucene ( https://github.com/qwazr/QWAZR)? Thank you for giving back! There are quite a few new abstractions here, MultiCollectorManager , FacetsCollectorManager ; must they be public? Can you explain what they do? It seems like this change opens up concurrency in 2 ways; the first way is it uses the IndexSearcher.search API that takes a CollectorManager such that if you had created that IndexSearcher with an executor, you get concurrency across the segments in the index. In general I'm not a huge fan of this concurrency since you are at the whim of how the segments are structured, and, confusingly, running forceMerge(1) on your index removes all concurrency. But it's better than nothing: progress not perfection! The second way is that the new ParallelDrillSideways takes its own executor and then runs the N DrillDown queries concurrently (to compute the sideways counts), which is very different from the current doc-at-a-time computation. Have you compared the performance, using a single thread? ... I'm curious how "doc at a time" vs "query at a time" (which is also Solr's approach) compare. But, still, the fact that this "query at a time" approach enables concurrency is a big win. I wonder if we could absorb ParallelDrillSideways under DrillSideways such that if you pass an executor it uses the concurrent implementation? It's really an implementation/execution detail I think? Similar to how IndexSearcher takes an optional executor.
        Hide
        ekeller Emmanuel Keller added a comment -

        Can you add a minimal javadocs to ParallelDrillSideways, and include @lucene.experimental?

        Done.

        Can you fix the indent to 2 spaces, and change your IDE to not use

        wildcard imports? (Most of the new classes seem to do so, but at
        least one didn't). Or we can fix this up before pushing...

        Done.

        Should CallableCollector be renamed to CallableCollectorManager?

        True, done.

        I assume you're using this for your QWAZR search server built on lucene (https://github.com/qwazr/QWAZR)? Thank you for giving back!

        With pleasure. I think there is few more contributions to come...

        There are quite a few new abstractions here, MultiCollectorManager, FacetsCollectorManager; must they be public? Can you explain what they do?

        MultiCollectorManager do with CollectorManager what MultiCollector do with Collector. It wraps a set of CollectorManager as it was only one.

        It seems like this change opens up concurrency in 2 ways; the first
        way is it uses the IndexSearcher.search API that takes a
        CollectorManager such that if you had created that
        IndexSearcher with an executor, you get concurrency across the
        segments in the index. In general I'm not a huge fan of this
        concurrency since you are at the whim of how the segments are
        structured, and, confusingly, running forceMerge(1) on your index
        removes all concurrency. But it's better than nothing: progress not
        perfection!

        I agree. That's a first step.

        The second way is that the new ParallelDrillSideways takes its own
        executor and then runs the N DrillDown queries concurrently (to
        compute the sideways counts), which is very different from the current
        doc-at-a-time computation. Have you compared the performance, using a
        single thread? ... I'm curious how "doc at a time" vs "query at a
        time" (which is also Solr's approach) compare. But, still, the fact
        that this "query at a time" approach enables concurrency is a big win.

        I am working on providing a benchmark. What is the good practice for Lucene ? It it okay to provide a benchmark as a test case ?

        I wonder if we could absorb ParallelDrillSideways under
        DrillSideways such that if you pass an executor it uses the
        concurrent implementation? It's really an implementation/execution
        detail I think? Similar to how IndexSearcher takes an optional
        executor.

        I agree. I think that it is the way it should be. I don't wanted to be too intrusive.

        Show
        ekeller Emmanuel Keller added a comment - Can you add a minimal javadocs to ParallelDrillSideways, and include @lucene.experimental? Done. Can you fix the indent to 2 spaces, and change your IDE to not use wildcard imports? (Most of the new classes seem to do so, but at least one didn't). Or we can fix this up before pushing... Done. Should CallableCollector be renamed to CallableCollectorManager? True, done. I assume you're using this for your QWAZR search server built on lucene ( https://github.com/qwazr/QWAZR)? Thank you for giving back! With pleasure. I think there is few more contributions to come... There are quite a few new abstractions here, MultiCollectorManager, FacetsCollectorManager; must they be public? Can you explain what they do? MultiCollectorManager do with CollectorManager what MultiCollector do with Collector. It wraps a set of CollectorManager as it was only one. It seems like this change opens up concurrency in 2 ways; the first way is it uses the IndexSearcher.search API that takes a CollectorManager such that if you had created that IndexSearcher with an executor, you get concurrency across the segments in the index. In general I'm not a huge fan of this concurrency since you are at the whim of how the segments are structured, and, confusingly, running forceMerge(1) on your index removes all concurrency. But it's better than nothing: progress not perfection! I agree. That's a first step. The second way is that the new ParallelDrillSideways takes its own executor and then runs the N DrillDown queries concurrently (to compute the sideways counts), which is very different from the current doc-at-a-time computation. Have you compared the performance, using a single thread? ... I'm curious how "doc at a time" vs "query at a time" (which is also Solr's approach) compare. But, still, the fact that this "query at a time" approach enables concurrency is a big win. I am working on providing a benchmark. What is the good practice for Lucene ? It it okay to provide a benchmark as a test case ? I wonder if we could absorb ParallelDrillSideways under DrillSideways such that if you pass an executor it uses the concurrent implementation? It's really an implementation/execution detail I think? Similar to how IndexSearcher takes an optional executor. I agree. I think that it is the way it should be. I don't wanted to be too intrusive.
        Hide
        ekeller Emmanuel Keller added a comment -

        New patch with the following changes:

        • Fixes copyright and indentations issues
        • FacetCollectorManager is no more public.
        • MultiCollectorManager moved to the right package: org.apache.lucene.search
        • Add many Javadoc
        Show
        ekeller Emmanuel Keller added a comment - New patch with the following changes: Fixes copyright and indentations issues FacetCollectorManager is no more public. MultiCollectorManager moved to the right package: org.apache.lucene.search Add many Javadoc
        Hide
        mikemccand Michael McCandless added a comment -

        I am working on providing a benchmark. What is the good practice for Lucene ? It it okay to provide a benchmark as a test case ?

        We don't usually do benchmarks as test cases; we could e.g. push the benchmark sources to https://github.com/mikemccand/luceneutil which holds various random Lucene benchmarking utilities. Or if you just have some simple results to share w/o having the full source code to share, that's better than nothing too

        Hmm it looks like nothing is testing the new ParallelDrillSideways.search?

        I wonder if we could absorb ParallelDrillSideways under DrillSideways such that if you pass an executor it uses the concurrent implementation? It's really an implementation/execution detail I think? Similar to how IndexSearcher takes an optional executor.

        I agree. I think that it is the way it should be. I don't wanted to be too intrusive.

        Maybe we could just add another ctor to DrillSideways taking all
        the current arguments, plus an executor? I.e.:

          public DrillSideways(IndexSearcher searcher, FacetsConfig config, TaxonomyReader taxoReader, SortedSetDocValuesReaderState state, ExecutorService executor) {
            ...
          }
        

        Then, in the DrillSideways.search method, if executor is non-null,
        we invoke the concurrent version (ParallelDrillSideways.search
        from your patch) internally, as a private method?

        Show
        mikemccand Michael McCandless added a comment - I am working on providing a benchmark. What is the good practice for Lucene ? It it okay to provide a benchmark as a test case ? We don't usually do benchmarks as test cases; we could e.g. push the benchmark sources to https://github.com/mikemccand/luceneutil which holds various random Lucene benchmarking utilities. Or if you just have some simple results to share w/o having the full source code to share, that's better than nothing too Hmm it looks like nothing is testing the new ParallelDrillSideways.search ? I wonder if we could absorb ParallelDrillSideways under DrillSideways such that if you pass an executor it uses the concurrent implementation? It's really an implementation/execution detail I think? Similar to how IndexSearcher takes an optional executor. I agree. I think that it is the way it should be. I don't wanted to be too intrusive. Maybe we could just add another ctor to DrillSideways taking all the current arguments, plus an executor? I.e.: public DrillSideways(IndexSearcher searcher, FacetsConfig config, TaxonomyReader taxoReader, SortedSetDocValuesReaderState state, ExecutorService executor) { ... } Then, in the DrillSideways.search method, if executor is non-null, we invoke the concurrent version ( ParallelDrillSideways.search from your patch) internally, as a private method?
        Hide
        ekeller Emmanuel Keller added a comment - - edited

        New patch:
        1. In the DrillSideways.search method, if executor is non-null, we invoke the concurrent version.
        2. The unit test tests effectively the new concurrent methods.

        I work on the benchmark now. Michael McCandless I will submit a new bench to your repo luceneutils.

        Show
        ekeller Emmanuel Keller added a comment - - edited New patch: 1. In the DrillSideways.search method, if executor is non-null, we invoke the concurrent version. 2. The unit test tests effectively the new concurrent methods. I work on the benchmark now. Michael McCandless I will submit a new bench to your repo luceneutils.
        Hide
        mikemccand Michael McCandless added a comment -

        Thanks Emmanuel Keller, this looks great!

        Can we maybe simplify the generics and hard-wire R to TopDocs always?

        Show
        mikemccand Michael McCandless added a comment - Thanks Emmanuel Keller , this looks great! Can we maybe simplify the generics and hard-wire R to TopDocs always?
        Hide
        ekeller Emmanuel Keller added a comment -

        One using the search method while providing a CollectorManager expects to be able to extract the result.

             public <R> ConcurrentDrillSidewaysResult<R> search(final DrillDownQuery query,
                      final CollectorManager<?, R> hitCollectorManager) throws IOException {}
        

        In the meantime, I forgot to expose the new result class "ConcurrentDrillSidewaysResult<R>".
        It is fixed now, I submit the new patch.

        Show
        ekeller Emmanuel Keller added a comment - One using the search method while providing a CollectorManager expects to be able to extract the result. public <R> ConcurrentDrillSidewaysResult<R> search( final DrillDownQuery query, final CollectorManager<?, R> hitCollectorManager) throws IOException {} In the meantime, I forgot to expose the new result class "ConcurrentDrillSidewaysResult<R>". It is fixed now, I submit the new patch.
        Hide
        mikemccand Michael McCandless added a comment -

        Thanks Emmanuel Keller, I'll have a look likely once I'm back from vacation next year.

        Show
        mikemccand Michael McCandless added a comment - Thanks Emmanuel Keller , I'll have a look likely once I'm back from vacation next year.
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit b4a002f7d88a2383852e2bfd95b39bf7f6e33f2f in lucene-solr's branch refs/heads/master from Mike McCandless
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=b4a002f ]

        LUCENE-7588: DrillSideways can now run its queries concurrently

        Show
        jira-bot ASF subversion and git services added a comment - Commit b4a002f7d88a2383852e2bfd95b39bf7f6e33f2f in lucene-solr's branch refs/heads/master from Mike McCandless [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=b4a002f ] LUCENE-7588 : DrillSideways can now run its queries concurrently
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 5b6401b212da883188f45709d1f68addbbdf2c98 in lucene-solr's branch refs/heads/branch_6x from Mike McCandless
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5b6401b ]

        LUCENE-7588: DrillSideways can now run its queries concurrently

        Show
        jira-bot ASF subversion and git services added a comment - Commit 5b6401b212da883188f45709d1f68addbbdf2c98 in lucene-solr's branch refs/heads/branch_6x from Mike McCandless [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5b6401b ] LUCENE-7588 : DrillSideways can now run its queries concurrently
        Hide
        mikemccand Michael McCandless added a comment -

        Thank you Emmanuel Keller!

        Show
        mikemccand Michael McCandless added a comment - Thank you Emmanuel Keller !
        Hide
        mikemccand Michael McCandless added a comment -

        Hmm the ES jenkins caught this failure:

         NOTE: reproduce with: ant test  -Dtestcase=TestParallelDrillSideways -Dtests.method=testRandom -Dtests.seed=734B3451E1B6F47B -Dtests.slow=true -Dtests.locale=ar-BH -Dtests.timezone=America/North_Dakota/Center -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
           [junit4] FAILURE 1.87s J2 | TestParallelDrillSideways.testRandom <<<
           [junit4]    > Throwable #1: org.junit.ComparisonFailure: expected:<1[00]4> but was:<1[]4>
           [junit4]    > 	at __randomizedtesting.SeedInfo.seed([734B3451E1B6F47B:107115E50D64208]:0)
           [junit4]    > 	at org.apache.lucene.facet.TestDrillSideways.verifyEquals(TestDrillSideways.java:1034)
           [junit4]    > 	at org.apache.lucene.facet.TestDrillSideways.testRandom(TestDrillSideways.java:818)
           [junit4]    > 	at java.lang.Thread.run(Thread.java:745)
        

        And it does repro for me on the current (rev 7ae9ca85d9d920db353d3d080b0cb36567e206b2) branch_6x head. Emmanuel Keller any ideas?

        Show
        mikemccand Michael McCandless added a comment - Hmm the ES jenkins caught this failure: NOTE: reproduce with: ant test -Dtestcase=TestParallelDrillSideways -Dtests.method=testRandom -Dtests.seed=734B3451E1B6F47B -Dtests.slow=true -Dtests.locale=ar-BH -Dtests.timezone=America/North_Dakota/Center -Dtests.asserts=true -Dtests.file.encoding=US-ASCII [junit4] FAILURE 1.87s J2 | TestParallelDrillSideways.testRandom <<< [junit4] > Throwable #1: org.junit.ComparisonFailure: expected:<1[00]4> but was:<1[]4> [junit4] > at __randomizedtesting.SeedInfo.seed([734B3451E1B6F47B:107115E50D64208]:0) [junit4] > at org.apache.lucene.facet.TestDrillSideways.verifyEquals(TestDrillSideways.java:1034) [junit4] > at org.apache.lucene.facet.TestDrillSideways.testRandom(TestDrillSideways.java:818) [junit4] > at java.lang.Thread.run(Thread.java:745) And it does repro for me on the current (rev 7ae9ca85d9d920db353d3d080b0cb36567e206b2) branch_6x head. Emmanuel Keller any ideas?
        Hide
        ekeller Emmanuel Keller added a comment -

        Ok, I am able to reproduce the failure on my own environment. I try to fix that now.

           [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestParallelDrillSideways -Dtests.method=testRandom -Dtests.seed=734B3451E1B6F47B -Dtests.slow=true -Dtests.locale=ar-BH -Dtests.timezone=America/North_Dakota/Center -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
           [junit4] FAILURE 3.50s | TestParallelDrillSideways.testRandom <<<
           [junit4]    > Throwable #1: org.junit.ComparisonFailure: expected:<1[00]4> but was:<1[]4>
           [junit4]    > 	at __randomizedtesting.SeedInfo.seed([734B3451E1B6F47B:107115E50D64208]:0)
           [junit4]    > 	at org.apache.lucene.facet.TestDrillSideways.verifyEquals(TestDrillSideways.java:1036)
           [junit4]    > 	at org.apache.lucene.facet.TestDrillSideways.testRandom(TestDrillSideways.java:820)
           [junit4]    > 	at java.lang.Thread.run(Thread.java:745)
           [junit4]   2> Jan 06, 2017 6:09:15 PM com.carrotsearch.randomizedtesting.ThreadLeakControl checkThreadLeaks
           [junit4]   2> WARNING: Will linger awaiting termination of 1 leaked thread(s).
           [junit4]   2> Jan 06, 2017 6:09:35 PM com.carrotsearch.randomizedtesting.ThreadLeakControl checkThreadLeaks
           [junit4]   2> SEVERE: 1 thread leaked from SUITE scope at org.apache.lucene.facet.TestParallelDrillSideways: 
           [junit4]   2>    1) Thread[id=17, name=LuceneTestCase-1-thread-1, state=WAITING, group=TGRP-TestParallelDrillSideways]
           [junit4]   2>         at sun.misc.Unsafe.park(Native Method)
           [junit4]   2>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
           [junit4]   2>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
           [junit4]   2>         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
           [junit4]   2>         at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
           [junit4]   2>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
           [junit4]   2>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
           [junit4]   2>         at java.lang.Thread.run(Thread.java:745)
           [junit4]   2> Jan 06, 2017 6:09:35 PM com.carrotsearch.randomizedtesting.ThreadLeakControl tryToInterruptAll
           [junit4]   2> INFO: Starting to interrupt leaked threads:
           [junit4]   2>    1) Thread[id=17, name=LuceneTestCase-1-thread-1, state=WAITING, group=TGRP-TestParallelDrillSideways]
           [junit4]   2> Jan 06, 2017 6:09:38 PM com.carrotsearch.randomizedtesting.ThreadLeakControl tryToInterruptAll
           [junit4]   2> SEVERE: There are still zombie threads that couldn't be terminated:
           [junit4]   2>    1) Thread[id=17, name=LuceneTestCase-1-thread-1, state=WAITING, group=TGRP-TestParallelDrillSideways]
           [junit4]   2>         at sun.misc.Unsafe.park(Native Method)
           [junit4]   2>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
           [junit4]   2>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
           [junit4]   2>         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
           [junit4]   2>         at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
           [junit4]   2>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
           [junit4]   2>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
           [junit4]   2>         at java.lang.Thread.run(Thread.java:745)
        
        
        Show
        ekeller Emmanuel Keller added a comment - Ok, I am able to reproduce the failure on my own environment. I try to fix that now. [junit4] 2> NOTE: reproduce with: ant test -Dtestcase=TestParallelDrillSideways -Dtests.method=testRandom -Dtests.seed=734B3451E1B6F47B -Dtests.slow=true -Dtests.locale=ar-BH -Dtests.timezone=America/North_Dakota/Center -Dtests.asserts=true -Dtests.file.encoding=US-ASCII [junit4] FAILURE 3.50s | TestParallelDrillSideways.testRandom <<< [junit4] > Throwable #1: org.junit.ComparisonFailure: expected:<1[00]4> but was:<1[]4> [junit4] > at __randomizedtesting.SeedInfo.seed([734B3451E1B6F47B:107115E50D64208]:0) [junit4] > at org.apache.lucene.facet.TestDrillSideways.verifyEquals(TestDrillSideways.java:1036) [junit4] > at org.apache.lucene.facet.TestDrillSideways.testRandom(TestDrillSideways.java:820) [junit4] > at java.lang.Thread.run(Thread.java:745) [junit4] 2> Jan 06, 2017 6:09:15 PM com.carrotsearch.randomizedtesting.ThreadLeakControl checkThreadLeaks [junit4] 2> WARNING: Will linger awaiting termination of 1 leaked thread(s). [junit4] 2> Jan 06, 2017 6:09:35 PM com.carrotsearch.randomizedtesting.ThreadLeakControl checkThreadLeaks [junit4] 2> SEVERE: 1 thread leaked from SUITE scope at org.apache.lucene.facet.TestParallelDrillSideways: [junit4] 2> 1) Thread[id=17, name=LuceneTestCase-1-thread-1, state=WAITING, group=TGRP-TestParallelDrillSideways] [junit4] 2> at sun.misc.Unsafe.park(Native Method) [junit4] 2> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) [junit4] 2> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) [junit4] 2> at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) [junit4] 2> at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067) [junit4] 2> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127) [junit4] 2> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [junit4] 2> at java.lang.Thread.run(Thread.java:745) [junit4] 2> Jan 06, 2017 6:09:35 PM com.carrotsearch.randomizedtesting.ThreadLeakControl tryToInterruptAll [junit4] 2> INFO: Starting to interrupt leaked threads: [junit4] 2> 1) Thread[id=17, name=LuceneTestCase-1-thread-1, state=WAITING, group=TGRP-TestParallelDrillSideways] [junit4] 2> Jan 06, 2017 6:09:38 PM com.carrotsearch.randomizedtesting.ThreadLeakControl tryToInterruptAll [junit4] 2> SEVERE: There are still zombie threads that couldn't be terminated: [junit4] 2> 1) Thread[id=17, name=LuceneTestCase-1-thread-1, state=WAITING, group=TGRP-TestParallelDrillSideways] [junit4] 2> at sun.misc.Unsafe.park(Native Method) [junit4] 2> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) [junit4] 2> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) [junit4] 2> at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) [junit4] 2> at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067) [junit4] 2> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127) [junit4] 2> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [junit4] 2> at java.lang.Thread.run(Thread.java:745)
        Hide
        mikemccand Michael McCandless added a comment -
        Show
        mikemccand Michael McCandless added a comment - Thanks Emmanuel Keller !
        Hide
        ekeller Emmanuel Keller added a comment - - edited

        Both actual array and expected array contains 24 documents. But not equally sorted.

        The test expects that the retrieved ScoreDoc array is ordered. However the scores are identical for all documents.

        As we are using a multithreaded map/reduce design we can't expect that the order will be preserved.
        Michael McCandless am I right ?

        IMHO, the equality check must be modified to only check that each document are present and equals.

        Here is the current check test for the ScoreDoc array:

            for (int i = 0; i < expected.hits.size(); i++) {
              if (VERBOSE) {
                System.out.println("    hit " + i + " expected=" + expected.hits.get(i).id);
              }
              assertEquals(expected.hits.get(i).id, s.doc(actual.hits.scoreDocs[i].doc).get("id"));
              // Score should be IDENTICAL:
              assertEquals(scores.get(expected.hits.get(i).id), actual.hits.scoreDocs[i].score, 0.0f);
            }
        
        Show
        ekeller Emmanuel Keller added a comment - - edited Both actual array and expected array contains 24 documents. But not equally sorted. The test expects that the retrieved ScoreDoc array is ordered. However the scores are identical for all documents. As we are using a multithreaded map/reduce design we can't expect that the order will be preserved. Michael McCandless am I right ? IMHO, the equality check must be modified to only check that each document are present and equals. Here is the current check test for the ScoreDoc array: for ( int i = 0; i < expected.hits.size(); i++) { if (VERBOSE) { System .out.println( " hit " + i + " expected=" + expected.hits.get(i).id); } assertEquals(expected.hits.get(i).id, s.doc(actual.hits.scoreDocs[i].doc).get( "id" )); // Score should be IDENTICAL: assertEquals(scores.get(expected.hits.get(i).id), actual.hits.scoreDocs[i].score, 0.0f); }
        Hide
        ekeller Emmanuel Keller added a comment - - edited

        This patch changes the verifyEquals behaviour. It checks that the documents are present and that they are equals, regardless the order.

        Show
        ekeller Emmanuel Keller added a comment - - edited This patch changes the verifyEquals behaviour. It checks that the documents are present and that they are equals, regardless the order.
        Hide
        mikemccand Michael McCandless added a comment -

        Hmm, but in testRandom we seem to always sort by id (a unique field for each document) for each search?

        So, regardless of using a single thread for the search, or doing the map/reduce w/ N threads and merging with TopDocs.merge, the result order should have been identical, I think?

        Show
        mikemccand Michael McCandless added a comment - Hmm, but in testRandom we seem to always sort by id (a unique field for each document) for each search? So, regardless of using a single thread for the search, or doing the map/reduce w/ N threads and merging with TopDocs.merge , the result order should have been identical, I think?
        Hide
        mikemccand Michael McCandless added a comment -

        Oh I think I may see the issue! When we call TopDocs.merge in the DrillSideways#search that takes a sort, we are failing to pass that sort on to TopDocs.merge ... so it's not merge-sorting in the correct order?

        Show
        mikemccand Michael McCandless added a comment - Oh I think I may see the issue! When we call TopDocs.merge in the DrillSideways#search that takes a sort, we are failing to pass that sort on to TopDocs.merge ... so it's not merge-sorting in the correct order?
        Hide
        ekeller Emmanuel Keller added a comment -

        That's the issue. Just fixed, all tests are again ok now. I upload the patch.

        Show
        ekeller Emmanuel Keller added a comment - That's the issue. Just fixed, all tests are again ok now. I upload the patch.
        Hide
        mikemccand Michael McCandless added a comment -

        Thank you Emmanuel Keller, the patch looks great, and fixes the failing seed! I'll push shortly...

        Show
        mikemccand Michael McCandless added a comment - Thank you Emmanuel Keller , the patch looks great, and fixes the failing seed! I'll push shortly...
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 1aa9c4251289e71ab8e87b03797b20f4a8fda0a5 in lucene-solr's branch refs/heads/master from Mike McCandless
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=1aa9c42 ]

        LUCENE-7588: the parallell search method was failing to pass on the user's requested sort when merge-sorting all hits

        Show
        jira-bot ASF subversion and git services added a comment - Commit 1aa9c4251289e71ab8e87b03797b20f4a8fda0a5 in lucene-solr's branch refs/heads/master from Mike McCandless [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=1aa9c42 ] LUCENE-7588 : the parallell search method was failing to pass on the user's requested sort when merge-sorting all hits
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 373826a69bda27e181eae063abca658798d42cb6 in lucene-solr's branch refs/heads/branch_6x from Mike McCandless
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=373826a ]

        LUCENE-7588: the parallell search method was failing to pass on the user's requested sort when merge-sorting all hits

        Show
        jira-bot ASF subversion and git services added a comment - Commit 373826a69bda27e181eae063abca658798d42cb6 in lucene-solr's branch refs/heads/branch_6x from Mike McCandless [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=373826a ] LUCENE-7588 : the parallell search method was failing to pass on the user's requested sort when merge-sorting all hits

          People

          • Assignee:
            Unassigned
            Reporter:
            ekeller Emmanuel Keller
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development