SOLR-10063: CoreContainer shutdown has race condition that can cause a hang on shutdown.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 6.5, 7.0
    • Component/s: None
    • Security Level: Public (Default Security Level. Issues are Public)
    • Labels:
      None
    • Attachments: im-patch-1.png (135 kB, Mark Miller)

        Activity

        cpoerschke Christine Poerschke added a comment -

        Hi Mark, could you share the exact beast command(s) you used for the SOLR-10032 report? I just ran

        ant beast -Dbeast.iters=100 -Dtestcase=TestShardHandlerFactory
        

        here for TestShardHandlerFactory and it was fine. The test itself seems very uninteresting, so I am wondering whether the failure comes from an interaction between tests, or how to beast more aggressively. Thanks.

        markrmiller@gmail.com Mark Miller added a comment -

        This is due to a race on shutdown. On shutdown, the code breaks any waits with an interruptAll and then assumes all is well, but a thread can enter a wait right after the interruptAll. A workaround is:

        markrmiller@gmail.com Mark Miller added a comment -

        ant beast -Dbeast.iters=100 -Dtestcase=TestShardHandlerFactory

        I think I remember you figured out the parallel argument in another thread.

        With my beast script, I would see this about one in 60 runs doing 10 in parallel for 30 total iterations.

        The above prevents the hang so far in my test beasting, but it's not really foolproof either.
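
        For illustration only, here is a minimal Java sketch of one general way to close the window described earlier in this thread, where a thread starts waiting just after the shutdown interrupt. It is not the patch committed for this issue and does not use Solr's classes; `ShutdownRaceSketch`, `Waiter`, `awaitClose`, and `close` are hypothetical names. The idea is that the waiter re-checks a `closed` flag under the same lock that `close()` takes, so a late arrival sees the flag and never blocks.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; none of these names come from Solr.
final class ShutdownRaceSketch {

    /** Stand-in for a component whose threads block until shutdown. */
    static final class Waiter {
        private final Object lock = new Object();
        private boolean closed = false; // guarded by lock

        /** Blocks until close() is called; returns at once if already closed. */
        void awaitClose() throws InterruptedException {
            synchronized (lock) {
                // The flag is checked under the lock, so a thread that arrives
                // after close() sees closed == true and never starts waiting.
                while (!closed) {
                    lock.wait();
                }
            }
        }

        /** Marks the component closed and wakes every current waiter. */
        void close() {
            synchronized (lock) {
                closed = true;
                lock.notifyAll();
            }
        }
    }

    /** Returns true if a thread that starts waiting only after close() still finishes. */
    static boolean lateWaiterTerminates() throws InterruptedException {
        Waiter waiter = new Waiter();
        waiter.close(); // shutdown happens before the waiter arrives

        CountDownLatch done = new CountDownLatch(1);
        Thread late = new Thread(() -> {
            try {
                waiter.awaitClose();
                done.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        late.setDaemon(true); // do not keep the JVM alive if the sketch is wrong
        late.start();

        // A bounded wait keeps the check itself from hanging.
        return done.await(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("late waiter terminated: " + lateWaiterTerminates());
    }
}
```

        With a one-shot interrupt alone, the late thread in `lateWaiterTerminates` would block forever, because the interrupt has already been delivered by the time it starts waiting. A persistent flag checked under the lock does not depend on that ordering.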

        jira-bot ASF subversion and git services added a comment -

        Commit 5b976959d9ed84509dbc2724d89bad0142436b22 in lucene-solr's branch refs/heads/master from markrmiller
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5b97695 ]

        SOLR-10063: CoreContainer shutdown has race condition that can cause a hang on shutdown.

        jira-bot ASF subversion and git services added a comment -

        Commit 16d0ee3a4c7476224ec2b93e366b776c0d24d0b1 in lucene-solr's branch refs/heads/branch_6x from markrmiller
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=16d0ee3 ]

        SOLR-10063: CoreContainer shutdown has race condition that can cause a hang on shutdown.

        Conflicts:
            solr/CHANGES.txt

          People

          • Assignee:
            markrmiller@gmail.com Mark Miller
          • Reporter:
            markrmiller@gmail.com Mark Miller
          • Votes:
            0
          • Watchers:
            3
