SOLR-7503

Recovery after ZK session expiration happens in a single thread for all cores in a node

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.1
    • Fix Version/s: 5.2, 6.0
    • Component/s: SolrCloud

      Description

      Currently cores are registered in parallel in an executor. However, when there's a ZK expiration, the recovery, which also happens in the register call, happens in a single thread:

      https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/cloud/ZkController.java#L300

      We should make these happen in parallel as well so that recovery after ZK expiration doesn't take forever.

      Thanks to Jessica Cheng Mallet for catching this.
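
      To illustrate the proposal above, here is a minimal sketch of registering every core on a shared executor instead of looping over them on one thread. It is not the Solr code: the class `ParallelRegisterSketch`, the methods `register` and `registerAll`, the pool size, and the simulated delay are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelRegisterSketch {

    // Hypothetical stand-in for the per-core register call, which may also run recovery.
    static String register(String coreName) throws InterruptedException {
        Thread.sleep(50); // simulate slow recovery work
        return coreName + ":registered";
    }

    // Submit one task per core so that a slow recovery does not hold up the other cores.
    static List<String> registerAll(List<String> coreNames, ExecutorService executor) {
        List<Future<String>> pending = new ArrayList<>();
        for (String name : coreNames) {
            Callable<String> task = () -> register(name);
            pending.add(executor.submit(task));
        }
        List<String> results = new ArrayList<>();
        try {
            for (Future<String> future : pending) {
                results.add(future.get()); // block until this core has finished
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for registration", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("registration failed", e.getCause());
        }
        return results;
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            System.out.println(registerAll(Arrays.asList("core1", "core2", "core3"), executor));
        } finally {
            executor.shutdown();
        }
    }
}
```

      With a pool of at least as many threads as cores, the total wait is roughly the time of the slowest single registration rather than the sum of all of them.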

      1. SOLR-7503.patch
        4 kB
        Timothy Potter

        Activity

        Timothy Potter added a comment -

        Cooking up a patch now.

        Timothy Potter added a comment -

        Simple patch that registers cores in the background after ZK session expiration. I had to add some getter methods for the ExecutorService in the ZkContainer so that it is available to the ZkController when needed (iff cc is not null). I didn't want to use a new ExecutorService since the one set up by ZkContainer seemed most appropriate for this work, but you can't expose ZkContainer directly in ZkController because it's only a server-side thing.
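
        The wiring described in this comment can be sketched roughly as follows: a container object owns the thread pool and exposes it through a getter, and the controller uses that pool only when it has a container reference. This is not the patch itself. The names `SharedExecutorSketch`, `Container`, `Controller`, `getRegisterExecutor`, and `onReconnect` are hypothetical, and registering on the calling thread when the container is null is an assumed fallback.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class SharedExecutorSketch {

    // Hypothetical stand-in for the server-side container that owns the thread pool.
    static final class Container {
        private final ExecutorService registerExecutor = Executors.newFixedThreadPool(4);

        ExecutorService getRegisterExecutor() {
            return registerExecutor;
        }

        // Stop accepting work and wait briefly for queued registrations to finish.
        boolean shutdownAndWait() {
            registerExecutor.shutdown();
            try {
                return registerExecutor.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    // Hypothetical stand-in for the controller, which may exist without a container.
    static final class Controller {
        private final Container cc; // null when no server-side container is present
        final List<String> registered = new CopyOnWriteArrayList<>();

        Controller(Container cc) {
            this.cc = cc;
        }

        // Called after the session is re-established; re-registers every core.
        void onReconnect(List<String> coreNames) {
            ExecutorService executor = (cc != null) ? cc.getRegisterExecutor() : null;
            for (String name : coreNames) {
                if (executor != null) {
                    executor.execute(() -> registered.add(name)); // background registration
                } else {
                    registered.add(name); // assumed fallback: register on the calling thread
                }
            }
        }
    }

    public static void main(String[] args) {
        Container cc = new Container();
        Controller controller = new Controller(cc);
        controller.onReconnect(Arrays.asList("core1", "core2"));
        cc.shutdownAndWait();
        System.out.println(controller.registered.size());
    }
}
```

        Reusing the container's pool keeps a single place responsible for sizing and shutting down the threads, which matches the stated reason for not creating a second executor.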

        Shalin Shekhar Mangar added a comment -

        +1 LGTM

        Mark Miller added a comment -

        +1

        ASF subversion and git services added a comment -

        Commit 1679607 from Timothy Potter in branch 'dev/trunk'
        [ https://svn.apache.org/r1679607 ]

        SOLR-7503: Recovery after ZK session expiration should happen in parallel for all cores using the thread-pool managed by ZkContainer instead of a single thread.

        ASF subversion and git services added a comment -

        Commit 1679609 from Timothy Potter in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1679609 ]

        SOLR-7503: Recovery after ZK session expiration should happen in parallel for all cores using the thread-pool managed by ZkContainer instead of a single thread.

        Shalin Shekhar Mangar added a comment -

        This should go under the "Bug fixes" section of the CHANGES.txt instead of "Other changes"

        ASF subversion and git services added a comment -

        Commit 1680106 from Timothy Potter in branch 'dev/trunk'
        [ https://svn.apache.org/r1680106 ]

        SOLR-7503: move changes text to bug fixes section was in other changes

        ASF subversion and git services added a comment -

        Commit 1680107 from Timothy Potter in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1680107 ]

        SOLR-7503: move changes text to bug fixes section was in other changes

        Anshum Gupta added a comment -

        Bulk close for 5.2.0.


          People

          • Assignee: Timothy Potter
          • Reporter: Shalin Shekhar Mangar
          • Votes: 1
          • Watchers: 7
