SOLR-6665

ZkController.publishAndWaitForDownStates should not use core name

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 4.10.1, 4.10.4, 5.1
    • Fix Version/s: 5.2, 6.0
    • Component/s: SolrCloud
    • Labels:
      None

      Description

      ZkController.publishAndWaitForDownStates uses a List<String> to keep track of all core names that have been published as down. It should use a set of coreNodeNames instead of core names for correctness.
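      The difference between the two ways of tracking can be sketched as follows. This is a simplified illustration, not the actual Solr code: the `Replica` class, both check methods, and the node and core names are hypothetical, and the sketch assumes each coreNodeName identifies exactly one replica.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model; these are not Solr's real classes.
class DownStateSketch {

  /** One replica entry as it might appear in the cluster state. */
  static final class Replica {
    final String coreNodeName; // assumed to identify exactly one replica in this sketch
    final String coreName;     // may repeat on other nodes
    final String nodeName;     // node hosting the replica
    final String state;        // "active", "down", ...

    Replica(String coreNodeName, String coreName, String nodeName, String state) {
      this.coreNodeName = coreNodeName;
      this.coreName = coreName;
      this.nodeName = nodeName;
      this.state = state;
    }
  }

  /** Flawed check: a local core counts as down if ANY replica with that core name is down. */
  static boolean allDownByCoreName(List<Replica> clusterState, List<String> localCoreNames) {
    for (String coreName : localCoreNames) {
      boolean foundDown = false;
      for (Replica r : clusterState) {
        if (r.coreName.equals(coreName) && "down".equals(r.state)) {
          foundDown = true;
          break;
        }
      }
      if (!foundDown) {
        return false;
      }
    }
    return true;
  }

  /** Stricter check: every local coreNodeName must itself be reported as down. */
  static boolean allDownByCoreNodeName(List<Replica> clusterState, Set<String> localCoreNodeNames) {
    Set<String> pending = new HashSet<>(localCoreNodeNames);
    for (Replica r : clusterState) {
      if ("down".equals(r.state)) {
        pending.remove(r.coreNodeName);
      }
    }
    return pending.isEmpty();
  }

  /** Two replicas share a core name; only the remote one is down. */
  static List<Replica> exampleState() {
    return Arrays.asList(
        new Replica("core_node1", "collection1", "nodeA:8983_solr", "active"), // local, not yet down
        new Replica("core_node2", "collection1", "nodeB:8983_solr", "down"));  // remote, same core name
  }

  public static void main(String[] args) {
    List<Replica> state = exampleState();
    // The name-based check is satisfied by the remote replica.
    System.out.println(allDownByCoreName(state, Arrays.asList("collection1")));
    // The coreNodeName-based check still waits for the local replica.
    System.out.println(allDownByCoreNodeName(state, new HashSet<>(Arrays.asList("core_node1"))));
  }
}
```

      In this example the name-based check returns true even though the local replica is still active, while the coreNodeName-based check returns false.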

      1. SOLR-6665.patch
        7 kB
        Shalin Shekhar Mangar

          Activity

          shalinmangar Shalin Shekhar Mangar added a comment -

          Fix and test to demonstrate the bug.

          Since the current code uses a core name and doesn't check for the node name in the verification phase, it can easily be fooled if a replica with the same core name exists on a different node. The test asserts that this method times out. It makes the test always take at least 60s (the timeout value for the method) but I can't find a better way.
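          The cost described above, a test that must sit through the full timeout to observe it, is typical of poll-until-timeout loops. A minimal sketch of such a loop follows. It is not the Solr method: `WaitSketch`, `waitFor`, and its parameters are hypothetical, and the timeout is made a parameter only so the behaviour can be shown with a short wait.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper; not part of Solr.
class WaitSketch {

  /**
   * Polls the condition until it holds or the timeout elapses.
   * Returns true if the condition became true, false on timeout or interruption.
   */
  static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false; // timed out without the condition holding
      }
      try {
        Thread.sleep(pollMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for callers
        return false;
      }
    }
  }

  public static void main(String[] args) {
    // A condition that never holds makes the call last roughly the whole timeout.
    long start = System.nanoTime();
    boolean result = waitFor(() -> false, 200, 20);
    long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
    System.out.println("result=" + result + ", elapsedMs=" + elapsedMs);
  }
}
```

          A test asserting the timeout path of a loop like this always pays the full timeout, which with a fixed 60s value gives the 60s minimum mentioned in the comment.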

          jira-bot ASF subversion and git services added a comment -

          Commit 1675030 from shalin@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1675030 ]

          SOLR-6665: ZkController.publishAndWaitForDownStates can return before all local cores are marked as 'down' if multiple replicas with the same core name exist in the cluster

          jira-bot ASF subversion and git services added a comment -

          Commit 1675033 from shalin@apache.org in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1675033 ]

          SOLR-6665: ZkController.publishAndWaitForDownStates can return before all local cores are marked as 'down' if multiple replicas with the same core name exist in the cluster

          shalinmangar Shalin Shekhar Mangar added a comment -

          This is fixed. I had to change the ZkControllerTest in branch_5x to make it compliant with Java 7.

          shalinmangar Shalin Shekhar Mangar added a comment -

          This test isn't correct. I failed to account for the fact that the overseer is started automatically and it can race the test thread to publish the status immediately.

          jira-bot ASF subversion and git services added a comment -

          Commit 1675067 from shalin@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1675067 ]

          SOLR-6665: Add AwaitsFix annotation to the new test

          jira-bot ASF subversion and git services added a comment -

          Commit 1675068 from shalin@apache.org in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1675068 ]

          SOLR-6665: Add AwaitsFix annotation to the new test

          shalinmangar Shalin Shekhar Mangar added a comment -

          This fix has already been released so I opened SOLR-7736 to fix the test.


            People

            • Assignee: shalinmangar Shalin Shekhar Mangar
            • Reporter: shalinmangar Shalin Shekhar Mangar
            • Votes: 0
            • Watchers: 3
