SOLR-7248

In legacyCloud=false mode we should check if the core was hosted on the same node before registering it

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.0
    • Fix Version/s: 5.1, 6.0
    • Component/s: None
    • Labels:
      None

      Description

      Related discussion here - http://markmail.org/message/n32mxbv42hzuneyy

      Currently we check if the same coreNodeName is present in clusterstate before registering it. We should make this check more stringent and allow a core to be registered only if the coreNodeName is present and it is on the same node.

      This will ensure that junk replica folders lying around on old nodes don't end up registering themselves when the node gets bounced.
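
      The stricter check described above can be sketched roughly as follows. This is an illustration only, not the code from the attached patch: the class RegistrationCheck, the method mayRegister, and the map from coreNodeName to base_url are hypothetical stand-ins for the real clusterstate lookup, and the URLs in the usage note are made up.

```java
import java.util.Map;

/**
 * Hypothetical helper, not part of Solr: models clusterstate as a simple
 * map from coreNodeName to the base_url of the node hosting that replica.
 */
final class RegistrationCheck {

    private RegistrationCheck() {}

    /**
     * Returns true only if clusterstate already lists the coreNodeName
     * AND records it as living on this node (same base_url).
     */
    static boolean mayRegister(Map<String, String> baseUrlByCoreNodeName,
                               String coreNodeName,
                               String localBaseUrl) {
        String recordedBaseUrl = baseUrlByCoreNodeName.get(coreNodeName);
        // Old check (as described): recordedBaseUrl != null was enough.
        // Stricter check: the replica must also belong to this node.
        return recordedBaseUrl != null && recordedBaseUrl.equals(localBaseUrl);
    }
}
```

      With a clusterstate that maps core_node1 to http://host1:8983/solr, a leftover replica folder on host2 that still claims to be core_node1 would be refused, while the same core on host1 would be allowed to register.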

      1. SOLR-7248.patch
        2 kB
        Varun Thacker
      2. SOLR-7248.patch
        2 kB
        Varun Thacker

          Activity

          Varun Thacker added a comment -

          Patch which checks both coreNodeName and base_url to verify if the core is already present in the clusterstate.

          Also refactored that method.

          Noble Paul added a comment -

          compare both baseurl and corename

          Varun Thacker added a comment -

          Updated patch which compares both 'base_url' and 'name'

          Mark Miller added a comment -

          In fact, I think we need to reconsider legacyCloud.

          We have reserved the right to add ZK == truth features by default in 5.x releases.

          We may want to first add and improve them behind legacyCloud as an option and, once we have confidence in them, move them to default? Or we may want to keep everything behind legacyCloud for all of 5.x. I would prefer to start doing ZK == truth by default. When you don't support preconfiguring SolrCores (as we say we won't in the 5.0 CHANGES.txt), most of these changes are really fixing what a user would perceive as a bug.

          Varun Thacker added a comment -

          "We may want to first add and improve them behind legacyCloud as an option and once we have confidence in them, move them to default?"

          +1.

          What are the things you have in mind that we could add to make ZK the truth?

          Mark Miller added a comment -

          Just simple stuff - some of it may already happen with legacyCloud=true, but I know there are not enough tests for it, nor is it completely done.

          Basically though, you shouldn't be able to create a core for a collection if that collection does not exist. So for example, on startup, any core that is not part of a collection in ZK should be removed. Likewise, if ZooKeeper says a node should host a SolrCore and it does not, it should be created (given there is a leader to replicate from or when using a shared filesystem).

          Basically, all the individual Solr instances should make the appropriate local adjustments to stay in sync with what ZooKeeper describes as the current cluster.
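
          (Illustrative sketch, not code from Solr or from this issue's patch.) The local adjustments described above amount to a set difference in each direction between what ZooKeeper expects a node to host and what is actually on disk. The class CoreReconciler and its methods are hypothetical names, and core names are modelled as plain strings:

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper, not part of Solr. */
final class CoreReconciler {

    private CoreReconciler() {}

    /** Local cores that ZooKeeper does not list for this node: candidates for removal. */
    static Set<String> coresToRemove(Set<String> expectedFromZk, Set<String> localCores) {
        Set<String> result = new TreeSet<>(localCores);
        result.removeAll(expectedFromZk);
        return result;
    }

    /**
     * Cores ZooKeeper lists for this node that are missing locally: candidates
     * for creation. Per the comment above, actual creation would additionally
     * require a leader to replicate from or a shared filesystem; that
     * precondition is not modelled here.
     */
    static Set<String> coresToCreate(Set<String> expectedFromZk, Set<String> localCores) {
        Set<String> result = new TreeSet<>(expectedFromZk);
        result.removeAll(localCores);
        return result;
    }
}
```

          For example, if ZooKeeper expects a node to host cores a and b while the node has b and c on disk, c would be removed and a would be created.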

          Varun Thacker added a comment -

          Thanks Mark Miller for putting down what's needed. I created https://issues.apache.org/jira/browse/SOLR-7269 to track it.

          We could commit this patch as is, right?

          I also found SOLR-7251 while doing some manual testing for this patch.

          Mark Miller added a comment -

          Yeah, I was just chiming in here because of the legacyCloud connection.

          ASF subversion and git services added a comment -

          Commit 1668931 from Varun Thacker in branch 'dev/trunk'
          [ https://svn.apache.org/r1668931 ]

          SOLR-7248: In legacyCloud=false mode we should check if the core was hosted on the same node before registering it

          ASF subversion and git services added a comment -

          Commit 1668953 from Varun Thacker in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1668953 ]

          SOLR-7248: In legacyCloud=false mode we should check if the core was hosted on the same node before registering it (merging from trunk)

          Varun Thacker added a comment -

          Thanks Noble and Mark.

          Timothy Potter added a comment -

          Bulk close after 5.1 release


            People

            • Assignee: Varun Thacker
            • Reporter: Varun Thacker
            • Votes: 0
            • Watchers: 5