Solr
  1. Solr
  2. SOLR-6482

Add an onlyIfDown flag for DELETEREPLICA collections API command

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 5.0, 6.0
    • Fix Version/s: 5.0, 6.0
    • Component/s: SolrCloud
    • Labels:
      None

      Description

      Having the DELETEREPLICA delete the index is scary for some situations, especially ones in which the operations people are taking more explicit control of the topology of their cluster. As we move towards ZK being the "one source of truth" and deleting replicas that then come back up, this is even scarier.

      I propose to have an optional flag onlyIfDown that remove the replica from the ZK cluster state if (and only if) the node was offline. Default value: false.

      1. SOLR-6482.patch
        5 kB
        Erick Erickson
      2. SOLR-6482.patch
        6 kB
        Erick Erickson
      3. SOLR-6482.patch
        4 kB
        Erick Erickson
      4. SOLR-6482-test.patch
        4 kB
        Erick Erickson

        Activity

        Hide
        Erick Erickson added a comment -

        Here's a first cut at the patch, still need to add tests but this seems to work when I tested it manually.

        Its pretty simple all told, it seems like it shouldn't interfere with the "ZK as truth" model.

        Show
        Erick Erickson added a comment - Here's a first cut at the patch, still need to add tests but this seems to work when I tested it manually. Its pretty simple all told, it seems like it shouldn't interfere with the "ZK as truth" model.
        Hide
        Erick Erickson added a comment -

        Patch with a test as well, the test shamelessly piggy-backed on an existing test....

        This may be ready to commit, all tests pass, but I'd like to give people a chance to comment.

        Show
        Erick Erickson added a comment - Patch with a test as well, the test shamelessly piggy-backed on an existing test.... This may be ready to commit, all tests pass, but I'd like to give people a chance to comment.
        Hide
        Erick Erickson added a comment -

        Final patch with CHANGES.txt added

        Show
        Erick Erickson added a comment - Final patch with CHANGES.txt added
        Hide
        ASF subversion and git services added a comment -

        Commit 1624930 from Erick Erickson in branch 'dev/trunk'
        [ https://svn.apache.org/r1624930 ]

        SOLR-6482: Add an onlyIfDown flag for DELETEREPLICA collections API command

        Show
        ASF subversion and git services added a comment - Commit 1624930 from Erick Erickson in branch 'dev/trunk' [ https://svn.apache.org/r1624930 ] SOLR-6482 : Add an onlyIfDown flag for DELETEREPLICA collections API command
        Hide
        Erick Erickson added a comment -

        Hmmm, don't ask me how, but somehow my latest patch didn't have the tests in it, I'll fix today.

        Show
        Erick Erickson added a comment - Hmmm, don't ask me how, but somehow my latest patch didn't have the tests in it, I'll fix today.
        Hide
        Shalin Shekhar Mangar added a comment -

        I'm a little confused about this feature. You say that deleting index automatically is scary and then you talk about ZK as truth and about replicas coming back up. What is the use-case behind onlyIfDown? Why would anyone invoke DELETEREPLICA against a replica if they don't want it to be removed from cluster state?

        Show
        Shalin Shekhar Mangar added a comment - I'm a little confused about this feature. You say that deleting index automatically is scary and then you talk about ZK as truth and about replicas coming back up. What is the use-case behind onlyIfDown? Why would anyone invoke DELETEREPLICA against a replica if they don't want it to be removed from cluster state?
        Hide
        Mark Miller added a comment -

        I'm confused by this as well.

        Show
        Mark Miller added a comment - I'm confused by this as well.
        Hide
        Erick Erickson added a comment -

        @Shalin:

        bq: Why would anyone invoke DELETEREPLICA against a replica if they don't want it to be removed from cluster state?

        It's really the opposite, something of a safety valve. It's a situation where you only want to affect the cluster state, you do not want to actually delete a replica if it's live. Particularly for cleanup when a machine has died and it's never coming back.

        I'm looking at it as a paranoia flag for DELETEREPLICA that expresses "if I screwed up and called this on a live replica, ignore the command". Operationally, it's scary to turn a script loose for maintenance that may, through a programming error or whatever, delete all the nodes on my system.

        One can argue that all this can be done by examining the cluster state and issuing the delete replica only for nodes that are down and not need to add a flag to DELETEREPLICA and I suppose that's true. But operations folks would like this kind of safety valve.

        I'm not quite sure how this plays out in the ZK being "the one source of truth" model, when we get there it may be irrelevant. But we're not there yet, it's completely optional, and if omitted the behavior is the same as now so it's not a big change.

        Show
        Erick Erickson added a comment - @Shalin: bq: Why would anyone invoke DELETEREPLICA against a replica if they don't want it to be removed from cluster state? It's really the opposite, something of a safety valve. It's a situation where you only want to affect the cluster state, you do not want to actually delete a replica if it's live. Particularly for cleanup when a machine has died and it's never coming back. I'm looking at it as a paranoia flag for DELETEREPLICA that expresses "if I screwed up and called this on a live replica, ignore the command". Operationally, it's scary to turn a script loose for maintenance that may, through a programming error or whatever, delete all the nodes on my system. One can argue that all this can be done by examining the cluster state and issuing the delete replica only for nodes that are down and not need to add a flag to DELETEREPLICA and I suppose that's true. But operations folks would like this kind of safety valve. I'm not quite sure how this plays out in the ZK being "the one source of truth" model, when we get there it may be irrelevant. But we're not there yet, it's completely optional, and if omitted the behavior is the same as now so it's not a big change.
        Hide
        ASF subversion and git services added a comment -

        Commit 1625441 from Erick Erickson in branch 'dev/trunk'
        [ https://svn.apache.org/r1625441 ]

        SOLR-6482: Omitted tests

        Show
        ASF subversion and git services added a comment - Commit 1625441 from Erick Erickson in branch 'dev/trunk' [ https://svn.apache.org/r1625441 ] SOLR-6482 : Omitted tests
        Hide
        Erick Erickson added a comment -

        Patch for the somehow-omitted files. Should be applied after SOLR-6482.patch

        Show
        Erick Erickson added a comment - Patch for the somehow-omitted files. Should be applied after SOLR-6482 .patch
        Hide
        ASF subversion and git services added a comment -

        Commit 1625621 from Erick Erickson in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1625621 ]

        SOLR-6482: Add an onlyIfDown flag for DELETEREPLICA collections API command

        Show
        ASF subversion and git services added a comment - Commit 1625621 from Erick Erickson in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1625621 ] SOLR-6482 : Add an onlyIfDown flag for DELETEREPLICA collections API command
        Hide
        Erick Erickson added a comment -

        Incorporates the forgotten test too

        Show
        Erick Erickson added a comment - Incorporates the forgotten test too
        Hide
        Anshum Gupta added a comment -

        Bulk close after 5.0 release.

        Show
        Anshum Gupta added a comment - Bulk close after 5.0 release.

          People

          • Assignee:
            Erick Erickson
            Reporter:
            Erick Erickson
          • Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development