Solr / SOLR-2610

Add an option to delete index through CoreAdmin UNLOAD action

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.3, 4.0-ALPHA
    • Component/s: multicore
    • Labels:
      None

      Description

      Right now, one can unload a Solr Core but the index files are left behind and consume disk space. We should have an option to delete the index when unloading a core.

      1. SOLR-2610-branch3x.patch
        11 kB
        Shalin Shekhar Mangar
      2. SOLR-2610.patch
        11 kB
        Shalin Shekhar Mangar

          Activity

          Robert Muir added a comment -

          Bulk close for 3.3

          Shawn Heisey added a comment -

          Shawn, that is not a use-case for RELOAD. The idea behind it is to reload an existing core's index with updated configuration changes and swap it with the existing core without causing downtime. It seems like your use-case is handled well with the stock CREATE, SWAP and UNLOAD+deleteIndex?

          CREATE requires that the caller be aware of internal server filesystem structures. For the typical use of CREATE, this is not really a problem, but if what you're trying to do is unload a core, delete its index, and then immediately recreate it with the same config, it would be very nice to not have to specify (or even know) the solr.xml configuration bits.

          In this particular case, the person who writes the scripts is the same person who maintains the Solr infrastructure (me) ... but that might not always be the case. Currently the build scripts don't know anything about the internal structure other than core names, and I'd like to keep it that way.

          Adding an option like deleteIndex to RELOAD seemed a logical way to handle this, since currently (1.4.1) I have to completely restart Solr when I wipe out an index directory. If this is not a logical progression, I would argue that CoreAdmin needs an entirely new action. Either way, if it's deemed desirable, it needs its own Jira issue. I brought it up here because it's at least tangentially related.

          Mark Miller added a comment -

          And I got a +1 from you and Jason on the topic of the issue (or at least, that's what I assumed). I waited a day to commit - would you like me to wait longer for future issues or leave a comment to that effect?

          No, I think a day is fine - just a warning perhaps? Both Jason and I liked the idea, but it just seemed like we were discussing some of the details and you committed kind of without warning. I'm not that concerned about it, just mentioning it.

          If the patch is not what you intended, go ahead and reopen/extend the scope of the issue or open another issue.

          I think the patch is fine - I've tweaked a couple little things on the changes entry, but the patch itself looks good so far. I opened SOLR-2621 to continue the other 'delete options' discussion.

          Shalin Shekhar Mangar added a comment -

          where do you mention how this helps with SolrCloud?

          I didn't and I'm sorry about that. I was just trying to tell you my perspective. These are small pieces that need to be fixed before tackling larger problems in SolrCloud, and this one seemed generally useful and simple enough by itself that I opened the issue without giving the bigger picture. Some of the other pieces are captured in SOLR-2595.

          Why are you deleting cores only to add them back again with the same config?

          Hopefully SOLR-2595 will give you a better idea of what I was thinking. The use-case is to split and migrate pieces of an index and this issue will help in deleting the leftover temporary cores.

          Do you really think it's inconsistent to actually be able to delete something?

          The inconsistency is to be able to delete a configuration file when there is no way to add it back but I'm not against the feature in general.

          Does it really seem like a weird use case to say, I want to delete a SolrCore I no longer have an interest in?

          Absolutely not. If you want that feature, that's fine. You don't need permissions to put up a patch and commit it.

          Looks like a few people have an interest in this issue, so I'm not sure why you rammed it in so quickly.

          The issue clearly talks about deleting index on unload and that's what it does. And I got a +1 from you and Jason on the topic of the issue (or at least, that's what I assumed). I waited a day to commit - would you like me to wait longer for future issues or leave a comment to that effect? If the patch is not what you intended, go ahead and reopen/extend the scope of the issue or open another issue.

          Mark Miller added a comment -

          I was approaching this particular issue more from the angle of making it useful for SolrCloud.

          where do you mention how this helps with SolrCloud?

          I can see how deleting configs can be useful to some people, but is it worth introducing such an inconsistency, i.e. that you can delete config but cannot add it back? Anyway, it is best handled via a separate issue.

          Why are you deleting cores only to add them back again with the same config? Do you really think it's inconsistent to actually be able to delete something? Does it really seem like a weird use case to say, I want to delete a SolrCore I no longer have an interest in?

          Looks like a few people have an interest in this issue, so I'm not sure why you rammed it in so quickly.

          Shalin Shekhar Mangar added a comment -

          I can think of a corollary core action I'd like to see – the ability on a core RELOAD to entirely delete the index from a core and replace it with a fresh empty index that will start building at segment _0. I would do this to my "build" core before using it, and later after swapping it with the "live" core and ensuring it's good, to free up disk space.

          Shawn, that is not a use-case for RELOAD. The idea behind it is to reload an existing core's index with updated configuration changes and swap it with the existing core without causing downtime. It seems like your use-case is handled well with the stock CREATE, SWAP and UNLOAD+deleteIndex?
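The CREATE, SWAP and UNLOAD+deleteIndex sequence mentioned above can be sketched as three CoreAdmin requests. This is only an illustration: the host, port, core names ("build", "live") and the helper class CoreAdminUrls are assumptions, not part of the patch. Note that CREATE needs an instanceDir, which is the filesystem knowledge the caller has to supply.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that only builds CoreAdmin request URLs; it sends nothing.
class CoreAdminUrls {
    // Assumed CoreAdmin endpoint; adjust host, port and path for your install.
    static final String BASE = "http://localhost:8983/solr/admin/cores";

    static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    // CREATE requires the caller to know the core's instance directory.
    static String create(String name, String instanceDir) {
        return BASE + "?action=CREATE&name=" + enc(name) + "&instanceDir=" + enc(instanceDir);
    }

    // SWAP exchanges the names of two loaded cores.
    static String swap(String core, String other) {
        return BASE + "?action=SWAP&core=" + enc(core) + "&other=" + enc(other);
    }

    // UNLOAD with the deleteIndex parameter added by this issue.
    static String unload(String core, boolean deleteIndex) {
        return BASE + "?action=UNLOAD&core=" + enc(core) + "&deleteIndex=" + deleteIndex;
    }

    public static void main(String[] args) {
        System.out.println(create("build", "build")); // fresh core to index into
        System.out.println(swap("live", "build"));    // make the new index live
        System.out.println(unload("build", true));    // drop the old index files
    }
}
```

After the swap, the name "build" refers to the old index, so unloading it with deleteIndex=true is what frees the disk space.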

          Shalin Shekhar Mangar added a comment -

          But you might want to (in fact, I do this). If you are really done with a core, if you really want to remove it, what do you need the config files around for anymore?

          I was approaching this particular issue more from the angle of making it useful for SolrCloud. I can see how deleting configs can be useful to some people, but is it worth introducing such an inconsistency, i.e. that you can delete config but cannot add it back? Anyway, it is best handled via a separate issue.

          Shawn Heisey added a comment -

          I can think of a corollary core action I'd like to see – the ability on a core RELOAD to entirely delete the index from a core and replace it with a fresh empty index that will start building at segment _0. I would do this to my "build" core before using it, and later after swapping it with the "live" core and ensuring it's good, to free up disk space.

          Jason Rutherglen added a comment -

          Mark put it aptly. The problem I think I encountered in my own version is that leftover file handles seemed to prevent the deletion of all the files; many times some of them would be left behind. Also, I deleted the entire core directory, which is useful for manual testing (e.g., to avoid the "directory exists" exception).

          Mark Miller added a comment -

          But you might want to (in fact, I do this). If you are really done with a core, if you really want to remove it, what do you need the config files around for anymore? Seems like a reasonable option to me, though I'd agree it makes no sense as the default.

          nukeEverything=true

          Shalin Shekhar Mangar added a comment -

          Which other files do you want to remove? In order to create a core, all required configuration files must already be present on the disk. I did not want to remove files during unload which I cannot later add to a host through the admin interfaces.

          Jason Rutherglen added a comment -

          Just reviewed the patch. I think we need an additional option to remove all files related to the core. This is useful for manual core movement.

          Shalin Shekhar Mangar added a comment -

          Committed revision 1138405 on trunk and 1138407 on branch_3x.

          Shalin Shekhar Mangar added a comment -

          Patch for branch 3x

          Jason Rutherglen added a comment -

          This is good! I had to write the same functionality into a custom Solr build on a project.

          Shalin Shekhar Mangar added a comment -

          Patch adds a boolean "deleteIndex" parameter to core unload action.

          There is a close hook interface in SolrCore but it is called before the update handler and searcher(s) are closed so it cannot be used to delete the index.

          Changes:

          • Changed the CloseHook interface to an abstract class with preClose(SolrCore) and postClose(SolrCore) methods
          • Changed the usages of CloseHook in ReplicationHandler and SolrCoreTest
          • CoreAdminHandler now adds a close hook on receiving an unload action with deleteIndex=true
          • Added tests for the new param

          Since the CloseHook is used very sparingly, I think it is fine to change it to an abstract class but if people feel strongly against it, we can find another way.
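A rough sketch of the idea described above, not the code from the patch: SketchCore is a hypothetical stand-in for SolrCore, and DeleteIndexHook and the directory layout are assumptions made for illustration. The point is that deletion runs in postClose, after the core has released its resources, rather than in preClose.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// Abstract class with two callbacks, mirroring the shape described in the patch notes.
abstract class CloseHook {
    abstract void preClose(SketchCore core);   // runs before resources are closed
    abstract void postClose(SketchCore core);  // runs after resources are closed
}

// Hypothetical stand-in for SolrCore: holds an index directory and close hooks.
class SketchCore {
    private final Path indexDir;
    private final List<CloseHook> hooks = new ArrayList<>();

    SketchCore(Path indexDir) { this.indexDir = indexDir; }
    Path getIndexDir() { return indexDir; }
    void addCloseHook(CloseHook hook) { hooks.add(hook); }

    void close() {
        for (CloseHook h : hooks) h.preClose(this);
        // A real core would close its update handler and searcher(s) here.
        for (CloseHook h : hooks) h.postClose(this);
    }
}

// Deletes the index directory once the core is fully closed.
class DeleteIndexHook extends CloseHook {
    @Override void preClose(SketchCore core) { /* index still in use: do nothing */ }

    @Override void postClose(SketchCore core) {
        try {
            deleteRecursively(core.getIndexDir());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) return;
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override public FileVisitResult visitFile(Path f, BasicFileAttributes a) throws IOException {
                Files.delete(f);
                return FileVisitResult.CONTINUE;
            }
            @Override public FileVisitResult postVisitDirectory(Path d, IOException e) throws IOException {
                if (e != null) throw e;
                Files.delete(d); // directory is empty by now
                return FileVisitResult.CONTINUE;
            }
        });
    }
}

class CloseHookDemo {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("index");
        Files.createFile(dir.resolve("segments_1"));
        SketchCore core = new SketchCore(dir);
        core.addCloseHook(new DeleteIndexHook());
        core.close();
        System.out.println(Files.exists(dir)); // false: index directory removed
    }
}
```

In this sketch a handler for an unload request with deleteIndex=true would register DeleteIndexHook before closing the core; without the parameter, no hook is added and the files stay on disk.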

          Mark Miller added a comment -

          +1


            People

            • Assignee:
              Shalin Shekhar Mangar
              Reporter:
              Shalin Shekhar Mangar
            • Votes:
              1
              Watchers:
              3
