Solr / SOLR-725

CoreContainer/CoreDescriptor/SolrCore cleansing

    Details

    • Type: Improvement
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.3
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      These 3 classes and the name vs alias handling are somewhat confusing.
      The recent SOLR-647 & SOLR-716 have created a bit of a flux.
      This issue attempts to clarify the model and the list of operations.

      CoreDescriptor: describes the parameters of a SolrCore

      Definitions

      • has one name
        • The CoreDescriptor name may represent multiple aliases; in that case, first alias is the SolrCore name
      • has one instance directory location
      • has one config & schema name

      Operations

      The class is only a parameter passing facility
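
A minimal sketch of the descriptor as a pure parameter holder, for illustration only. The class name `CoreDescriptorSketch`, its fields, and the comma-separated alias syntax are assumptions (the comma form is borrowed from the `create('version-3.0,dev')` example used later in this thread); this is not the actual Solr class.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical, immutable stand-in for CoreDescriptor: it only carries creation parameters. */
final class CoreDescriptorSketch {
    private final String name;        // may hold several comma-separated aliases (assumed syntax)
    private final String instanceDir; // one instance directory location
    private final String configName;  // one config name
    private final String schemaName;  // one schema name

    CoreDescriptorSketch(String name, String instanceDir, String configName, String schemaName) {
        this.name = name;
        this.instanceDir = instanceDir;
        this.configName = configName;
        this.schemaName = schemaName;
    }

    /** Splits the descriptor name into aliases, dropping blanks; order is preserved. */
    List<String> aliases() {
        return Arrays.stream(name.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    /** The first alias is the name the SolrCore will carry; null if the name was blank. */
    String coreName() {
        List<String> all = aliases();
        return all.isEmpty() ? null : all.get(0);
    }

    String getInstanceDir() { return instanceDir; }
    String getConfigName() { return configName; }
    String getSchemaName() { return schemaName; }
}
```

The sketch has no operations beyond accessors, which matches the "parameter passing facility" role described above.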

      SolrCore: manages a Lucene index

      Definitions

      • has one unique name (in the CoreContainer)
        • the name is used in JMX to identify the core
      • has one current set of aliases
        • the name is the first alias

      Name & alias operations

      • get name/aliases: obvious
      • alias: adds an alias to this SolrCore
      • unalias: removes an alias from this SolrCore
      • name: sets the SolrCore name
        • potentially impacts JMX registration
      • rename: picks a new name from the SolrCore aliases
        • triggered when alias name is already in use
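
The name and alias bookkeeping listed above could look roughly like the following sketch. `CoreNamesSketch` and its method names are hypothetical and model only the naming rules, not index handling or JMX; returning `null` from `rename` when no alias remains follows the case discussed in the comments on this issue.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of a core's name/alias bookkeeping only. */
final class CoreNamesSketch {
    private final Set<String> aliases = new LinkedHashSet<>(); // keeps insertion order
    private String name;                                        // null when no alias is left

    CoreNamesSketch(List<String> initialAliases) {
        aliases.addAll(initialAliases);
        rename(); // the name is the first alias
    }

    String getName() { return name; }

    List<String> getAliases() { return new ArrayList<>(aliases); }

    /** alias: adds an alias; if the core had no name, the new alias becomes the name. */
    void alias(String alias) {
        aliases.add(alias);
        if (name == null) {
            name = alias;
        }
    }

    /** unalias: removes an alias; if it was the name, a new name is picked. */
    void unalias(String alias) {
        if (aliases.remove(alias) && alias.equals(name)) {
            rename();
        }
    }

    /** name: sets the name and moves it to the front (real code might re-register in JMX here). */
    void name(String newName) {
        List<String> rest = new ArrayList<>(aliases);
        rest.remove(newName);
        aliases.clear();
        aliases.add(newName);
        aliases.addAll(rest);
        name = newName;
    }

    /** rename: picks the first remaining alias as the name, or null if none remain. */
    String rename() {
        name = aliases.isEmpty() ? null : aliases.iterator().next();
        return name;
    }
}
```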

      CoreContainer: manages all relations between cores & descriptors

      Definitions

      • has a set of aliases (each of them pointing to one core)
        • ensure alias uniqueness.

      SolrCore instance operations

      • load: makes a SolrCore available for requests
        • creates a SolrCore
        • registers all SolrCore aliases in the aliases set
        • (load = create + register)
      • unload: removes a core identified by one of its aliases
        • stops handling the Lucene index
        • all SolrCore aliases are removed
      • reload: recreate the core identified by one of its aliases
      • create: create a core from a CoreDescriptor
        • readies up the Lucene index
      • register: registers all aliases of a SolrCore

      SolrCore alias operations

      • swap: swaps 2 aliases
        • method: swap
      • alias: creates 1 alias for a core, potentially unaliasing a previously used alias
        • The SolrCore name being an alias, this operation might trigger a SolrCore rename
      • unalias: removes 1 alias for a core
        • The SolrCore name being an alias, this operation might trigger a SolrCore rename
      • rename: renames a core
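
As an illustration of the container side, here is a small sketch of a global alias set where each alias points to exactly one core. All class and method names are hypothetical. Closing a core once its last alias is gone is an assumption that follows the hard-link reading discussed in the comments; swap, reload and rename are left out to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical core: just its aliases and an open/closed flag. */
final class RegisteredCore {
    final Set<String> aliases = new LinkedHashSet<>();
    boolean closed;

    String getName() { return aliases.isEmpty() ? null : aliases.iterator().next(); }
}

/** Hypothetical container keeping one alias -> core map, which enforces alias uniqueness. */
final class AliasRegistrySketch {
    private final Map<String, RegisteredCore> cores = new HashMap<>();

    /** load = create + register. */
    synchronized RegisteredCore load(List<String> aliases) {
        RegisteredCore core = new RegisteredCore(); // "create"
        for (String a : aliases) {                  // "register" every alias
            alias(core, a);
        }
        return core;
    }

    synchronized RegisteredCore get(String alias) { return cores.get(alias); }

    /** alias: points the alias at the core, unaliasing whichever core held it before. */
    synchronized void alias(RegisteredCore core, String alias) {
        RegisteredCore previous = cores.put(alias, core);
        if (previous != null && previous != core) {
            detach(previous, alias);
        }
        core.aliases.add(alias);
    }

    /** unalias: removes one alias; the core's name changes if that alias was its name. */
    synchronized void unalias(String alias) {
        RegisteredCore core = cores.remove(alias);
        if (core != null) {
            detach(core, alias);
        }
    }

    /** unload: removes every alias of the core identified by one of them. */
    synchronized void unload(String alias) {
        RegisteredCore core = cores.get(alias);
        if (core != null) {
            for (String a : new ArrayList<>(core.aliases)) {
                unalias(a);
            }
        }
    }

    private void detach(RegisteredCore core, String alias) {
        core.aliases.remove(alias);
        if (core.aliases.isEmpty()) {
            core.closed = true; // last link gone: the core stops handling its index
        }
    }
}
```

Because the core name is simply the first alias, both `alias` and `unalias` can change a core's name as a side effect, which is the behaviour the two "might trigger a SolrCore rename" notes describe.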

      CoreAdminHandler: handles CoreContainer operations

      • load/create: CoreContainer load
      • unload: CoreContainer unload
      • reload: CoreContainer reload
      • swap: CoreContainer swap
      • alias: CoreContainer alias
      • unalias: CoreContainer unalias
      • rename: CoreContainer rename
      • persist: CoreContainer persist, writes the solr.xml
      • status: returns the status of all/one SolrCore
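
The mapping above can be read as a simple dispatch table. The sketch below is only an illustration: `ContainerOp`, `AdminActionSketch` and `delegateFor` are hypothetical names, and treating "status" as answered by the handler itself (returning `null`) is an assumption of this sketch.

```java
import java.util.Locale;

/** Container-level operations named in the description (hypothetical enum). */
enum ContainerOp { LOAD, UNLOAD, RELOAD, SWAP, ALIAS, UNALIAS, RENAME, PERSIST }

final class AdminActionSketch {
    /**
     * Maps an admin action name to the container operation it delegates to.
     * Returns null for "status", which this sketch treats as handled by the handler itself.
     */
    static ContainerOp delegateFor(String action) {
        String a = action.trim().toLowerCase(Locale.ROOT);
        switch (a) {
            case "load":
            case "create":
                return ContainerOp.LOAD; // load and create both map to the container's load
            case "status":
                return null;
            default:
                try {
                    return ContainerOp.valueOf(a.toUpperCase(Locale.ROOT));
                } catch (IllegalArgumentException e) {
                    throw new IllegalArgumentException("unknown action: " + action);
                }
        }
    }
}
```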
      1. solr-725.patch
        47 kB
        Henri Biestro
      2. solr-725.patch
        43 kB
        Henri Biestro
      3. solr-725.patch
        27 kB
        Henri Biestro
      4. solr-725.patch
        22 kB
        Henri Biestro

          Activity

          Erick Erickson added a comment -

          2013 Old JIRA cleanup

          Hoss Man added a comment -

          Removing fix version since this issue hasn't gotten much attention lately and doesn't appear to be a priority for anyone at the moment.

          As always: if someone wants to take on this work they are welcome to do so at any time and the target release can be revisited

          Hoss Man added a comment -

          Bulk of fixVersion=3.6 -> fixVersion=4.0 for issues that have no assignee and have not been updated recently.

          email notification suppressed to prevent mass-spam
          pseudo-unique token identifying these issues: hoss20120321nofix36

          Robert Muir added a comment -

          3.4 -> 3.5

          Robert Muir added a comment -

          Bulk move 3.2 -> 3.3

          Hoss Man added a comment -

          Bulk updating 240 Solr issues to set the Fix Version to "next" per the process outlined in this email...

          http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3Calpine.DEB.1.10.1005251052040.24672@radix.cryptio.net%3E

          Selection criteria was "Unresolved" with a Fix Version of 1.5, 1.6, 3.1, or 4.0. email notifications were suppressed.

          A unique token for finding these 240 issues in the future: hossversioncleanup20100527

          Shalin Shekhar Mangar added a comment -

          Marked for 1.5

          Henri Biestro added a comment -

          updated for trunk 720487

          Noble Paul added a comment -

          As is, it does not remove any feature nor forces anyone into using them; thus, it's not breaking anything nor does it make your use-cases more difficult

          I understand your argument. But when I put in a feature, I may have to make it work with all possible scenarios (ALIAS/RENAME being one of them). Once this is in, it becomes the responsibility of every component writer to do so.

          Allowing more ways to alias a core is an easier path (no pun intended) to this than constraining users into having just one.

          It is better to provide a limited set of consistent features than to provide every possible feature. That makes life easier for all developers (less and simpler code, less documentation, fewer caveats).

          It is fine to do so if it is a very important feature and there are use cases to support it.

          I can even dedicate a URL to replication that is not something my end-users

          Replication is not special. It is a normal request handler, so a dedicated URL would be special treatment for one handler. We must not add special privileges unless we have to.

          Henri Biestro added a comment -

          Paul
          Would it be fair to say that you fear the alias/hardlink feature would allow users to make configuration/manipulation mistakes more easily wrt replication?

          As is, it does not remove any feature nor forces anyone into using them; thus, it's not breaking anything nor does it make your use-cases more difficult. It might be used in a wrong way, and I'm not disputing that, since it creates more possibilities and choices, it can lead to more mistakes. In that sense, some users could end up not being able to use the feature you contribute. I do believe, though, that it's better to describe & educate on best practices than to constrain usage.

          I also understand that for solr-727/solr-561, you need some URLs to be stable (which is what the "cool URIs don't change" motto advocates, and this is a good rule). Allowing more ways to alias a core is an easier path (no pun intended) to this than constraining users into having just one. I can even dedicate a URL to replication that is not something my end-users would ever need to know (since I don't think my deployment constraints or choices should be reflected in what they use).

          Aliasing (the hardlink model) is not adverse to replication usage conventions & needs; instead, it makes them easier to respect, with more flexibility.
          Just a different Solr user & contributor opinion.

          Noble Paul added a comment -

          We are building a feature as explained in SOLR-727. The idea is to make the replication admin page of the master show the details of the slaves as well (such as which version of the index each slave uses), instead of having to go to each slave to find out. So each slave registers itself with the master by providing its own URL, which will be of this format:
          http://<host>:<port>/<web-app>/<corename>/replication

          To make this feature work, we expect the URL to be fixed. If the URL is a moving target, it may just not work (or it will be difficult).
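
To make the concern concrete, here is a tiny sketch that builds a URL in the format quoted above. The helper name and the host, port, web-app and core values are made up for illustration; the point is only that the core name is part of the URL, so a registered URL stops matching once the core answers under a different name.

```java
/** Hypothetical helper that builds the replication URL format quoted above. */
final class ReplicationUrlSketch {
    static String replicationUrl(String host, int port, String webApp, String coreName) {
        return "http://" + host + ":" + port + "/" + webApp + "/" + coreName + "/replication";
    }

    public static void main(String[] args) {
        // URL the slave registers with the master while its core is called "core0".
        String registered = replicationUrl("slave1", 8983, "solr", "core0");
        // After a rename to "core1", the same core answers at a different URL.
        String current = replicationUrl("slave1", 8983, "solr", "core1");
        System.out.println(registered + " still valid: " + registered.equals(current));
    }
}
```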

          Another important feature is SOLR-561 itself. The configuration takes the masterUrl. The URL has to be fixed for the lifetime of the Solr core. If we make the URL invalid, this again will not work.

          Going forward, we will have to see Solr as a whole network of systems with multiple masters, slaves, and shards, unlike the current strategy of seeing each instance as an island. We are making the first step in that direction. Assuming that we will be using URLs to communicate with each other, it is important to have a reliable, fixed URL for each core.

          Henri, I have yet to see an argument for why the symlink approach will not meet your use cases, other than the point that it is not very 'elegant'. I have made my points on why the alias (in its current form and the way you propose) is going to make my use cases difficult.

          Noble Paul added a comment -

          It's like a rename, but doesn't remove the source.

          Is RENAME really used? Is it useful?

          That's not atomic... requests that come in between can fail.

          Point taken.

          you already can have these problems with swap

          SWAP is a special case. It always ensures that the original name is not gone. The URL will remain valid. SWAP is very useful if you wish to test the core before replacing it with an existing core.

          Because Solr is mostly used as a web app, I feel the URL is an important identifier for an asset (until it is removed). It is OK to make it available by another name (I can always have a fixed URL with the name; other names can come and go). But the asset remaining alive while its URL is invalid makes me think of it as a not-so-desirable feature.

          I dont see how aliases are different than the name itself;

          My proposal wanted to treat them as different. The name is fixed, and aliases are like symlinks. And the core does not even have to be aware of them.

          I am just -0 on the hardlink approach. I just made my points against it
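
For comparison, the symlink reading argued for here could be sketched as below. Every name in the sketch is hypothetical. Two rules come from the proposals quoted in this thread (UNALIAS only removes links and never the primary name; UNLOAD removes the core). Rejecting an alias that clashes with an existing core name is one possible reading of "an alias should not be allowed if the name already exists".

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical container for the symlink model: fixed names plus alias -> name links. */
final class SymlinkRegistrySketch {
    private final Map<String, Object> coresByName = new HashMap<>(); // primary name -> core
    private final Map<String, String> links = new HashMap<>();       // alias -> primary name

    synchronized void register(String name, Object core) {
        if (coresByName.containsKey(name) || links.containsKey(name)) {
            throw new IllegalArgumentException("name already in use: " + name);
        }
        coresByName.put(name, core);
    }

    /** ALIAS: adds a link; the core and its name are never touched. */
    synchronized void alias(String name, String alias) {
        if (!coresByName.containsKey(name)) {
            throw new IllegalArgumentException("no such core: " + name);
        }
        if (coresByName.containsKey(alias)) {
            throw new IllegalArgumentException("alias clashes with a core name: " + alias);
        }
        links.put(alias, name);
    }

    /** UNALIAS: removes a link if it exists; a primary name is never removed this way. */
    synchronized boolean unalias(String alias) {
        return links.remove(alias) != null;
    }

    /** UNLOAD: removes the core and every link that pointed at it. */
    synchronized Object unload(String name) {
        Object core = coresByName.remove(name);
        if (core != null) {
            links.values().removeIf(name::equals);
        }
        return core;
    }

    /** Resolves a primary name or an alias to its core, or null if unknown. */
    synchronized Object resolve(String nameOrAlias) {
        String name = links.getOrDefault(nameOrAlias, nameOrAlias);
        return coresByName.get(name);
    }
}
```

In this model the URL built from the primary name stays valid for the lifetime of the core, at the cost of treating names and aliases as two different kinds of entry.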

          Henri Biestro added a comment - - edited

          added remaining operations and their mappings in CoreAdminRequest;
          added specific tests to check refCount / number of aliases are kept in sync;
          minor modifications so calling close is usually performed outside of the synchronized block on cores (to reduce contention).
          change:
          unloadCore is now a core operation, it unloads the core (removes all its aliases thus really closes/unloads the core);
          unaliasCore is an alias operation and only removes the alias.
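
A rough sketch of the two ideas in this update: keeping the reference count in step with the number of aliases, and calling close outside the synchronized block. This is not the patch itself. The class names are hypothetical, and counting exactly one reference per alias is a simplifying assumption (a real core would also hold references for in-flight requests).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical ref-counted core: one reference per alias, closed when the count hits zero. */
final class RefCountedCore {
    private final AtomicInteger refCount = new AtomicInteger();
    private volatile boolean closed;

    void open() { refCount.incrementAndGet(); }

    /** Drops one reference; the potentially slow shutdown runs only for the last one. */
    void close() {
        if (refCount.decrementAndGet() == 0) {
            closed = true; // real code would release the index here
        }
    }

    int refCount() { return refCount.get(); }
    boolean isClosed() { return closed; }
}

final class CloseOutsideLockSketch {
    private final Map<String, RefCountedCore> cores = new HashMap<>();

    void alias(RefCountedCore core, String alias) {
        RefCountedCore previous;
        synchronized (cores) {
            core.open();                       // one more alias, one more reference
            previous = cores.put(alias, core);
        }
        if (previous != null) {
            previous.close();                  // outside the lock to reduce contention
        }
    }

    void unalias(String alias) {
        RefCountedCore removed;
        synchronized (cores) {
            removed = cores.remove(alias);     // only the map update holds the lock
        }
        if (removed != null) {
            removed.close();
        }
    }
}
```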

          Henri Biestro added a comment -

          updated to include part of solr-731 changes (aka CoreDescriptor.getCoreContainer is not public; ctor does not use a CoreContainer)

          Henri Biestro added a comment -

          Paul,
          The core always has a name; if you use that name to point to another core, well, that name will point to something else. Whether it's JMX or solr-561.
          I dont see how aliases are different than the name itself; you already can have these problems with swap so these are not new ones.

          Yonik Seeley added a comment -

          In reality, why would anyone want to alias to an existing name.

          It's like a rename, but doesn't remove the source.

          We actually delete an existing one and rename to that.

          That's not atomic... requests that come in between can fail.
          Rename should be able to overwrite an existing entry (that's independent from the alias issue).

          Just imagine I register a JMX object w/ a name and suddenly it is not more available with that name.

          Isn't that the desired behavior with rename or swap?
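
The atomicity point can be illustrated with a sketch in which rename overwrites an existing entry inside one critical section, so no lookup ever sees the target name missing. `AtomicRenameSketch` and its methods are hypothetical names, not CoreContainer code.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical name -> core map in which rename is a single critical section. */
final class AtomicRenameSketch {
    private final Map<String, Object> cores = new HashMap<>();

    synchronized void register(String name, Object core) { cores.put(name, core); }

    synchronized Object get(String name) { return cores.get(name); }

    /**
     * Moves the core registered under 'from' to 'to', overwriting any entry at 'to'.
     * Returns the displaced core (or null) so the caller can close it afterwards.
     */
    synchronized Object rename(String from, String to) {
        if (!cores.containsKey(from)) {
            throw new IllegalArgumentException("no such core: " + from);
        }
        Object core = cores.remove(from);
        return cores.put(to, core);
    }
}
```

Compared with "delete the existing one, then rename", a request arriving between the two steps cannot fail here, because lookups take the same lock as the rename.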

          Henri Biestro added a comment - - edited

          A rename with null is a bad usecase . We should not allow it

          rename is a package private method that is called when we unalias a core from its name, asking it to pick a new one from its aliases.
          However, if there is no alias to that core, the core has no way to name itself anymore but it may still be serving requests; thus the 'null' to indicate that case.
          I'd expect this to be rare but the case exists nevertheless.

          Noble Paul added a comment -

          Why not? It's like an atomic rename, except you aren't removing the source. Seems fine to me.

          In reality, why would anyone want to alias to an existing name. It could have been a mistake as well. It is like a rename file to an existing file which is not allowed by an OS. We actually delete an existing one and rename to that.

          Why not? I could see arguments either way on this one.

          The alias is just adding a name to the core. Why should it change the old name?

          OK, here is why I make these arguments.

          1) Till now we have been assuming that the core always has a name. Just imagine I register a JMX object w/ a name and suddenly it is not more available with that name. It is not a very nice behavior. Actually, adding an alias to an existing object does not even have to be reflected in JMX, because there is only one object. It is just a virtual URL to access a core (actually this is the only need).

          2) For instance, I may publish the URL of a core to the outside, and it may become invalid even though the core is alive (for replication (SOLR-561), we actually fix the URL of the master and assume it will always be there). This change may break that assumption.

          So, I am afraid that we are suddenly changing a very well-known behavior. The symlink approach is my middle path to keep old things as they are and still make the feature available. It may not be the 'ideal' solution, but it struck me as the most practical one. That is it.

          Yonik Seeley added a comment -

          aliases vs names... they were all "names" in my head. No real difference as far as CoreContainer is concerned... it was just the "alias" command (could have been called addNewName, but alias was shorter)
          I didn't put a lot of thought into a core being mapped to multiple names by CoreContainer... it just seemed natural. If people want to dump it, I won't object too loudly. I've marked the "alias" command as experimental in the wiki.

          An alias should not be allowed if the name already exists

          Why not? It's like an atomic rename, except you aren't removing the source. Seems fine to me.

          A rename with null is a bad usecase . We should not allow it

          Agree, since it seems like it would be a user bug.

          An ALIAS must not rename a core. It should just add another mapping in the core container. The only command that should change a core's name should be SWAP

          Why not? I could see arguments either way on this one.

          UNALIAS command can be added . It can just remove an ALIAS if it exists . But it must not be able to remove the primary name (use UNLOAD to do that).

          Sounds complex... why the difference?

          Henri Biestro added a comment -

          The rename with null can only occur if the core name is about to be taken over as another core's alias.
          Fixing point 1. moots point 2.
          I still don't get the compelling reason to switch to the symlink model but besides being different, there is no compelling advantage to the hardlink model that I can see either.
          So, assuming symlink would be the choice, the only added constraints to the model are that alias/unalias can not operate or modify a core name.
          Besides Paul & me, any comments?

          Noble Paul added a comment -

          I looked at your patch and I'm mostly fine.

          1. An alias should not be allowed if the name already exists. We do not have to do it.
          2. A rename with null is a bad use case. We should not allow it.
          Henri Biestro added a comment -

          Paul-
          We are not removing any of these methods but we do need to clarify CoreDescriptor usage.
          CoreDescriptor.getName() is not used except when loading a SolrCore, which is exactly what it should solely be used for; CoreDescriptor is only a SolrCore creation parameter, and using it for any other purpose is not a good idea.
          When you need the SolrCore name (not all its aliases), you should use SolrCore.getName().
          The SolrCore name is merely the first alias and the one used to identify the SolrCore to JMX.
          The hard-link model is implemented with this patch and avoids the added functional complexity of rules such as 'alias/unalias should not rename'; although I don't doubt it's easy to implement the sym-link model, imho we are still missing a compelling reason to do so.

          Noble Paul added a comment - - edited

          Which ones? Is it something that impacts solr-561 somehow?

          Sorry for not being clear. The widely used getName() methods in SolrCore and CoreDescriptor are what I am referring to.

          The symlink method is the least invasive. The reason is that getName() is far more useful than the alias feature itself (think of JMX: cores are identified by name, and there are more places). Using this approach we get all the benefits of alias and we lose nothing.
          Moreover, the implementation is easy.

          Shalin Shekhar Mangar added a comment -

          We decref the counter appropriately (registerNotSynchronized does not do it)

          Ah right. I forgot that the core is ref counted. I saw the call to close and assumed it will be closed immediately. Sorry about that.

          Henri Biestro added a comment -

          Shalin -
          We decref the counter appropriately (registerNotSynchronized does not do it), or I just don't see the issue, so I'll assume that what bothers you is that unaliasing a core should not have the potential effect of closing it.
          If we use Yonik's model, there is no reason not to; the model is akin to the inode / hard-link model on Unix. The inode in our case is the Lucene index (the dataDir, managed by the SolrCore), and the aliases are the hard links. If you remove all links (unalias all aliases), the inode (the Lucene index) goes away, even if the unlink (unalias) is the side effect of another 'ln' (alias).

          A symbolic link model, as Paul proposes, is a different one.
          I don't yet see what the problem with the hard-link (Yonik's) model is, and I'd find it useful to understand.
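
The hard-link model described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the patch: LinkedCore and HardLinkContainer are hypothetical names, and a plain link count stands in for the SolrCore reference counting.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a SolrCore; it only tracks how many names link to it.
class LinkedCore {
    final String dataDir;
    private int links = 0;
    private boolean closed = false;

    LinkedCore(String dataDir) { this.dataDir = dataDir; }

    void link() { links++; }

    // Dropping the last link closes the core, like removing the last hard link to an inode.
    void unlink() {
        if (--links == 0) closed = true;
    }

    boolean isClosed() { return closed; }
}

// Hypothetical container in which every alias is an equal "hard link" to a core.
class HardLinkContainer {
    private final Map<String, LinkedCore> aliases = new HashMap<>();

    // Points name at core; a core previously reachable through name loses that link.
    void alias(String name, LinkedCore core) {
        core.link(); // link first so re-aliasing a name to the same core never closes it
        LinkedCore previous = aliases.put(name, core);
        if (previous != null) previous.unlink();
    }

    // Removes one name; the core closes only if this was its last name.
    void unalias(String name) {
        LinkedCore core = aliases.remove(name);
        if (core != null) core.unlink();
    }

    LinkedCore get(String name) { return aliases.get(name); }
}
```

In this sketch, re-pointing 'public' from an old core to a new one leaves the old core open for as long as it still has another name; it closes only when its last alias is removed.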

          Paul -

          But as things stand we are removing some commonly useful methods

          Which ones? Is it something that impacts solr-561 somehow?
          Btw, your example started by using the alias feature (not the alias command), creating a core with name 'version-3.0' and alias 'dev' (step 1, create('version-3.0,dev')); hence my difficulty in getting your previous point.

          Noble Paul added a comment - - edited

          Paul - I haven't looked at Henri's patch, but like Henri I also don't follow your logic. You give an example of using core alias

          My example uses SWAP. SWAP is indeed a useful feature and SWAP does not use ALIAS. The use case is this: I wish to start a core and ensure that it is initialized properly. If it is, I wish to swap it with another core.

          My concerns here:

          • We have added a feature called ALIAS
          • Its use cases are not very clear
          • Because of this feature some very useful methods are implemented inconsistently. As Yonik says, "core should be independent of how it is named". But as things stand we are removing some commonly useful methods

          OK. Now that we already have ALIAS as a feature, I propose the following behavior:

          • Let the getName() methods remain as is.
          • An ALIAS must not rename a core. It should just add another mapping in the core container. The only command that should change a core's name should be SWAP.
          • An ALIAS command must not succeed if the new name is already registered for another core. If a user wishes to reuse that name, they should first UNLOAD that core or, if the name is an alias, UNALIAS it.
          • The solr.xml <core> tag must keep the name as the primary name. We can add an extra attribute 'alias' which can take multiple names. This is already discussed in SOLR-350.
          • An UNALIAS command can be added. It can just remove an ALIAS if it exists, but it must not be able to remove the primary name (use UNLOAD to do that).
          • SolrQueryRequest should have a method to let handlers know through which alias the request was made.
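
The ALIAS/UNALIAS rules proposed above could look roughly like this. It is a minimal sketch, not the CoreContainer API: PrimaryNameContainer and its methods are hypothetical, a plain Object stands in for a SolrCore, and having UNLOAD also drop the aliases of the unloaded core is an assumption that the proposal does not spell out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical container following the proposed rules; not the actual CoreContainer.
class PrimaryNameContainer {
    // primary name -> core (a plain Object stands in for a SolrCore)
    private final Map<String, Object> cores = new HashMap<>();
    // alias -> primary name
    private final Map<String, String> aliases = new HashMap<>();

    boolean isRegistered(String name) {
        return cores.containsKey(name) || aliases.containsKey(name);
    }

    void load(String primaryName, Object core) {
        if (isRegistered(primaryName)) {
            throw new IllegalStateException("name already registered: " + primaryName);
        }
        cores.put(primaryName, core);
    }

    // ALIAS only adds a mapping; it never renames a core and fails if the name is taken.
    void alias(String primaryName, String alias) {
        if (!cores.containsKey(primaryName)) {
            throw new IllegalArgumentException("no such core: " + primaryName);
        }
        if (isRegistered(alias)) {
            throw new IllegalStateException("name already registered: " + alias);
        }
        aliases.put(alias, primaryName);
    }

    // UNALIAS removes an alias if it exists; a primary name is never removed here.
    boolean unalias(String alias) {
        return aliases.remove(alias) != null;
    }

    // UNLOAD removes the primary name and (assumption) every alias pointing to it.
    Object unload(String primaryName) {
        aliases.values().removeIf(primaryName::equals);
        return cores.remove(primaryName);
    }

    // Resolves either a primary name or an alias to its core.
    Object get(String name) {
        return cores.get(aliases.getOrDefault(name, name));
    }
}
```

Keeping the alias table separate from the primary-name table is what lets getName() stay stable in this sketch: adding or removing an alias never touches the core's own name.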
          Shalin Shekhar Mangar added a comment -

          It seems that calling Alias may lead to closing of the old core in the CoreContainer#register method. The old core is closed even if it is aliased to some other name. This is a very dangerous side-effect of Alias and must be remedied.

          Henri Biestro added a comment -

          Otis, this does not hold up releasing 1.3; I'm glad solr-724 got solved.
          I just hope it will be easier (& faster) to release 1.3.1.
          Shalin, I'll do my best about solr-723; but again, this should not hold up releasing 1.3.

          Yonik Seeley added a comment -

          According to me the alias feature is implemented in a very wrong way. Because of that some commonly used methods have no consistency: SolrCore#getName(), CoreDescriptor#getName(), etc.

          Another way to think about it is that a core should be independent of how it is named or accessed (via HTTP, etc)... it has no inherent name, but CoreContainer is a way of allowing access by name, and SolrDispatchFilter allows access by name over HTTP (via CoreContainer's names). So in this mental model, it's SolrCore#getName() and CoreDescriptor#getName() that don't make sense.

          Shalin Shekhar Mangar added a comment -

          Henri – Can you please attach the relevant parts of this patch to SOLR-723 to fix the JMX issues?

          Otis Gospodnetic added a comment -

          Paul - I haven't looked at Henri's patch, but like Henri I also don't follow your logic. You give an example of using core alias (using swap), but then say you don't see a use case for it. Could you please explain?

          Also, thank you Henri for pushing this forward. I haven't paid very close attention to all the new Core classes and couldn't tell you which one does what without studying the code and reading a pile of JIRA comments.

          Do you think this can wait to be committed until after 1.3 is out (i.e. no need to stop working on it, just don't commit so we don't delay 1.3 more)?

          Henri Biestro added a comment -

          This does not look like a useful way of using aliases. This is one extra step which could have been avoided.

          You used aliases in your own example. So I must have missed your point.

          We just blindly do a reload assuming everything is fine. So no testing.

          Your operational rules are different than those I'm constrained to.
          I'm merely trying to contribute back the solution to some problems I've encountered.

          the alias feature is implemented in a very wrong way

          Again, this is what this issue attempts to address.
          It's not intended to be confrontational.

          I am yet to see a valid usecase.

          Besides those already mentioned (& I guess Yonik may have more, since he introduced aliases when solving solr-647), there are plenty of other features that can come out of having a different path to use the same core: security, rendering, etc.

          Courtesy aside, I do respect functional needs of others & their implications although I don't understand all of them; I wish this was a community value.

          Noble Paul added a comment -

          What we do is after we make the necessary changes do a reload.

          We just blindly do a reload assuming everything is fine. So no testing.

          Correct, this is another useful way of using aliases to achieve the same.

          This does not look like a useful way of using aliases. This is one extra step which could have been avoided.

          According to me the alias feature is implemented in a very wrong way. Because of that some commonly used methods have no consistency: SolrCore#getName(), CoreDescriptor#getName(), etc.

          Moreover, I am yet to see a valid use case. I just wonder why it is there.

          Henri Biestro added a comment -

          What we do is after we make the necessary changes do a reload.

          You lost me here; reindexing implies you loaded the core already. Guess you mean that after reindexing, you can replicate & slaves only have to reload?

          is it not possible with the following steps?

          Correct, this is another useful way of using aliases to achieve the same.

          Noble Paul added a comment -

          When you make schema/indexing changes that necessitate reindexing (change stopwords, stemming, etc):

          We are already doing this in SOLR-561. What we do is after we make the necessary changes do a reload.

          I see your point of verifying the core.

          Why is it not possible with the following steps?

          1. create('version-3.0,dev')
          2. reindex the content
          3. verify your preferred queries do work appropriately
          4. swap('public', 'version-3.0')
          5. unload('version-3.0')
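
The swap-based steps above can be sketched as follows. This is a simplified illustration and not the CoreContainer implementation: SwapContainer is a hypothetical class, a plain Object stands in for a SolrCore, and the comma-separated 'name,alias' form of step 1 is reduced to a single name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical name -> core registry supporting create, swap and unload.
class SwapContainer {
    private final Map<String, Object> cores = new HashMap<>();

    void create(String name, Object core) {
        cores.put(name, core);
    }

    // Exchanges the cores registered under the two names; both names must exist.
    void swap(String a, String b) {
        Object coreA = cores.get(a);
        Object coreB = cores.get(b);
        if (coreA == null || coreB == null) {
            throw new IllegalArgumentException("both names must be registered");
        }
        cores.put(a, coreB);
        cores.put(b, coreA);
    }

    // Removes the name and returns the core that was registered under it.
    Object unload(String name) {
        return cores.remove(name);
    }

    Object get(String name) {
        return cores.get(name);
    }
}
```

After the swap, 'public' serves the freshly indexed core and 'version-3.0' refers to the old one, so the final unload in step 5 retires the old index rather than the new one.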
          Henri Biestro added a comment -

          About being inconsistent, this is what this issue attempts to solve.

          And, yes, aliasing is a useful feature: it allows one webapp path to stay constant for users (or links to persist) while the index behind it is changed without fuss when reindexing is needed (reload is only good enough for non-schema-related modifications).

          Say you have your core declared as:

          <core name='version-2.1,public' .../>
          

          Users refer to it through its public alias.
          When you make schema/indexing changes that necessitate reindexing (change stopwords, stemming, etc):

          1. create your new core as 'version-3.0,dev'
          2. reindex the content
          3. verify your preferred queries do work appropriately
          4. alias('public', 'version-3.0'); which changes the link to point to the new version and closes the old one (as soon as all queries have finished running)
          5. unalias('dev'); so you are ready for next version
          Noble Paul added a comment -

          I wish to know a few things.

          • Is anybody using the alias feature?
          • If yes, what is the use case?

          The alias feature implementation confuses me and the behavior seems to be very inconsistent.

          Henri Biestro added a comment -

          The patch solves the 3 issues this relates to.

          The SolrCore handles its name & makes JMX follow potential renames.
          CoreDescriptor does not expose more than necessary (& protects the possibility of SOLR-646 property expression features).
          CoreContainer is refactored to be a bit more efficient wrt cores locking & SolrCore alias/name handling.


            People

            • Assignee:
              Unassigned
              Reporter:
              Henri Biestro
            • Votes:
              0
              Watchers:
              3
