Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3
    • Fix Version/s: 1.3
    • Component/s: None
    • Labels:
      None

      Description

      In SOLR-215, we enabled support for more than one SolrCore - but there is no way to use them yet.

      We need an interface to manage, register, and modify the available SolrCores.

      1. solr-350.patch
        26 kB
        Henri Biestro
      2. solr-350.patch
        29 kB
        Henri Biestro
      3. solr-350.patch
        86 kB
        Henri Biestro
      4. solr-350.patch
        83 kB
        Henri Biestro
      5. solr-350.patch
        69 kB
        Ryan McKinley
      6. solr-350.patch
        67 kB
        Ryan McKinley
      7. solr-350.patch
        69 kB
        Ryan McKinley
      8. solr-350.patch
        61 kB
        Henri Biestro
      9. solr-350.patch
        67 kB
        Henri Biestro
      10. solr-350.patch
        61 kB
        Henri Biestro
      11. solr-350.patch
        56 kB
        Henri Biestro
      12. solr-350.patch
        54 kB
        Henri Biestro
      13. solr-350.patch
        5 kB
        Henri Biestro
      14. solr-350.patch
        1 kB
        Henri Biestro
      15. SOLR-350-jsp-fixes.patch
        8 kB
        Shalin Shekhar Mangar
      16. SOLR-350-jsp-fixes.patch
        4 kB
        Shalin Shekhar Mangar
      17. SOLR-350-MultiCore.patch
        38 kB
        Ryan McKinley
      18. SOLR-350-MultiCore.patch
        12 kB
        Ryan McKinley
      19. SOLR-350-MultiCore.patch
        38 kB
        Ryan McKinley
      20. SOLR-350-Naming.patch
        51 kB
        Ryan McKinley
      21. SOLR-350-Naming.patch
        40 kB
        Ryan McKinley
      22. solr-350-properties.patch
        38 kB
        Henri Biestro
      23. SOLR-350-RemoveStatic.patch
        6 kB
        Ryan McKinley

        Issue Links

          Activity

          Ryan McKinley added a comment -

          Here is a quick sketch of what I think the multicore management/interface should look like.

          Essentially, it works like this:

          A. If you do nothing, Solr keeps working as it is: it does a little extra checking at startup, and each request only makes an extra if( singlecore != null ) call.

          B. If you put a "multicore.xml" file in the startup instanceDir, a multicore registry will be initialized. Each call to the SolrDispatchFilter will select the core (from a synchronized map). Using the default core does not require a synchronized map lookup.

          In the attached patch, you select the core from the path:

          http://host:port/context/handlerpath – uses default core
          http://host:port/context/@core0/handlerpath – uses core0
          http://host:port/context/@core1/handlerpath – uses core1

          This assumes handler names will not start with '@' (perhaps we should make it a requirement that handler names don't start with any punctuation? this would leave open special characters in the future?)
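
          The path convention above could be sketched roughly as follows. This is a minimal illustration, not code from the attached patch: the CorePath class and its parse method are hypothetical names, and treating a malformed "/@" prefix as a plain handler path is an assumption.

```java
import java.util.Optional;

// Hypothetical helper (not from the patch): splits a request path into an
// optional core name and the handler path that remains.
final class CorePath {
    final Optional<String> coreName; // empty means "use the default core"
    final String handlerPath;

    private CorePath(Optional<String> coreName, String handlerPath) {
        this.coreName = coreName;
        this.handlerPath = handlerPath;
    }

    static CorePath parse(String path) {
        // Only paths of the form "/@name/..." select a named core.
        if (path.startsWith("/@")) {
            int slash = path.indexOf('/', 2);
            if (slash > 2) {
                return new CorePath(Optional.of(path.substring(2, slash)),
                                    path.substring(slash));
            }
        }
        // Assumption: anything else is handled by the default core unchanged.
        return new CorePath(Optional.empty(), path);
    }

    public static void main(String[] args) {
        for (String p : new String[] {"/select", "/@core0/select", "/@core1/update"}) {
            CorePath cp = parse(p);
            System.out.println(p + " -> core=" + cp.coreName.orElse("<default>")
                    + " handler=" + cp.handlerPath);
        }
    }
}
```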

          This still needs a servlet or request handler to manage core manipulation (load, restart, etc). Since it handles functions across handlers, it should probably be a servlet, but that makes it difficult to use the wt=json/xml stuff.

          NOTE – the core management stuff is untested, I'm attaching it now because I don't have much time to work on it and hopefully someone else can carry on.

          Parts of this patch clean up things from SOLR-215. Unless there is much movement on this issue, I'd like to commit that part in a few days.

          Yonik Seeley added a comment -

          I assume core management stuff needs to be persistent.... if you add a core via the REST api, and the server restarts, you want it to still be there. So should multicore.xml be changed and written back in this case?

          Ryan McKinley added a comment -

          Yes, persistence seems like a good option.

          For the case where you are updating a live schema it may not make sense though.
          tempCore = load new core
          defaultCore = tempCore
          (close old core when all requests have finished)
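
          One way to read the three steps above is as a reference-counted swap. The sketch below is only an illustration under that assumption: RefCountedCore and DefaultCoreHolder are hypothetical stand-ins and do not use the real SolrCore API.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for a core; the real SolrCore API is not assumed here.
final class RefCountedCore {
    final String label;
    private final AtomicInteger refs = new AtomicInteger(1); // 1 = held by the holder
    volatile boolean closed = false;

    RefCountedCore(String label) { this.label = label; }

    // Returns false if the core has already been fully released.
    boolean tryAcquire() {
        for (;;) {
            int n = refs.get();
            if (n == 0) return false;
            if (refs.compareAndSet(n, n + 1)) return true;
        }
    }

    // The last release (from a request or from the holder) closes the core.
    void release() {
        if (refs.decrementAndGet() == 0) closed = true;
    }
}

final class DefaultCoreHolder {
    private final AtomicReference<RefCountedCore> current;

    DefaultCoreHolder(RefCountedCore initial) {
        current = new AtomicReference<>(initial);
    }

    // Called once per request; retries if a swap raced with the lookup.
    RefCountedCore acquire() {
        for (;;) {
            RefCountedCore c = current.get();
            if (c.tryAcquire()) return c;
        }
    }

    // "defaultCore = tempCore": the old core closes once in-flight requests release it.
    void swap(RefCountedCore replacement) {
        RefCountedCore old = current.getAndSet(replacement);
        old.release();
    }

    public static void main(String[] args) {
        DefaultCoreHolder holder = new DefaultCoreHolder(new RefCountedCore("old"));
        RefCountedCore inFlight = holder.acquire();      // a request starts on the old core
        holder.swap(new RefCountedCore("new"));          // new core becomes the default
        System.out.println("old closed while in use: " + inFlight.closed);
        inFlight.release();                              // the request finishes
        System.out.println("old closed after release: " + inFlight.closed);
    }
}
```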

          Walter Ferrara added a comment -

          On my system (netbeans5.5/java1.6 on winxp), it seems to look for multicore.xml in 2 places: both solr/multicore.xml and solr/conf/multicore.xml. (Using the example dir, MultiCore looks for multicore.xml in solr/, while Dispatcher looks in solr/conf.)

          In MultiCore.java, a getCores() { return cores.keySet(); } method would make it possible to retrieve all the cores registered in the server.
          This would allow a handler, for example, to dynamically retrieve all the cores (at least by their names) currently registered (SOLR-215 had this).

          How will replication work with multiple cores? Will every core have a different bin dir (allowing different settings for each one), or will the replication binaries replicate all cores (making replication much easier)?

          Hope this patch gets committed soon.
          Have a nice day.

          Stu Hood added a comment -

          I feel like the suggested implementation is a re-imagining of the Tomcat Manager REST api (http://tomcat.apache.org/tomcat-6.0-doc/manager-howto.html). The main reason I like the idea of multiple cores in the same instance is to provide tighter integration between them: more like a conventional relational database, with multiple tables that have independent schemas (where Solr core == SQL table). Otherwise, having your servlet container managing the contexts just makes more sense, since that is what it is built for.

          Also, I think the core should be a parameter of the query, so that there is the possibility of querying multiple cores simultaneously. Having a top-level controller managing dispatch to the cores opens up all kinds of possibilities for future expansion, (such as joins between indexes?) and it would make things like federated search much more elegant. SOLR-303 already has a "shards" parameter with the same idea behind it: just prefix local cores with the @ symbol, and you are good to go.

          Loving the potential here!

          Ryan McKinley added a comment -

          Updated patch to work with trunk – in rev 578507, I added the core changes to trunk so this patch can focus on the multicore interface.

          Stu - I like the idea of looking to the existing API for guidance. That seems smart.

          Again, I'm not working on this actively, but want to make sure it is easy for someone to pick up.

          Ryan McKinley added a comment -

          more real example.

          This looks for a 'multicore.xml' in the instancedir and registers different cores if it is present...

          Doug Steigerwald added a comment -

          Any chance there's going to be support to view the admin interface for each core? Doesn't seem like it's possible currently.

          Also, the admin interface you do see is for the last core loaded and not the default core in the configuration.

          Henri Biestro added a comment -

          Ryan - Should solr-409 (aka class loader sharing) become a dependency of this issue and if so what kind of "link" should be used to refer to it?
          Or should I fold solr-409 in solr-350 (closing solr-409 in the process)? The "new" behavior does not break the current one.

          Ryan McKinley added a comment -

          For simplicity, I think adding 409 to 350 is a good idea. I have not looked at 409 yet, but I like Walter's suggestion to optionally have a single shared lib across all cores (rather than making each lib dir optionally shared)

          <multicore enabled="true" adminpath="/admin/multicore" persistent="true" sharedLibDir="lib">
          <core name="core0" instanceDir="core0" default="true"/>
          <core name="core1" instanceDir="core1" />
          </multicore>
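
          To show how a file shaped like the one above might be read, here is a small sketch using the standard JDK DOM parser. It is an assumption-laden illustration, not the patch's loader: MultiCoreConfigSketch and describe are hypothetical names, and the sample XML uses single quotes only to keep the Java string readable.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical reader for the proposed multicore.xml layout.
final class MultiCoreConfigSketch {

    // Returns the shared lib dir first, then one line per <core> element.
    static List<String> describe(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();
            List<String> out = new ArrayList<>();
            out.add("sharedLibDir=" + root.getAttribute("sharedLibDir"));
            NodeList cores = root.getElementsByTagName("core");
            for (int i = 0; i < cores.getLength(); i++) {
                Element core = (Element) cores.item(i);
                // A missing attribute comes back as "", which parses to false.
                boolean isDefault = Boolean.parseBoolean(core.getAttribute("default"));
                out.add(core.getAttribute("name") + " -> " + core.getAttribute("instanceDir")
                        + (isDefault ? " (default)" : ""));
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException("could not read multicore config", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<multicore enabled='true' adminpath='/admin/multicore'"
                + " persistent='true' sharedLibDir='lib'>"
                + "<core name='core0' instanceDir='core0' default='true'/>"
                + "<core name='core1' instanceDir='core1'/>"
                + "</multicore>";
        describe(xml).forEach(System.out::println);
    }
}
```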

          Henri Biestro added a comment -

          Walter's suggestion is already in solr-409 (with libDir attribute name).

          I could not verify everything and wanted to be safe so I loaded an updated version of solr-350_409.patch in solr-409.
          There are some improvements in the admin webapp that is now multi core aware. (ie: you can switch from core to core).
          I also made a small change in Config.java; locateInstanceDir seems to look for solr.solr.home as an environment variable.

          I've quickly checked the deployment against the example starting with: java -Dsolr.home=`pwd`/multicore -jar start.jar .
          As soon as I'm more confident, I'll push the patch over solr-350.

          Henri Biestro added a comment -

          core can be set as a request parameter ( ?core=corename versus /@corename)

          Henri Biestro added a comment -

          use a request attribute to pass the core in all pages

          Ryan McKinley added a comment -

          patch to get rid of the @corename syntax and force things into /corename/handler

          Adds 'RENAME' action to rename a core –

          see:
          http://www.nabble.com/purpose-of-MultiCore--22default-22---to14268755.html
          http://www.nabble.com/multicore-and-admin-pages--to14268867.html

          This also got rid of the ?core=name syntax for /admin, and makes it work for:
          /corename/admin/xxx.jsp

          Henri Biestro added a comment -

          updated to implement 'alias' (should be considered draft since there aren't specific unit tests associated yet);
          implements persistence (added an XmlWriter that might be revisited); all operations that modify the multicore state will rewrite the multicore.xml

          Alias feature:
          0 - Names and aliases reside in a common identifier space; one identifier uniquely determines a core (can't have the identifier 'core' used as a name to point to coreA and as an alias to point to coreB)
          1 - One core has one unique immutable name (rename command has been neutralized)
          2 - One core may have many aliases
          3 - There are only 2 admin commands related to aliases:
          3.1 - alias(core, alias): adds an alias to a core, overriding any existing alias but fails to override a core name.
          3.2 - unalias(str): if str is a core name identifier, all its aliases get deleted; if str is an alias identifier, only that alias gets deleted.
          4 - Core addressing through URLs/API can use either name or alias (although using alias is best practice for common aka non-admin operations)
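
          The alias rules listed above can be sketched as a small in-memory registry. This is only an illustration of rules 0 through 4, not the patch's code: AliasRegistry and its method names are hypothetical, and requiring the first argument of alias() to be a registered core name is an assumption.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical registry illustrating the alias rules; not taken from the patch.
final class AliasRegistry {
    private final Set<String> names = new HashSet<>();            // immutable core names
    private final Map<String, String> aliases = new HashMap<>();  // alias -> core name

    // Rule 0: names and aliases share one identifier space.
    void register(String name) {
        if (names.contains(name) || aliases.containsKey(name)) {
            throw new IllegalArgumentException("identifier already in use: " + name);
        }
        names.add(name);
    }

    // Rule 3.1: overrides an existing alias, but never shadows a core name.
    boolean alias(String coreName, String alias) {
        if (!names.contains(coreName) || names.contains(alias)) return false;
        aliases.put(alias, coreName);
        return true;
    }

    // Rule 3.2: a core name drops all of its aliases; an alias drops only itself.
    void unalias(String id) {
        if (names.contains(id)) {
            aliases.values().removeIf(id::equals);
        } else {
            aliases.remove(id);
        }
    }

    // Rule 4: either a name or an alias addresses a core; null if unknown.
    String resolve(String id) {
        return names.contains(id) ? id : aliases.get(id);
    }

    public static void main(String[] args) {
        AliasRegistry r = new AliasRegistry();
        r.register("coreA");
        r.register("coreB");
        r.alias("coreA", "main");
        r.alias("coreB", "main"); // re-points "main" without an unalias step
        System.out.println("main resolves to " + r.resolve("main"));
    }
}
```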

          Henri Biestro added a comment -

          simplified code; prepared for createCore

          Ryan McKinley added a comment -

          I have not looked at the recent patches yet... but I'm still wondering if there is any value to "alias" if we have a SWAP command?

          http://www.nabble.com/purpose-of-MultiCore--22default-22---to14268755.html#a14427376

          Aliasing makes me nervous about maintaining both a unique ID and a name - it seems to just lead to a management/clarity problem.

          Show
          Ryan McKinley added a comment - I have not looked at the recent patches yet... but I'm still wondering if there is any value to "alias" if we have a SWAP command? http://www.nabble.com/purpose-of-MultiCore--22default-22---to14268755.html#a14427376 Aliasing has me nervous about the maintining a unique ID and a name - it seems to just lead to a management/clarity problem.
          Hide
          Henri Biestro added a comment -

          backup of my local state: added untested code (create core dynamically), still some thinking to be done (using CoreDescriptor as vehicle for multicore serialization in&out?)

          As for the added complexity versus the swap command, I believe the potential functional benefits make it worth it.
          Using the URL (and not parameters) to carry information is good practice and seems like an appropriate rationale; for instance, using the 'alias' through the URL to map query behaviors (be it, filtered queries, query parsers, etc) would open to easy ways to fit per-user/usage profiles behaviors.

          And I think we can be informative enough on misconfiguration so users know exactly where the error sits.

          Ryan McKinley added a comment - - edited

          > (be it, filtered queries, query parsers, etc) would open to easy ways to fit per-user/usage profiles behaviors.
          >

          Are you saying there is a big win if you can get stats on:
          http://host/henri/select vs http://host/ryan/select
          when 'henri' and 'ryan' are both aliased to 'core1'? Perhaps? but mod_rewrite can do that and much much more (if you really wanted to).

          With the alias model, how would you reindex a running core and end up with an identical setup at the end? Unless I'm missing something, the new core would need a different name (id), and there would be a brief moment where the main core was not available.

          consider:
          <core name="core0" alias="main" ... />

          and all queries come to solr as:
          http://host/solr/main/...

          I would have to run:
          1. LOAD core1 using same config as core0
          2. send add commands to core1
          3. UNALIAS "main" from core0
          (now nothing is available at /main)
          4. ALIAS "main" to core1
          5. UNLOAD core0
          (now the persisted configuration is different than when we started, but should not be)

          Henri Biestro added a comment -

          If we are making a new index - a new index version-, it can mean the schema and the config can change; I may change my analysis chain or schema but also warming queries, cache set up, etc. The config is thus not necessarily the same.
          I may also want to have the new setup tested by a group of users before I make it available to the whole population; http://host/production versus http://host/stage. I might even have automated tests that verify that some queries do return some expected documents.
          If we were to use the 'alias' to map behaviors, it seems more convenient to declare those within Solr than anywhere else; describing that http://host/ryan/select queries on core main with an automated fq author='ryan' should not force mod-rewrite usage imho.
          Finally, the 'alias' command as it stands allows redefining an alias (without having to unalias first), so the sequence would be:
          (considering <core name="core,0" alias="main" ... />)
          LOAD core,1 // which could even be aliased as 'stage' at this time
          send adds to core,1 // when done, could run verifications on 'stage'
          ALIAS core,1 main // 'swap' so to speak, overwrites previous 'main' alias
          UNLOAD core,0

          Ryan McKinley added a comment -

          > If we were to use the 'alias' to map behaviors,

          How would an alias map different behaviors? Aliases just offer multiple ways to access the same core and the same behavior. RequestHandlers don't know what path requested them.

          My point about mod_rewrite was referring to the use case you referred to: making the log files easier to parse per user.

          Re production and stage, why do you need aliasing for that? Each core has a name - when 'stage' is ready – it can swap with 'production'
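
          A name-based swap of that kind might look like the sketch below: two registered names exchange the cores they point at in one synchronized step, so neither name is ever left unmapped. SwapRegistry is a hypothetical class for illustration only and is not the SWAP implementation from any attached patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical name -> core registry with an atomic swap; C stands in for a core type.
final class SwapRegistry<C> {
    private final Map<String, C> cores = new HashMap<>();

    synchronized void register(String name, C core) {
        cores.put(name, core);
    }

    synchronized C get(String name) {
        return cores.get(name);
    }

    // Exchanges the cores behind two names; both names stay registered throughout.
    synchronized void swap(String a, String b) {
        C coreA = cores.get(a);
        C coreB = cores.get(b);
        if (coreA == null || coreB == null) {
            throw new IllegalArgumentException("unknown core: " + (coreA == null ? a : b));
        }
        cores.put(a, coreB);
        cores.put(b, coreA);
    }

    public static void main(String[] args) {
        SwapRegistry<String> registry = new SwapRegistry<>();
        registry.register("production", "index built last week");
        registry.register("stage", "freshly rebuilt index");
        registry.swap("stage", "production");
        System.out.println("production now serves: " + registry.get("production"));
    }
}
```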

          > Finally, the 'alias' command as it stands allows redefining an alias (without having to unalias first), so the sequence would be:
          > (considering <core name="core,0" alias="main" ... />)
          > LOAD core,1 // which could even be aliased as 'stage' at this time
          > send adds to core,1 // when done, could run verifications on 'stage'
          > ALIAS core,1 main // 'swap' so to speak, overwrites previous 'main' alias
          > UNLOAD core,0
          >

          so if you serialize at the beginning, you have:
          <core name="core,0" alias="main" ... />
          at the end you have:
          <core name="core,1" alias="main" ... />

          If you run that every hour, do you end up with "core,1000" or switch between them? This would require you to ask MultiCore what the 'id' is for the core sitting at 'main' before you can operate on it. Why add this complexity?

          Henri Biestro added a comment -

          RequestHandlers do not today know the path that requested them; I was merely proposing a possible functional extension through usage of aliases.
          As for core names, being able to carry which version/revision of the config/schema is in use is imho not complex and useful to many (using svn/cvs/webdav to store config/schema)
          Anyway, the 'aliases' idea is definitely not something you did find useful enough from the beginning and I'm obviously failing to make the case for it. Alas.

          Ryan McKinley added a comment -

          > RequestHandlers do not today know the path that requested them;

          aaah – so if we need it later, we could add aliasing then?

          > is imho not complex and useful to many (using svn/cvs/webdav to store config/schema)

          How does aliasing change this? What can you do that you could not do without it? I store my config/schema in svn and don't have any problems.

          > Anyway, the 'aliases' idea is definitely not something you did find useful enough from the beginning

          If I understood what you gain, I could be convinced. Right now I just see it as the need to manage and maintain multiple names+one immutable name without any reason.

          Perhaps we can move forward without aliasing, and add it later if we find (and implement) a solid use case for it.

          Ryan McKinley added a comment -

          Here is a patch that cleans up some naming and implements the SWAP command.

          It does not include the persistence stuff in the latest solr-350.patch

          Henri - how do you feel about committing this, then implementing persistence in a smaller patch?

          Henri Biestro added a comment -

          SWAP is an important feature for exploiting multicore, and persistence is not production-ready yet, so committing feels like the next logical step.
          Ryan, if possible, I'd appreciate and would greatly benefit from a quick/early review of the solr-315.patch persistence & core creation code (XmlWriter, CoreDescriptor; keep them or lose them?).

          As an upside on the ALIAS discussion, if & when a use case shows up, I guess we will be ready!

          Ryan McKinley added a comment -

          just committed SOLR-350-Naming.patch

          >
          > Ryan, if possible, I'd appreciate and would greatly benefit from a quick/early review of the solr-315.patch persistence & core creation code (XmlWriter, CoreDescriptor; keep them or lose them?).
          >

          I gave it a quick look this morning, but did not look too closely because of all the 'alias' stuff.

          XmlWriter and CoreDescriptor seem reasonable to me. The CoreDescriptor could be used to move both Config and Schema away from knowing what file opened them. Check SOLR-427

          Henri Biestro added a comment -

          On aliases, for completeness: I had this "nagging" thought I was missing something...
          Re-reading Hoss's proposal and crossing that with the 1000 unique names point you made, there are in any case 1000 unique 'instanceDir' values that need to be provided; Hoss proposed to use the 'instanceDir' instead of a name and alias that, if I'm not mistaken.
          I got sidetracked by the fact that the instanceDir could be absolute, which would have introduced a 'hard' dependency on the deployment host and lost the equivalence.
          If we define an 'instanceRoot' (at the multicore level or at the core level) and make the (core) instanceDir = instanceRoot + '/' + name, the uniqueness of the core name would be put to its initial intended use (instead of just being a by-product of the alias feature). In that case, at least one alias is convenient so we can keep the 'url' constant across index revisions.
          For instance, if you are using svn, you could have your instanceDir/{schema, conf} versioned; when you have a new revision ready to go, you copy these over using the instanceDir+","+revision-number and use that as a name (which isn't too bad of a convention).
          And then, there are maybe future features that could use aliases for other purposes...
          Oh well...

          Henri Biestro added a comment -

          updated for trunk 611834;
          improved code related to configuration wrt absolute/relative locations: allows core dataDir/instanceDir to be absolute or relative to multicore (pseudo) instanceDir/dataDir.
          added a 'dataDir' attribute at the multicore.xml level so that all core data directories can be made relative to it (when they are not absolute).
          After much consideration, added CoreDescriptor/XmlWriter classes; the former describes cores (making it easier to manage/persist cores and eventually extend behavior - variables...), the latter is an (overkill) way to persist XML (a la java6 XmlWriter).

          Ryan McKinley added a comment -

          Hi Henri-

          We're getting there.... but I had trouble applying this patch, can you post a new one with a few changes?

          1. Can you change your editor settings to use two spaces rather than tabs? In general, Solr code should use two spaces rather than tabs or 4 spaces.

          2. To avoid confusion with o.a.s.request.XMLWriter, can we call XmlWriter something else? XmlWriterHelper? XmlWriterUtils?

          3. Can we make XmlWriter a package protected class in o.a.s.core? This way we don't have to make it part of the public API. If there is a need for it later, we can easily move it. Also, if it can be replaced with an off the shelf library, we can do that later without mucking anyone up.

          Thanks for your work and patience with this!

          Henri Biestro added a comment -

          changed persistence to use o.a.s.common.util.XML (removed XmlWriter);
          updated multicore params/solrj to reflect full set of core creation parameters;
          modified multicore tests to use a clean copy of multicore.xml (multicore-base.xml) before running and made dataDir point to ${CWD}/solr-350 to avoid environment pollution;

          patch produced on Solaris 10 by:
          svn diff --diff-cmd /usr/local/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-350.patch
          can be applied with:
          /usr/local/bin/patch -u -p 0 < ~/solr-350.patch

          Ryan McKinley added a comment -

          Looking good. I took your patch and removed all the 'default' stuff to make it in line with Hoss' observations in:
          http://www.nabble.com/Re%3A-purpose-of-MultiCore-%22default%22---p14591921.html

          This adds the dispatcher settings to multicore.xml

          <multicore adminPath="/admin/multicore" dataDir="alldata" persistent="true" >
            <abortOnConfigurationError>true</abortOnConfigurationError>
            <requestDispatcher handleSelect="true" >
              <requestParsers enableRemoteStreaming="false" multipartUploadLimitInKB="2048" />
            </requestDispatcher>
          
            <core name="core0" instanceDir="core0" default="true"/>
            <core name="core1" instanceDir="core1"/>
          </multicore>
          

          The one thing we need to change before committing is how the tests work with multicore-base.xml and multicore.xml – maybe the 'clean' copy should live in the test files and get copied over on a shutdown hook? We want to make sure everything in the /examples directory helps people understand how things work.

          Hoss Man added a comment -

          hey guys ... i'm catching up on some Jira reading and this comment jumped out at me...

          improved code related to configuration wrt absolute/relative locations: allows core dataDir/instanceDir to be absolute or relative to multicore (pseudo) instanceDir/dataDir.

          (i'm guessing that's what the dataDir option in the <multicore/> tag in Ryan's example is for?)

          this seems like a bad idea to me ... violating the principle of least surprise and all. it will make the behavior of a solrconfig.xml file dependent on whether or not it's being used in a multicore context.

          I'd like to suggest that an alternate approach would be to generalize the current system-property-based variable substitution to support arbitrary key=val pairs specified when the SolrCore is constructed...

          • we add new syntax to multicore.xml for declaring "global properties"
          • MultiCore converts these global declarations into a SolrParams instance
          • we also add syntax to multicore.xml for declaring properties specific to a core.
          • when MultiCore instantiates a core, it uses DefaultSolrParams to let the specific properties override the global properties and to set a special property containing the name of the core (ie: "solr.core.name")
          • if cloning a core is possible (i can't remember) MultiCore would reuse the SolrParams from the source core, changing only the core name property (solr.core.name)
          • system properties with the same names as properties in multicore.xml would trump anything from the configs (since they are run-time overrides)

          <multicore adminPath="/admin/multicore" persistent="true" >
            <abortOnConfigurationError>true</abortOnConfigurationError>
            <requestDispatcher handleSelect="true" >
              <requestParsers enableRemoteStreaming="false" multipartUploadLimitInKB="2048" />
            </requestDispatcher>
            <property name="alldata.dir">/my/solr/basedir</property>
            <property name="magicnumber">32</property>

            <!-- core0 gets props above, any other props in its configs must come from system props -->
            <core name="core0" instanceDir="core0" />
            <core name="core1" instanceDir="core1">
               <property name="dataDir">foo</property>
            </core>
            <core name="core111" instanceDir="core1"><!-- note same instanceDir as above-->
               <!-- can reuse exact same instance dir as another core; ${solr.core.name} will be different -->
               <property name="dataDir">bar</property>
               <!-- and now ${dataDir} will be different too -->
            </core>
          </multicore>
          

          This would not only give us the ability to have a common ${alldata.dir} for all cores, but also an easy way to reuse the same solrconfig.xml for multiple cores and still get subtle changes in behavior – all while making it transparent what any one solrconfig.xml will do.

          Super powerful – and (i think) pretty easy to implement... a new optional SolrParams arg to the SolrCore, SolrConfig, and Config constructors, and DOMUtil.substituteSystemProperties plus some code in MultiCore to create the SolrParams (hmm, DOMUtil doesn't have a very friendly method for that yet, not that big a deal though)

          what do you think?
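A rough Java sketch of the precedence described in the bullets above, for illustration only. It is not taken from any patch on this issue: the class and method names (`LayeredProperties`, `merge`, `substitute`) are hypothetical, plain maps stand in for SolrParams, and writing the core name last so that nothing can override it is an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Solr) illustrating layered ${...} substitution. */
final class LayeredProperties {
  // Matches ${some.property.name}
  private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

  private LayeredProperties() {}

  /**
   * Builds the effective properties for one core.
   * Precedence, lowest to highest: global, core-specific, system properties.
   * Assumption: "solr.core.name" is written last so it always reflects the real core name.
   */
  static Map<String, String> merge(String coreName,
                                   Map<String, String> global,
                                   Map<String, String> core,
                                   Map<String, String> system) {
    Map<String, String> merged = new HashMap<>(global);
    merged.putAll(core);    // core-specific values override global ones
    merged.putAll(system);  // run-time overrides trump anything from the configs
    merged.put("solr.core.name", coreName);
    return merged;
  }

  /** Replaces every ${name} in text; an unknown name is reported as an error. */
  static String substitute(String text, Map<String, String> props) {
    Matcher m = VAR.matcher(text);
    StringBuffer out = new StringBuffer();
    while (m.find()) {
      String value = props.get(m.group(1));
      if (value == null) {
        throw new IllegalArgumentException("No value for property: " + m.group(1));
      }
      // quoteReplacement keeps '$' and '\' in the value from being interpreted
      m.appendReplacement(out, Matcher.quoteReplacement(value));
    }
    m.appendTail(out);
    return out.toString();
  }
}
```

With a global `alldata.dir` of `/my/solr/basedir` and a core-level `dataDir` of `foo`, substituting `${alldata.dir}/${dataDir}` for core1 would give `/my/solr/basedir/foo`; a config line with no variables is returned unchanged, which is the transparency argument in a nutshell.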

          Hoss Man added a comment -

          Actually, one more unrelated comment...

          Looking good. I took your patch and removed all the 'default' stuff to make it in line with Hoss' observations in:

          ...i know i suggested moving anything related to the entire solr server into multicore.xml, but i've been looking at SolrDispatchFilter lately because of SOLR-127 and i'm starting to wonder if the <requestDispatcher/> config options really need to be webapp wide.

          They are (currently) only used to construct a protected instance of SolrRequestParsers in SolrDispatchFilter.init, but that SolrRequestParsers is only needed in the doFilter method once we've already figured out what core we're using ... it's a fairly light weight class, so why not construct a new one in each call to doFilter (after we've determined the correct core) and leave those options core specific?

          (not to mention the HTTP caching options SOLR-127 is probably going to add to <requestDispatcher/>)

          ...

          And while i'm thinking about it ... what does abortOnConfigurationError=true mean in a multicore world when someone attempts to dynamically load a core with a config error?

          Currently SolrDispatchFilter only looks at that setting on init ... is MultiCore going to start checking it after each LOAD core action? will it cause the whole server to stop accepting requests or just do something special for that one core?

          Henri Biestro added a comment -

          Regarding introducing variables, this is tempting but this looks like a rather large feature for a rather limited need. Plus it could be argued that it increases the element of surprise or at least the potential for side effects.
          If a solrconfig/schema refers to a variable that can be superseded in a multicore.xml, the behavior of a core is explicitly dependent on whether or not it is loaded in a multicore configuration. I agree that being explicit rather than implicit is better, but this nevertheless modifies behavior even more deeply.

          The door that variable introduction would open seems much wider than the functional hole is; the original "breach" was needed for the shared class loader, and a common dataDir root addresses the good practice of segregating data from configuration. We could introduce a configDir/schemaDir at the multicore level to address config/schema sharing - although using multiple cores usually involves different configs/schemas, so reusing/sharing them does not look like a must-have feature.

          The multicore dataDir attribute is a default directory/root that can be overridden by core definitions; the current convention is really limited in its effects to what's needed. Variables and the huge functional potential of a whole environment defined within Solr seem way beyond the current use-cases; if we follow the precedent of "alias vs swap", we should retain the idea but wait till more needs emerge before implementing it, shouldn't we?

          Hoss Man added a comment -

          I agree generalizing variables is somewhat significant, and a larger scope than just what's being talked about here – perhaps that's part of the disconnect ... I'm taking it as a given that it's a problem that needs to be solved before multicores can really be useful – so if we have to solve that problem, and that solution can also solve the common dataDir problem, let's not have an alternate solution to the dataDir problem that is "non transparent" to people reading the configs.

          (my assumption being based on the impression that we can't really support a lot of the use cases people have talked about without having at a minimum a way to use the "name" of the current core as a variable in the configs – postCommit hooks being one example of a place where this info will be crucial)

          In a nutshell: if we know we are going to need variables, then instead of introducing a new <multicore dataDir="..."> option now (which if used changes the meaning of the <dataDir/>) let's solve the broader problem of passing arbitrary variables to a SolrCore. we can still commit all of the other stuff you guys have been working on, let's just set the dataDir issue aside until we add the variable support.

          BUT!!! part of your comment has me worried that i'm misunderstanding how <multicore dataDir="..."> works, you just said...

          The multicore dataDir attribute is a default directory/root that can be overridden by core definitions

          ...how can it be overridden? My understanding based on your earlier comment was that <multicore dataDir="..."> was the directory that the <dataDir>...</dataDir> options in each solrconfig.xml would be relative to ...do you mean that in the multicore.xml file, each <core/> can have a dataDir option? ... if so that doesn't really solve the concern I have: people should be able to read a solrconfig.xml and understand when there are outside influences on that config...

          Plus it could be argued that it increases the element of surprise or at least the potential for side effects.

          If a solrconfig/schema refers to a variable that can be superseded in a multicore.xml, the behavior of a core is explicitly dependent on whether or not it is loaded in a multicore configuration. I agree that being explicit rather than implicit is better, but this nevertheless modifies behavior even more deeply.

          I disagree ... it's true that using a "variables" approach the evaluation of a solrconfig.xml would be dependent on the environment it's run in (ie: is there a multicore.xml? are variables set in it? are any system properties set?) but the evaluation of solrconfig.xml is already dependent on its environment (ie: what is the solr home? are any system properties set?) ... my point is that when a human is reading a config with variables in it, it is crystal clear that there is an environmental factor that will affect the behavior. If a person reads a solrconfig.xml that contains this line...

          <dataDir>${my.special.dir}/data</dataDir>
          

          ...then it's very obvious that the location of the data will depend on the environment the core is run in (in which "my.special.dir" must be set, either as a system property or as a multicore.xml variable – the point being it's a known external factor). The approach you guys have been talking about though (assuming i'm understanding it correctly) would take away that transparency – people could look at a solrconfig.xml that looks like this...

          <dataDir>data</dataDir>

          ...and that could mean anything depending on whether or not this solrconfig.xml is running in a multicore setup.

          Henri Biestro added a comment -

          I'm confused and dont see the dataDir element parsing you are referring to in solrconfig.xml; my current understanding is that the dataDir is deduced from the instance dir if not specified explicitly at core construction time. Are you proposing to add it (and/or instanceDir) to solrconfig.xml?

          Anyway, the current patch code allows both dataDir & instanceDir to be specified as multicore & core attributes (and everything related to file/directory locations is contained within multicore.xml); it treats absolute directory specifications (ie starting with '/') as such, core specification having precedence over multicore.
          If the core specified instanceDir is absolute, it is used as is and the dataDir is made relative to it if not absolute.
          Otherwise, the instanceDir is relative to the multicore instanceDir; If the core specified dataDir is absolute, it is used as such otherwise the core dataDir is relative to the multicore dataDir.
          When left unspecified, everything behaves relative to the multicore implied instanceDir or as current defaults.
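
The resolution rules above can be sketched roughly as follows. This is only an illustration, not code from the patch: the class and method names are hypothetical, "absolute" is taken to mean "starts with '/'" as stated above, and the "left unspecified" case is not covered.

```java
/** Hypothetical helper (not from the patch) that resolves core directories as described above. */
final class CoreDirResolver {
    private CoreDirResolver() {}

    /** A path is treated as absolute when it starts with '/'. */
    static boolean isAbsolute(String path) {
        return path.startsWith("/");
    }

    /** Joins base and child with exactly one '/' between them. */
    static String join(String base, String child) {
        return base.endsWith("/") ? base + child : base + "/" + child;
    }

    /** An absolute core instanceDir is used as is; otherwise it is relative to the multicore instanceDir. */
    static String resolveInstanceDir(String multicoreInstanceDir, String coreInstanceDir) {
        return isAbsolute(coreInstanceDir)
                ? coreInstanceDir
                : join(multicoreInstanceDir, coreInstanceDir);
    }

    /**
     * An absolute core dataDir is used as is. A relative one hangs off the core
     * instanceDir when that is absolute, and off the multicore dataDir otherwise.
     */
    static String resolveDataDir(String multicoreDataDir, String coreInstanceDir, String coreDataDir) {
        if (isAbsolute(coreDataDir)) {
            return coreDataDir;
        }
        if (isAbsolute(coreInstanceDir)) {
            return join(coreInstanceDir, coreDataDir);
        }
        return join(multicoreDataDir, coreDataDir);
    }
}
```

With a multicore dataDir of /var/solr/data, a core declared with instanceDir="books" and dataDir="idx" would end up with /var/solr/data/idx, while instanceDir="/srv/books" with the same dataDir would give /srv/books/idx.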

          If you still find this is a bad solution, I'm confident you & Ryan will agree on the good one; just let me know, I'll (try to) code it (if you want).

          Ryan McKinley added a comment -

          Apologies for the lack of input, I've been too sick to follow this thread.

          Henri - the <dataDir> element is in solrconfig.xml – check the example config, it lists:

           <dataDir>${solr.data.dir:./solr/data}</dataDir>
          

          I'm a bit torn on the proper direction from here - the flexibility of setting the dataDir from multicore.xml is really nice; it makes it really easy to share all the same configs but change the data directory. However, if the dataDir is set in multicore.xml, what about the existing <dataDir> within solrconfig.xml?

          The properties/variables solution seems interesting, but more than I think we need to take on right now.

          I'll post an updated patch that removes all dataDir configuration and then we can work from there.

          Ryan McKinley added a comment -

          Updated patch that removes dataDir configuration. This also puts the requestParser configuration back within each core. Creating a new RequestParser is not a lightweight operation, so creating one for each request does not seem like a good idea. Instead, the patch keeps them in a WeakHashMap<SolrCore,RequestParser>.
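
A minimal sketch of that kind of per-core cache, assuming nothing about the actual patch code: the class name PerCoreCache and the factory parameter are hypothetical, with K standing in for SolrCore and V for the request parser.

```java
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

/** Hypothetical per-core cache: builds the value once per key instead of once per request. */
final class PerCoreCache<K, V> {
    // Weak keys: an entry can be dropped once its core is no longer strongly referenced elsewhere.
    private final Map<K, V> cache = new WeakHashMap<K, V>();
    private final Function<K, V> factory;

    PerCoreCache(Function<K, V> factory) {
        this.factory = factory;
    }

    /** Returns the cached value for the core, creating it on first use. */
    synchronized V get(K core) {
        V value = cache.get(core);
        if (value == null) {
            value = factory.apply(core);
            cache.put(core, value);
        }
        return value;
    }
}
```

One caveat with WeakHashMap: if the cached value holds a strong reference back to its key, the entry is never collected, so the value should not keep the core alive on its own.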

          Hoss Man added a comment -

          just to clarify, i still haven't looked at the patch closely (I trust Ryan/Henri's judgment on the bulk of the multicore implementation ... i mainly just want to sanity check the concepts and configs) ... but I have just a few follow up questions/clarifications about some of the issues i mentioned before...

          a) by "requestParser configuration back within each core" you mean all of the <requestDispatcher> configuration, correct? (currently requestParser and handleSelect ... likely to be httpCaching as well) i mainly just want to be sure that moving forward we think it makes sense for each solrconfig.xml to have its own <requestDispatcher> section containing info on how the SolrDispatchFilter should deal with requests for the core using that config.

          b) (constructing a) SolrRequestParsers instance seems pretty lightweight to me ... is there anything specific you're worried about that i'm not noticing?

          c) should i open a separate issue for dealing with generalizing variables (and note that corename and dataDir are two prime use cases) ? it seems like that can definitely be dealt with after the bulk of the stuff in this issue is committed.

          d) anyone have any thoughts regarding my question about "abortOnConfigurationError" and what it should mean when dealing with dynamically loaded cores (i'm pretty sure right now it's ignored for any dynamically loaded cores ... i'm just wondering if that's what we want it to do)

          Henri Biestro added a comment -

          Regarding c/variables/properties, imho we can definitely tackle the bulk of it in here, no need for another issue yet.

          On that topic, one small nag regarding multicore.xml serialization: do we want multicore.xml serialization to retain expressions if any (ie serialize them back as expressions) or not? Seems like it would be convenient to be able to distribute the same multicore.xml across several hosts - which may have different envs.
          As of now, we do expand all expressions before parsing resource files; if multicore.xml uses expressions based on environment variables, these will be expanded before we even have a chance to see them which precludes being able to write them back.
          Since we will have to serialize variables in multicore.xml, one workaround would be for users to declare local variables for each env based expressions (as multicore "global" properties) and only use those locals (keeping those definitions before expansion that is). Parsing multicore.xml would make one pass before expansion to extract the 'multicore/property' & 'core/property' raw expressions, then expand the whole.
          (implementation/self note: MultiCore & CoreDescriptor need to be able to define/serialize properties).

          Would this be ok / needed? Thoughts ?
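
To make the "keep the raw definitions, expand later" idea concrete, here is a rough sketch. It is not patch code: the class RawProperties and its methods are hypothetical, the ${name:default} syntax is the one used by the example solrconfig.xml, and properties that refer to other properties are not handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical property holder that keeps raw expressions so they can be serialized back unchanged. */
final class RawProperties {
    // Matches ${name} and ${name:default}.
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}:]+)(?::([^}]*))?\\}");

    private final Map<String, String> raw = new LinkedHashMap<String, String>();

    /** Records the expression exactly as written in the file (first pass, before expansion). */
    void define(String name, String rawExpression) {
        raw.put(name, rawExpression);
    }

    /** What would be written back on serialization: the unexpanded expression. */
    String rawValue(String name) {
        return raw.get(name);
    }

    /** The value used at runtime, expanded against the given environment. */
    String expandedValue(String name, Map<String, String> env) {
        return expand(raw.get(name), env);
    }

    /** Replaces each ${name:default} with env[name], else the default, else leaves it untouched. */
    static String expand(String text, Map<String, String> env) {
        Matcher m = VAR.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = env.get(m.group(1));
            if (value == null) {
                value = m.group(2); // default after ':' (null when absent)
            }
            if (value == null) {
                value = m.group(0); // unknown variable without default: keep as written
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Under this scheme the same file could be shipped to several hosts: each host expands against its own environment, while writing the file back emits rawValue and so preserves the expressions.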

          Ryan McKinley added a comment -

          > Regarding c/variables/properties, imho we can definitely tackle the bulk of it in here, no need for another issue yet.
          >

          I think we should try to wrap up things without properties, then open a new issue for them. They are functionally different enough. As a note, I'm using this multicore patch with system variables for the data path in each solrconfig.xml – this gives the same behavior you were looking for: <dataDir>${solr.data}/corename/</dataDir>

          In my view the one thing we need to fix before getting this patch committed is the returning of results for unloaded cores...
          http://www.nabble.com/Multicore---Querying-unloaded-core-returns-results-from-default-td15469303.html

          Henri Biestro added a comment -

          We need a way to define a global data root without having to define a system env variable; can't we at least reintroduce the dataDir as a multicore attribute?
          The previous patch version went too far and was ignoring the solrconfig.xml dataDir specification, but having no way to describe where all data go easily is really too inconvenient.
          Can't we find something acceptable in between ?
          Strawman solution would be, if dataDir is not specified in solrconfig.xml, use the previous patch code ?
          Hopefully more acceptable: only provide a minimum set of variables with no possibility to define any for now? The env would only contain 'solr.multicore.{home,data}' and, for each core, 'solr.multicore.core.instance' (I'm reluctant to expose 'solr.multicore.core.name'; explanation follows...)

          This would not preclude extending variables later and would not delay solr-350 by much now.

          We used <dataDir>${solr.data}/corename/</dataDir> to illustrate the variable solution, but I am growing uneasy seeing the core name as a variable part of a path (explicit or implicit): if we issue a SWAP command, how do we end up in a proper state when we stop/start the container without swapping the directory contents as well?

          My rationale is that the instanceDir is really what physically identifies a core in a persistent manner wrt SWAP/stop/start; when we specify a data root, the data directory should somehow depend on the instanceDir as well.
          For instance, with <core name="books" instanceDir="books,0" .../> and <core name="books-dev" instanceDir="books,1" .../>, even if both use the same data root '/solr/data', the 'books' core will use '/solr/data/books,0/' as dataDir and 'books-dev' will use '/solr/data/books,1/'.
          When we swap('books', 'books-dev'), everything is still ok; 'books' now refers to '/solr/data/books,1/' and 'books-dev' refers to '/solr/data/books,0/'. If we stop/start the container, since nothing physically persistent depended on the name, variable substitution (or implicit expansion) cannot interfere.
          If we are using the core name to build data directories, issuing swap is likely to break something...
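
The books/books-dev example can be modelled with a toy registry. This is only a sketch of the argument, not Solr code: SwapSketch and its methods are hypothetical, and the registry is just a map from core name to instanceDir.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry: the name is a label, the instanceDir is the persistent identity. */
final class SwapSketch {
    private final Map<String, String> instanceDirByName = new HashMap<String, String>();
    private final String dataRoot;

    SwapSketch(String dataRoot) {
        this.dataRoot = dataRoot;
    }

    void register(String name, String instanceDir) {
        instanceDirByName.put(name, instanceDir);
    }

    /** Exchanges the names of two registered cores; nothing on disk moves. */
    void swap(String a, String b) {
        String dirA = instanceDirByName.get(a);
        instanceDirByName.put(a, instanceDirByName.get(b));
        instanceDirByName.put(b, dirA);
    }

    /** Data directory derived from the instanceDir: follows the core through a swap. */
    String dataDirByInstance(String name) {
        return dataRoot + "/" + instanceDirByName.get(name);
    }

    /** Data directory derived from the name: unchanged by a swap, so it no longer matches the swapped core. */
    String dataDirByName(String name) {
        return dataRoot + "/" + name;
    }
}
```

After swap("books", "books-dev"), dataDirByInstance("books") yields /solr/data/books,1 while dataDirByName("books") still yields /solr/data/books, which is the mismatch a restart would expose if paths were built from the name.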

          Please correct me if I'm deeply misunderstanding something...

          Otis Gospodnetic added a comment -

          I haven't followed the patches, and I quickly read through the last month's worth of comments here. One thing that Hoss said caught my attention:

          "...easy way to reuse the same solrconfig.xml for multiple cores and still
          get subtle changes in behavior - all while making it transparent what
          any one solrconfig.xml will do..."

          Please count this as my +1 for this.
          Yes, one use case is that each core is unique and thus needs unique configs, but I also have a concrete use case where all cores are identical as far as the configs go, all that needs to be different is the data directory where the index lives. In this case, it would be ideal if one could have a single copy of the schema.xml and solrconfig.xml, and specify core-specific settings (e.g. data/index dir) in multicore.xml.

          It would be even better if configs for cores were not all in a single/monolithic file - imagine a situation where you have thousands or even tens of thousands of indices and you add a few hundred or a few thousand new ones every day, throughout the day. You could certainly regenerate the whole multicore.xml file every time a new index is added, but it would be much more efficient to generate just the descriptor for that single new index that was just created, and tell Solr - "hey, look here, there is a new core/index you need to be aware of". Perhaps one way to deal with this is to expose an API (URL) to send such a "hey, look here...." message to Solr, and let Solr periodically write out multicore.xml to disk.

          Henri Biestro added a comment - - edited

          Otis, reading your requirements, I'd consider using a Solr core (the "metacore") to handle an indexed version of multicore.xml; if you have a few thousand indices, it might be convenient to use queries on some occasions to select/retrieve/operate on one or many of them.
          The xml version of the multicore persistent file could be written at application/multicore shutdown and the Lucene-based one could be recreated at application/multicore startup; creating a new index would just mean creating a new document in the multicore core (and in fact all CRUD operations could be handled that way) and we'd benefit from Solr's autocommit feature and the like, tackling your functional requirements by reusing well-known capabilities & code.
          This also removes the "hack" loop used to find a core to work with when issuing a multicore/admin request (and the getDefaultCore call). Got a patch running for this now if this seems interesting.

          On easily configuring the data/index dir from multicore.xml, it seems we all agree that variable definitions should allow just that; the non-extensible version of the feature (see previous comment) - where we don't allow the user to augment the environment but only expose 'solr.multicore.*' - did not trigger any comment yet. Otis/Hoss/Ryan, what do you think of it?

          Ryan McKinley added a comment -

          Updated patch for /trunk and fixed the dispatcher problem.

          I think this is ready to commit – we can address the variable/config/data issue in a different issue or smaller patches.

          In reply to Hoss Feb 06

          a) yes – each core keeps the <requestDispatcher> settings from solrconfig.xml

          b) Creating a SolrRequestParsers is not super lightweight – it has 3 xpath queries on config, then builds a map and puts 5 things in it. That seems like a lot to add to every request rather than saving it at the beginning

          c) yes - we should open a separate issue for variables

          d) "abortOnConfigurationError" should probably be renamed "abortOnStartupConfigurationError" – once the app is running, it does not (nor do i think it should) quit working if something loads incorrectly.

          re 1000s of cores

          Note that you don't have to use the xml multicore management stuff. If MultiCore support is enabled before the SolrRequestDispatcher init() method, it will use that directly. You can load cores from SQL or wherever and put them into the MultiCore registry. (i am doing just this in one project)

          Henri Biestro added a comment - - edited

          A new version attempting to make it easier to derive from MultiCore (and associated classes - SolrDispatchFilter, MultiCoreHandler).
          This breaks the API (MultiCore.getInstance is removed) but a SolrMultiCore class is added to allow a smooth transition.
          SolrDispatchFilter logic has been reworked to reduce the number of 'return;' points and to be more lenient & let other filters handle more things when possible.
          One caveat, patch does not apply cleanly on MultiCore.java; the .rej is not too complex (now that I'm not in a rush), the beginning of the class def gets rejected.

          Henri Biestro added a comment -

          Despite many tries, can't get this patch to apply without reject on MultiCore.java - still with the same block to apply manually. This latest version just introduces more comments and 2/3 more methods have been marked protected.
          Ryan, I guess that if you don't like this version, you should just commit yours, which is in any case a step forward from the current trunk.

          Yonik Seeley added a comment -

          > I think this is ready to commit - we can address the variable/config/data issue in a different issue or smaller patches.
          +1

          Ryan McKinley added a comment -

          Henri – i just committed the previous patch. If you make another smaller one, i'll review and commit quickly.
          Thanks for all your work on this!

          Henri Biestro added a comment -

          Updated patch to current trunk.

          Removes the static singleton from MultiCore (& moves it to SolrMultiCore), updates the classes that depended upon it, and makes MultiCore/SolrDispatchFilter easily derivable.
          SolrDispatchFilter logic is more lenient and will let the filter chain handle urls that can't be dealt with.

          Produced on Ubuntu 7.10 by:
          svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > /tmp/solr-350.patch
          Successfully applied with no rejects with:
          patch -u -p 0 < /tmp/solr-350.patch

          Henri Biestro added a comment -

          simplified the MultiCore singleton handling (aka SolrMultiCore.getInstance is lazily loading) but kept SolrDispatchFilter/MultiCore/MultiCoreHandler derivable.
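
For readers unfamiliar with lazy singleton loading, one common Java idiom is the initialization-on-demand holder, sketched below. This is a generic illustration with hypothetical names (LazyRegistryHolder, Registry); it is not claimed to be how SolrMultiCore.getInstance is implemented in the patch.

```java
/** Generic lazy singleton sketch using the initialization-on-demand holder idiom. */
final class LazyRegistryHolder {
    private LazyRegistryHolder() {}

    /** Stand-in for the shared object; counts constructions to show it is built only once. */
    static final class Registry {
        static int created = 0;

        Registry() {
            created++;
        }
    }

    /** The JVM initializes this nested class, and thus INSTANCE, on first access only. */
    private static final class Holder {
        static final Registry INSTANCE = new Registry();
    }

    /** Thread-safe without explicit locking, thanks to class-initialization guarantees. */
    static Registry getInstance() {
        return Holder.INSTANCE;
    }
}
```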

          Shalin Shekhar Mangar added a comment - - edited

          Since the MultiCore#getRegistry does not exist anymore after the commit of this patch, a couple of JSPs do not work.

          I've changed those JSPs to use SolrMultiCore#getInstance method instead. This patch contains those changes.

          Shalin Shekhar Mangar added a comment -

          I missed a couple of files when creating the last patch. This contains all the modified JSPs. The svn stat is as follows:

          M src\webapp\resources\index.jsp
          M src\webapp\resources\admin\logging.jsp
          M src\webapp\resources\admin\raw-schema.jsp
          M src\webapp\resources\admin\ping.jsp
          M src\webapp\resources\admin\threaddump.jsp
          M src\webapp\resources\admin\index.jsp

          Walter Ferrara added a comment -

          It's been a while since I had a look at this patch, and things seem to have changed a bit meanwhile – but it looks strange that the only way to access the cores registry inside a Solr instance relies on a deprecated class, org.apache.solr.core.SolrMultiCore. I noticed Henri mention that the SolrMultiCore singleton "is added to allow a smooth transition", but...
          If there is no other way to achieve the same result bypassing org.apache.solr.core.SolrMultiCore, that class should not be marked as deprecated. Or is that deprecation to be read as "in the final solr 1.3 just use SolrMultiCore and ignore the warning, but remember that in the next version, the 2.0, things will change"?

          Henri Biestro added a comment -

          Ryan: thanks for the commit.

          Shalin: thanks a lot for the JSP fix, my bad. Thinking about it, it might be possible to put the MultiCore instance in a request attribute from the filter code and let the JSPs consume it that way rather than using SolrMultiCore. I'll look into it.

          Walter: yes, you are correct, things will most likely change in 2.0. We want MultiCore to be derivable and we don't want core code to treat MultiCore as a singleton; however, we do not feel current needs require the class to be configurable (yet). Maybe o.a.s.servlet would have been a better package for SolrMultiCore to make this easier. Sorry for the confusion.

          Yonik Seeley added a comment -

          Thanks Shalin, I just committed your JSP fixes (after converting the patch from UTF-16 to UTF-8).

          Ryan McKinley added a comment -

          remove SolrMultiCore references from JSP

          Erik Hatcher added a comment -

          The RemoveStatic patch looks good, Ryan. +1

          Markus Mautner added a comment -

          MultiCore persistence is broken.

          multicore/@sharedLib gets written as multicore/@libDir, so loading the multicore configuration after saving will fail.
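
          The failure mode described above can be illustrated with a minimal round-trip sketch. This is not the Solr persistence code: the class and method names (SharedLibRoundTrip, persist, loadSharedLib) are hypothetical, and only the element name multicore and the attribute names sharedLib and libDir come from the report. It assumes the loader looks only for sharedLib, so a file saved with libDir loses the setting on reload.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class SharedLibRoundTrip {
    // Hypothetical writer: serializes a <multicore> root carrying one attribute.
    static String persist(String attrName, String libPath) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("multicore");
        root.setAttribute(attrName, libPath);
        doc.appendChild(root);
        StringWriter out = new StringWriter();
        TransformerFactory.newInstance().newTransformer()
            .transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Hypothetical loader: reads only the sharedLib attribute, null when absent.
    static String loadSharedLib(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        Element root = doc.getDocumentElement();
        return root.hasAttribute("sharedLib") ? root.getAttribute("sharedLib") : null;
    }

    public static void main(String[] args) throws Exception {
        // Written under the wrong name: the loader no longer finds the setting.
        System.out.println(loadSharedLib(persist("libDir", "lib")));
        // Written under the name the loader expects: the setting survives.
        System.out.println(loadSharedLib(persist("sharedLib", "lib")));
    }
}
```

          A save-then-load check of this kind in the test suite would catch any mismatch between the attribute names used by the writer and the reader.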

          Ryan McKinley added a comment -

          thanks for finding this Markus!
          fixed in rev 673430

          Ryan McKinley added a comment -

          It looks like the dataDir option was removed from CoreDescriptor. Was there a reason for this? Can multicore.xml manage the data directories?

          http://wiki.apache.org/solr/MultiCore#head-2696b6ae9766aa312580b5014f6c8f659a2c1bea

          I think we should restore that configuration.

          Henri Biestro added a comment -

          Looks like this was removed around 02/Feb/08 from one of your comments; the dataDir can be set in solrconfig.xml, so configuring it through multicore.xml was considered a dangerous feature.
          And I agree we should enhance the configuration behavior.

          Since we are in the functional vicinity, the "2008-01-23 03:09 AM" version of the patch allowed the following (at least in MultiCore.create(...)):
          • Make the instanceDir relative to the multicore instanceDir if not absolute
          • Make the dataDir relative to the multicore dataDir if not absolute
          Just in case...
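
          The "relative unless absolute" rule for instanceDir and dataDir could be sketched as below. This is only an illustration under assumptions, not the code from that patch version: CoreDirResolver and resolve are hypothetical names, and the directory values are made up.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class CoreDirResolver {
    // Returns dir unchanged when it is absolute; otherwise resolves it
    // against baseDir (the multicore-level directory) and normalizes it.
    static Path resolve(String baseDir, String dir) {
        Path p = Paths.get(dir);
        return p.isAbsolute() ? p : Paths.get(baseDir).resolve(p).normalize();
    }

    public static void main(String[] args) {
        // Relative core directory: placed under the multicore directory.
        System.out.println(resolve("multicore", "core0"));
        // Absolute directory: used as given, the base is ignored.
        String abs = Paths.get("data").toAbsolutePath().toString();
        System.out.println(resolve("multicore", abs));
    }
}
```

          The same helper would apply to both settings, with the multicore instanceDir or the multicore dataDir passed as the base.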

          Henri Biestro added a comment - - edited

          This patch (solr-350-properties.patch) implements 'properties' as specified by HossMan.

          See SOLR-646.


            People

            • Assignee:
              Ryan McKinley
              Reporter:
              Ryan McKinley
            • Votes:
              2
              Watchers:
              3


                Development