Solr
  1. Solr
  2. SOLR-4914

Factor out core discovery and persistence logic

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 6.0
    • Fix Version/s: 4.5, 6.0
    • Component/s: None
    • Labels:
      None

      Description

      Alan Woodward has done some work to refactor how core persistence works that we should work on going forward that I want to separate from a shorter-term tactical problem (See SOLR-4910).

      I'm attaching Alan's patch to this JIRA and we'll carry it forward separately from 4910.

      1. SOLR-4914.patch
        202 kB
        Alan Woodward
      2. SOLR-4914.patch
        200 kB
        Alan Woodward
      3. SOLR-4914.patch
        197 kB
        Alan Woodward
      4. SOLR-4914.patch
        183 kB
        Alan Woodward
      5. SOLR-4914.patch
        158 kB
        Alan Woodward
      6. SOLR-4914.patch
        125 kB
        Alan Woodward
      7. SOLR-4914.patch
        120 kB
        Alan Woodward
      8. SOLR-4914.patch
        38 kB
        Erick Erickson

        Issue Links

          Activity

          Hide
          Alan Woodward added a comment -

          Thanks for opening this, Erick.

          I think it makes sense to combine persistence and discovery, as they're basically two sides of the same coin. So pre Solr 4.3, we loaded cores from solr.xml and persisted them back there. Now we allow Solr to crawl the filesystem looking for core definition files and we persist these.

          I'm thinking the combined interface should look something like this:

          public interface CoreDiscoverer {
          
              public List<CoreDescriptor> discover();
          
              public void persist(CoreDescriptor cd);
              public void delete(CoreDescriptor cd);
          
          }
          

          This also allows us to make discovery and persistence completely pluggable. If I want to store my core definitions in an external DB, or a plain text file, or whatever, I implement a PlainTextCoreDiscoverer and drop it onto the classpath, and refer to it from solr.xml.

          Show
          Alan Woodward added a comment - Thanks for opening this, Erick. I think it makes sense to combine persistence and discovery, as they're basically two sides of the same coin. So pre Solr 4.3, we loaded cores from solr.xml and persisted them back there. Now we allow Solr to crawl the filesystem looking for core definition files and we persist these. I'm thinking the combined interface should look something like this: public interface CoreDiscoverer { public List<CoreDescriptor> discover(); public void persist(CoreDescriptor cd); public void delete(CoreDescriptor cd); } This also allows us to make discovery and persistence completely pluggable. If I want to store my core definitions in an external DB, or a plain text file, or whatever, I implement a PlainTextCoreDiscoverer and drop it onto the classpath, and refer to it from solr.xml.
          Hide
          Alan Woodward added a comment -

          Patch with my latest status. It won't compile yet, as I haven't updated all tests to reflect the new API. Will work on that next.

          This combines core discovery and persistence into the CoresListPersistor interface. All persistence logic is removed from CoreContainer and ConfigSolr and put into the two implementing classes of CorePropertiesPersistor and SolrXMLPersistor.

          CoreDescriptor is tidied up a bit, and made effectively immutable (would be nicer to make it really immutable, maybe with an ImmutableProperties class). The original pre-substitution parameters are stored as well as the values after substitution, which makes persistence a lot easier (you just read from the original values).

          Solr.xml persistence is also made a lot simpler, by just storing everything around the <cores/> tag as a flat string, and only updating the <core> tags. So you don't need to remember to add new solr.xml parameters to core persistence logic any more, and things like comments will be preserved.

          This is a pretty big patch, and there's still a fair amount to do, but I'd be grateful for some preliminary reviews. I think it simplifies the whole core discovery/persistence logic a lot.

          Show
          Alan Woodward added a comment - Patch with my latest status. It won't compile yet, as I haven't updated all tests to reflect the new API. Will work on that next. This combines core discovery and persistence into the CoresListPersistor interface. All persistence logic is removed from CoreContainer and ConfigSolr and put into the two implementing classes of CorePropertiesPersistor and SolrXMLPersistor. CoreDescriptor is tidied up a bit, and made effectively immutable (would be nicer to make it really immutable, maybe with an ImmutableProperties class). The original pre-substitution parameters are stored as well as the values after substitution, which makes persistence a lot easier (you just read from the original values). Solr.xml persistence is also made a lot simpler, by just storing everything around the <cores/> tag as a flat string, and only updating the <core> tags. So you don't need to remember to add new solr.xml parameters to core persistence logic any more, and things like comments will be preserved. This is a pretty big patch, and there's still a fair amount to do, but I'd be grateful for some preliminary reviews. I think it simplifies the whole core discovery/persistence logic a lot.
          Hide
          Alan Woodward added a comment -

          Same patch as before, updated to trunk to include all the SOLR-4910 changes. Most of which I'm nuking

          Show
          Alan Woodward added a comment - Same patch as before, updated to trunk to include all the SOLR-4910 changes. Most of which I'm nuking
          Hide
          Erick Erickson added a comment -

          Just make sure all the tests continue to pass

          Show
          Erick Erickson added a comment - Just make sure all the tests continue to pass
          Hide
          Alan Woodward added a comment -

          Latest status. There are still some failing tests, and I need to add some new generic persistence tests.

          Show
          Alan Woodward added a comment - Latest status. There are still some failing tests, and I need to add some new generic persistence tests.
          Hide
          Alan Woodward added a comment -

          Nearly there. ShardSplitTest is failing consistently, and I still need to work out what's going on there. There's also a nocommit in there concerning what to do when unloading a core discovered via a core.properties file - should we delete the file, or rename it? I think renaming it is probably the most user-friendly thing to do, as then you just have to rename it back again to make the core available next time you start up.

          Show
          Alan Woodward added a comment - Nearly there. ShardSplitTest is failing consistently, and I still need to work out what's going on there. There's also a nocommit in there concerning what to do when unloading a core discovered via a core.properties file - should we delete the file, or rename it? I think renaming it is probably the most user-friendly thing to do, as then you just have to rename it back again to make the core available next time you start up.
          Hide
          Erick Erickson added a comment -

          bq: I think renaming it is probably the most user-friendly thing to do

          +1 unless deleteIndex=true in which case nuke it all?

          Show
          Erick Erickson added a comment - bq: I think renaming it is probably the most user-friendly thing to do +1 unless deleteIndex=true in which case nuke it all?
          Hide
          Alan Woodward added a comment -

          I think if deleteIndex=true then it will already have been nuked.

          Show
          Alan Woodward added a comment - I think if deleteIndex=true then it will already have been nuked.
          Hide
          Alan Woodward added a comment -

          Final patch. All tests pass.

          There are a couple of bits here which could do with following up:
          1) CoreAdminHandler.handleCreateAction is doing a bunch of stuff that really ought to be moved inside CoreContainer.
          2) I'm a bit concerned by what will happen if somebody creates two cores using the same instance dir when in core discovery mode - the second core's core.properties file will overwrite the first. This is only an issue in core discovery mode, though, as storing core definitions in solr.xml can deal with this fine.

          Show
          Alan Woodward added a comment - Final patch. All tests pass. There are a couple of bits here which could do with following up: 1) CoreAdminHandler.handleCreateAction is doing a bunch of stuff that really ought to be moved inside CoreContainer. 2) I'm a bit concerned by what will happen if somebody creates two cores using the same instance dir when in core discovery mode - the second core's core.properties file will overwrite the first. This is only an issue in core discovery mode, though, as storing core definitions in solr.xml can deal with this fine.
          Hide
          Erick Erickson added a comment -

          A massive patch, so I haven't looked in great detail, but few comments:

          1> CoreAdminCreateDiscoverTest, there has been a change from testing that specific properties are preserved to just checking the number of properties. That test was written that way specifically because instanceDir was being preserved in the core.properties file. The way it is if we inadvertently preserved instanceDir but left out, say dataDir the test would pass. Can you put that back?

          2> TestCoreContainer.testPersist is removed, what replaces it?

          Next step: Removing persistence in 5.x!

          About your point 2, creating two cores with same instance dir. Last I knew there was a check at startup time to report this, but I agree it's a bit of a trap.

          Show
          Erick Erickson added a comment - A massive patch, so I haven't looked in great detail, but few comments: 1> CoreAdminCreateDiscoverTest, there has been a change from testing that specific properties are preserved to just checking the number of properties. That test was written that way specifically because instanceDir was being preserved in the core.properties file. The way it is if we inadvertently preserved instanceDir but left out, say dataDir the test would pass. Can you put that back? 2> TestCoreContainer.testPersist is removed, what replaces it? Next step: Removing persistence in 5.x! About your point 2, creating two cores with same instance dir. Last I knew there was a check at startup time to report this, but I agree it's a bit of a trap.
          Hide
          Alan Woodward added a comment -

          re CoreAdminCreateDiscoverTest, the lines immediately prior to the size check are checks for the four properties in question, so that's covered.

          re testPersist, that's covered by TestSolrXMLPersistence and TestCoreDiscovery. But some more tests here would be good, I agree...

          re cores sharing instance dir, startup time will be too late to detect the problem here. Creating a new core via the CoreAdminHandler will overwrite any existing core.properties file. Maybe the way to deal with this is to call locator.persist() before we call create(), and throw an exception from CorePropertiesLocator. Will try and work something up.

          I also think that this may remove the need to deprecate persistence - it's a lot more maintainable now.

          Show
          Alan Woodward added a comment - re CoreAdminCreateDiscoverTest, the lines immediately prior to the size check are checks for the four properties in question, so that's covered. re testPersist, that's covered by TestSolrXMLPersistence and TestCoreDiscovery. But some more tests here would be good, I agree... re cores sharing instance dir, startup time will be too late to detect the problem here. Creating a new core via the CoreAdminHandler will overwrite any existing core.properties file. Maybe the way to deal with this is to call locator.persist() before we call create(), and throw an exception from CorePropertiesLocator. Will try and work something up. I also think that this may remove the need to deprecate persistence - it's a lot more maintainable now.
          Hide
          Erick Erickson added a comment -

          bq: I also think that this may remove the need to deprecate persistence - it's a lot more maintainable now.

          I sure hope so, it was pretty ugly for a long time.

          But the deprecation also has another goal, moving all the core definitions out to discovery mode rather than this current hybrid. It's not clear to me if we need to continue persisting solr.xml if you never have to define cores there. But that's a discussion for another day I think.

          Thans for the effort here!

          Erick

          Show
          Erick Erickson added a comment - bq: I also think that this may remove the need to deprecate persistence - it's a lot more maintainable now. I sure hope so, it was pretty ugly for a long time. But the deprecation also has another goal, moving all the core definitions out to discovery mode rather than this current hybrid. It's not clear to me if we need to continue persisting solr.xml if you never have to define cores there. But that's a discussion for another day I think. Thans for the effort here! Erick
          Hide
          Alan Woodward added a comment -

          Adds a test that you can't add two cores that share an instanceDir if you're using the CorePropertiesLocator.

          re solr.xml persistence, the cores list is the only thing that changes, everything else is immutable. Which is why this JIRA doesn't bother trying to reconstruct complicated XML from the config, it just keeps a copy of the existing XML and replaces the cores list on persistence. Much easier

          Show
          Alan Woodward added a comment - Adds a test that you can't add two cores that share an instanceDir if you're using the CorePropertiesLocator. re solr.xml persistence, the cores list is the only thing that changes, everything else is immutable. Which is why this JIRA doesn't bother trying to reconstruct complicated XML from the config, it just keeps a copy of the existing XML and replaces the cores list on persistence. Much easier
          Hide
          Alan Woodward added a comment -

          I'm going to commit this tomorrow unless anybody objects, to trunk and 4.x. It's kind of a big change to get into 4.4 though.

          Show
          Alan Woodward added a comment - I'm going to commit this tomorrow unless anybody objects, to trunk and 4.x. It's kind of a big change to get into 4.4 though.
          Hide
          Erick Erickson added a comment -

          -1 on trying to put it into 4.4

          Show
          Erick Erickson added a comment - -1 on trying to put it into 4.4
          Hide
          Alan Woodward added a comment -

          Actually I think SOLR-5029 needs to be fixed before I commit this.

          Show
          Alan Woodward added a comment - Actually I think SOLR-5029 needs to be fixed before I commit this.
          Hide
          Alan Woodward added a comment -

          New patch, taking into account SOLR-5029 as well. All tests pass, committing to trunk and 4x (but not 4.4!) shortly.

          Show
          Alan Woodward added a comment - New patch, taking into account SOLR-5029 as well. All tests pass, committing to trunk and 4x (but not 4.4!) shortly.
          Hide
          ASF subversion and git services added a comment -

          Commit 1502276 from Alan Woodward
          [ https://svn.apache.org/r1502276 ]

          SOLR-4914: Factor out core discovery and persistence logic

          Show
          ASF subversion and git services added a comment - Commit 1502276 from Alan Woodward [ https://svn.apache.org/r1502276 ] SOLR-4914 : Factor out core discovery and persistence logic
          Hide
          ASF subversion and git services added a comment -

          Commit 1502280 from Alan Woodward
          [ https://svn.apache.org/r1502280 ]

          SOLR-4914: Factor out core discovery and persistence logic

          Show
          ASF subversion and git services added a comment - Commit 1502280 from Alan Woodward [ https://svn.apache.org/r1502280 ] SOLR-4914 : Factor out core discovery and persistence logic
          Hide
          ASF subversion and git services added a comment -

          Commit 1502468 from Alan Woodward
          [ https://svn.apache.org/r1502468 ]

          SOLR-4914: Close OutputStreamWriter properly, use System.getProperty("line.separator") instead of \n

          Fixes Windows test failures.

          Show
          ASF subversion and git services added a comment - Commit 1502468 from Alan Woodward [ https://svn.apache.org/r1502468 ] SOLR-4914 : Close OutputStreamWriter properly, use System.getProperty("line.separator") instead of \n Fixes Windows test failures.
          Hide
          ASF subversion and git services added a comment -

          Commit 1502469 from Alan Woodward
          [ https://svn.apache.org/r1502469 ]

          SOLR-4914: Close OutputStreamWriter, use platform-independent newlines in tests

          Show
          ASF subversion and git services added a comment - Commit 1502469 from Alan Woodward [ https://svn.apache.org/r1502469 ] SOLR-4914 : Close OutputStreamWriter, use platform-independent newlines in tests
          Hide
          ASF subversion and git services added a comment -

          Commit 1502481 from Alan Woodward
          [ https://svn.apache.org/r1502481 ]

          SOLR-4914: Use Properties.store(OutputStream, String) for back compatibility

          Show
          ASF subversion and git services added a comment - Commit 1502481 from Alan Woodward [ https://svn.apache.org/r1502481 ] SOLR-4914 : Use Properties.store(OutputStream, String) for back compatibility
          Hide
          ASF subversion and git services added a comment -

          Commit 1502483 from Alan Woodward
          [ https://svn.apache.org/r1502483 ]

          SOLR-4914: Use Properties.store(OutputStream, String) for back compatibility

          Show
          ASF subversion and git services added a comment - Commit 1502483 from Alan Woodward [ https://svn.apache.org/r1502483 ] SOLR-4914 : Use Properties.store(OutputStream, String) for back compatibility
          Hide
          ASF subversion and git services added a comment -

          Commit 1502507 from Alan Woodward
          [ https://svn.apache.org/r1502507 ]

          SOLR-4914: Close input streams as well

          Show
          ASF subversion and git services added a comment - Commit 1502507 from Alan Woodward [ https://svn.apache.org/r1502507 ] SOLR-4914 : Close input streams as well
          Hide
          ASF subversion and git services added a comment -

          Commit 1502508 from Alan Woodward
          [ https://svn.apache.org/r1502508 ]

          SOLR-4914: Close input streams as well

          Show
          ASF subversion and git services added a comment - Commit 1502508 from Alan Woodward [ https://svn.apache.org/r1502508 ] SOLR-4914 : Close input streams as well

            People

            • Assignee:
              Alan Woodward
              Reporter:
              Erick Erickson
            • Votes:
              1 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development