Solr / SOLR-1167

Support module xml config files using XInclude

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.4
    • Fix Version/s: 1.4
    • Component/s: None
    • Labels: None

      Description

      Current configuration files (schema and solrconfig) are monolithic, which can make maintenance and reuse more difficult than it needs to be. The XML standards include a feature to include content from external files. This is described at http://www.w3.org/TR/xinclude/

      This feature is to add support for XInclude features for XML configuration files.

      1. SOLR-1167.patch
        23 kB
        Grant Ingersoll
      2. SOLR-1167.patch
        3 kB
        Bryan Talbot
      3. SOLR-1167.patch
        5 kB
        Bryan Talbot
      4. SOLR-1167.patch
        5 kB
        Bryan Talbot
      5. SOLR-1167.patch
        2 kB
        Bryan Talbot

          Activity

          Peter Wolanin added a comment -

          I think you posted a sample snippet for solrconfig to the list - can you report here and possibly include in the patch a change to the sample schema or solrconfig that would demonstrate this feature?

          Bryan Talbot added a comment -

          Support for xinclude will allow a few options to include xml (or non-xml) content from an external file. The external file can be loaded from the file system or from any HTTP resource.

          Here are some examples:

          <!-- include solrconfig_master.xml from the file system and generate an error if the file can't be found -->
          <xi:include href="solr/conf/solrconfig_master.xml" xmlns:xi="http://www.w3.org/2001/XInclude"/>

          <!-- include replication_master.xml from an HTTP URL and ignore it if it's missing -->
          <xi:include href="http://localhost:8983/solr/admin/file/?file=replication_master.xml"
                      xmlns:xi="http://www.w3.org/2001/XInclude">
            <xi:fallback/>
          </xi:include>

          <!-- include solrconfig_master.xml from the filesystem. If it cannot be found, attempt
          to include solrconfig_slave.xml from the filesystem. If neither file can be found, don't
          generate an error.
          -->
          <xi:include href="solr/conf/solrconfig_master.xml" xmlns:xi="http://www.w3.org/2001/XInclude">
            <xi:fallback>
              <xi:include href="solr/conf/solrconfig_slave.xml" xmlns:xi="http://www.w3.org/2001/XInclude">
                <xi:fallback/>
              </xi:include>
            </xi:fallback>
          </xi:include>

          <!-- attempt to include an optional file containing index options. If the file can't be found, fall back to some
          default values.
          -->
          <xi:include href="solr/conf/solrconfig_indexOptions.xml" xmlns:xi="http://www.w3.org/2001/XInclude">
            <xi:fallback>
              <useCompoundFile>false</useCompoundFile>
              <ramBufferSizeMB>32</ramBufferSizeMB>
              <mergeFactor>10</mergeFactor>
              <maxMergeDocs>2147483647</maxMergeDocs>
              <maxFieldLength>10000</maxFieldLength>
            </xi:fallback>
          </xi:include>

          I'll update the patch to include solrconfig_master.xml and solrconfig_slave.xml files if they are present in the solr/conf directory. The inclusions are currently commented out and the resulting configuration is equivalent to the existing sample config.
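
The fallback behaviour in the examples above can be tried outside Solr with the JDK's own parser. The following is a minimal sketch, not the code in the patch: the class name `XIncludeFallbackDemo`, the temporary directory, and the value 25 are made up for illustration, and it assumes the default JDK `DocumentBuilderFactory` implementation supports XInclude.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class XIncludeFallbackDemo {

    // Hypothetical main config: the include target is optional, and the
    // fallback supplies a default value when the file cannot be read.
    static final String MAIN_CONFIG =
          "<config xmlns:xi=\"http://www.w3.org/2001/XInclude\">\n"
        + "  <indexDefaults>\n"
        + "    <xi:include href=\"solrconfig_indexOptions.xml\">\n"
        + "      <xi:fallback><mergeFactor>10</mergeFactor></xi:fallback>\n"
        + "    </xi:include>\n"
        + "  </indexDefaults>\n"
        + "</config>\n";

    static Document parse(Path file) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);   // XInclude elements are recognised by namespace
        dbf.setXIncludeAware(true);    // ask the parser to resolve xi:include
        // Parsing from a File gives the parser a base URI for the relative href.
        return dbf.newDocumentBuilder().parse(file.toFile());
    }

    // Writes the main config into confDir, optionally writes the included
    // file, and returns the mergeFactor seen after XInclude processing.
    static String mergeFactor(Path confDir, boolean withOptionsFile) throws Exception {
        Path main = confDir.resolve("solrconfig.xml");
        Files.write(main, MAIN_CONFIG.getBytes(StandardCharsets.UTF_8));
        Path options = confDir.resolve("solrconfig_indexOptions.xml");
        if (withOptionsFile) {
            Files.write(options, "<mergeFactor>25</mergeFactor>".getBytes(StandardCharsets.UTF_8));
        } else {
            Files.deleteIfExists(options);
        }
        Document doc = parse(main);
        return doc.getElementsByTagName("mergeFactor").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        Path confDir = Files.createTempDirectory("xinclude-demo");
        System.out.println("options file missing: mergeFactor=" + mergeFactor(confDir, false));
        System.out.println("options file present: mergeFactor=" + mergeFactor(confDir, true));
    }
}
```

The relative `href` is resolved against the location of the including file, which is why the sketch parses from a file rather than from a string.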

          Bryan Talbot added a comment -

          Include changes to example solrconfig.xml to include master and slave settings from an external xml file. The master file includes replication and dataimport handler definitions.

          The inclusions are commented out currently.

          Jianhan added a comment -

          It is good that we can now put certain configuration in a separate file with this change, but this does not seem to completely solve the problem raised in SOLR-1154, and we still have to use the hack to disable master/slave if we use the same package for master and slave. Is there a way to conditionally include one file or another (rather than include the other if one is not available)?

          Bryan Talbot added a comment -

          It can't conditionally include content, no. That can be achieved in startup scripts easily enough though: if the "-master" flag is given then copy the correct file into place and include it, if the "-slave" flag is given then copy the slave file into place, etc.

          I believe we'd need to write code to handle conditional inclusion. The XInclude stuff works at the XML parser level which knows nothing of system and environment variables of course.

          -Bryan
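
The copy-into-place step described above could look roughly like the following if it were done from a Java launcher rather than a shell script. This is only a sketch: the class name `RoleConfigInstaller` and the file names `solrconfig_master.xml`, `solrconfig_slave.xml` and `solrconfig_role.xml` are hypothetical, and it assumes solrconfig.xml always includes the fixed name `solrconfig_role.xml`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class RoleConfigInstaller {

    // Copies solrconfig_<role>.xml over solrconfig_role.xml, the fixed
    // (hypothetical) name that solrconfig.xml would reference via xi:include.
    static Path install(Path confDir, String role) throws IOException {
        if (!"master".equals(role) && !"slave".equals(role)) {
            throw new IllegalArgumentException("role must be 'master' or 'slave': " + role);
        }
        Path source = confDir.resolve("solrconfig_" + role + ".xml");
        Path target = confDir.resolve("solrconfig_role.xml");
        return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // Demo only: build a throwaway conf directory with both role files.
        Path confDir = Files.createTempDirectory("solr-conf");
        Files.write(confDir.resolve("solrconfig_master.xml"),
                "<master/>".getBytes(StandardCharsets.UTF_8));
        Files.write(confDir.resolve("solrconfig_slave.xml"),
                "<slave/>".getBytes(StandardCharsets.UTF_8));

        String role = args.length > 0 ? args[0] : "master";
        Path active = install(confDir, role);
        System.out.println(active + " now contains: "
                + new String(Files.readAllBytes(active), StandardCharsets.UTF_8));
    }
}
```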

          Hoss Man added a comment -

          I like the simplicity of the patch, and the use of an existing standard. If I'd realized that xinclude was supported by DocumentBuilderFactory I would have tried this a long time ago ... my only concern is how well supported this feature is in the various DBF impls out there.

          BTW: the redundant snippets in DataImporter.java and Config.java make me think we need to refactor a helper function for this somewhere in utils, but that's not a huge issue.

          "Is there a way to conditionally include one file or another (rather than include the other if one is not available)?"

          Keep in mind that the problem can be inverted: instead of having a common solrconfig.xml file that conditionally includes master-snippet.xml or slave-snippet.xml based on some property, you can have unique solrconfig.xml files for the master and slave (in separate solr home dirs) which only contain the unique options and include the common chunks from other files .. then you can use solr.solr.home to drive which set of configs to use.

          Bryan Talbot added a comment -

          What needs to happen to get this into 1.4 before the code freeze?

          Mark Miller added a comment -

          Could we:

          if setXIncludeAware throws unsupported, load the DBF without that setting, and if the config contains an include, throw an exception saying it's unsupported with the current impl you're using? I think it's fairly well supported in anything even semi-recent.

          Bryan Talbot added a comment -

          Allowing the creation of the DBF when setXIncludeAware throws an exception is easy enough. Detecting that there are xinclude elements present seems much harder since that is all handled by the XML parser. How about if a warning log message is generated if setXIncludeAware can't be set?

          Bryan Talbot added a comment -

          Patch updated to apply cleanly to more recent trunk (r813098). It also catches an exception if the setXIncludeAware(true) method is unsupported and allows the DBF to still be created. A warning log is generated in this case: "XML parser doesn't support XInclude option"
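
The behaviour described here can be sketched as follows. This is an illustration rather than the code in the patch: the class name `XIncludeAwareFactory` and the use of `java.util.logging` are assumptions, and only the warning text is taken from the comment above. `setXIncludeAware` is documented to throw `UnsupportedOperationException` on implementations that do not support it.

```java
import java.util.logging.Logger;
import javax.xml.parsers.DocumentBuilderFactory;

final class XIncludeAwareFactory {

    private static final Logger LOG =
            Logger.getLogger(XIncludeAwareFactory.class.getName());

    // Returns a namespace-aware factory that resolves XInclude when the
    // underlying parser allows it, and a plain factory otherwise.
    static DocumentBuilderFactory create() {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        try {
            dbf.setXIncludeAware(true);
        } catch (UnsupportedOperationException e) {
            // Older implementations may not override this method; keep going
            // without XInclude rather than failing to create the factory.
            LOG.warning("XML parser doesn't support XInclude option");
        }
        return dbf;
    }

    // isXIncludeAware() can also be unsupported on such implementations.
    static boolean supportsXInclude(DocumentBuilderFactory dbf) {
        try {
            return dbf.isXIncludeAware();
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory dbf = create();
        dbf.newDocumentBuilder(); // the factory is usable either way
        System.out.println("XInclude aware: " + supportsXInclude(dbf));
    }
}
```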

          Henri Biestro added a comment -

          Just a thought; wouldn't it be possible to use system entities (as in SOLR-712 / SOLR-646) to have variable resolution in entities and use those in the xi:include href?

          <!DOCTYPE schema [
          <!ENTITY myevar SYSTEM "solr:${myvar}">
          ]>
          ...
          <xi:include href="&myevar;".../>
          ...
          

          This would allow files to be included via variables, using standards, without inverting the inclusion logic.

          In any case, thanks Bryan for pushing this.

          Bryan Talbot added a comment -

          The patch is for the trunk, currently 1.4

          Grant Ingersoll added a comment -

          Patch looks fine, just not sure that we should add this into the example since it just adds more files to an already. Perhaps just some comments and some writeup on the wiki in the SolrConfig page?

          Bryan Talbot added a comment -

          I agree that the examples should be simple (yet functional) and this does complicate them. I added changes to the configuration because a previous comment requested it. I think changing the sample solrconfig to mention support for XInclude and documenting examples on the wiki is better for new users too.

          Should I change the patch to remove the changes to the sample solrconfig and only include a comment about XInclude support?

          Grant Ingersoll added a comment -

          That would be my vote. If you can do that, I'll take a look today.

          Bryan Talbot added a comment -

          Updated patch to only add a comment to solrconfig.xml which refers to the wiki SolrConfigXml page for configuration options. The wiki can then be updated to include details about using XInclude once it's available.

          Hoss Man added a comment -

          "Detecting that there are xinclude elements present seems much harder since that is all handled by the XML parser. How about if a warning log message is generated if setXIncludeAware can't be set?"

          It occurred to me that setNamespaceAware(true) has no documented failure case - every DBF implementation is supposed to support it. So as long as the DocumentBuilder is namespace aware, then if setXIncludeAware(true) fails, we could (in theory) inspect the resulting DOM Document to see if there are any "{http://www.w3.org/2001/XInclude}include" nodes in the document - if there are, then the config was expecting XInclude support and we can fail with a hard error; if not, then they didn't care about XInclude support anyway, so no need to log a warning if it's not supported.

          (I'm just throwing this out there as an idea - if someone wants to try implementing it then great, but I don't think it should be a roadblock for the existing patch, because honestly: as long as we document that XInclude depends on XML parser support, then if people add XIncludes to their configs but don't test to verify that it's working in their environment, they're on their own and I won't feel bad if something fails silently.)
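
The DOM inspection idea can be sketched with the standard namespace-aware lookup. This is a hypothetical illustration, not something the patch implements: the class name `UnresolvedXIncludeCheck` and the sample documents are made up, and XInclude awareness is deliberately left off to stand in for a parser that lacks support.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class UnresolvedXIncludeCheck {

    static final String XINCLUDE_NS = "http://www.w3.org/2001/XInclude";

    // True if the parsed document still contains xi:include elements, i.e.
    // the config expected XInclude processing that did not happen.
    static boolean hasUnresolvedIncludes(Document doc) {
        return doc.getElementsByTagNameNS(XINCLUDE_NS, "include").getLength() > 0;
    }

    // Parses with namespaces on but XInclude off, simulating a parser on
    // which setXIncludeAware(true) failed.
    static Document parseWithoutXInclude(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String withInclude =
              "<config xmlns:xi=\"http://www.w3.org/2001/XInclude\">"
            + "<xi:include href=\"solrconfig_master.xml\"/>"
            + "</config>";
        String plain = "<config><mergeFactor>10</mergeFactor></config>";

        System.out.println("config with include: "
                + hasUnresolvedIncludes(parseWithoutXInclude(withInclude)));
        System.out.println("plain config:        "
                + hasUnresolvedIncludes(parseWithoutXInclude(plain)));
    }
}
```

A caller could turn a `true` result into a hard error, as suggested above, and stay silent otherwise.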

          Grant Ingersoll added a comment -

          Added a test for this. I'm not sure if it is the right way to go. I don't want the test to fail if the person running it doesn't have a DocumentBuilder that supports it, b/c that wouldn't fail in the live case either.

          Thoughts? I'd like to close this one out.

          Shalin Shekhar Mangar added a comment -

          Bryan, I'm not sure if you have followed recent developments. Now it is possible to add an enable attribute to any Solr plugin, and the value of the enable attribute can be driven from an external properties file. With this you don't need separate solrconfig.xml files for master/slave. This is the approach we are using. Was that the original use-case behind this feature?

          Grant Ingersoll added a comment -

          I still think it is useful to be able to use XInclude.

          Grant Ingersoll added a comment -

          Committed revision 820652.

          Bryan, can you update the Wiki?

          Grant Ingersoll added a comment -

          Bulk close for Solr 1.4

          Peter Karich added a comment -

          @Shalin Shekhar Mangar: how can I use the proposed attribute feature for a master+slave configuration? Do you have a code snippet?


            People

            • Assignee: Grant Ingersoll
            • Reporter: Bryan Talbot
            • Votes: 2
            • Watchers: 4
