Solr / SOLR-1668

Declarative configuration meta-data for Solr plugins

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.4
    • Fix Version/s: 4.9, 5.0
    • Component/s: Schema and Analysis
    • Labels:
      None

      Description

      The idea here is for Solr plugins to carry more metadata about their configuration. This can be very useful for building tools around Solr, where the metadata can be used to assist users in configuring Solr. One common mechanism for providing this metadata is to use standard Java Beans for the different configuration constructs: the bean properties define the configurable attributes, and annotations provide extra information about them.

      1. commons-beanutils-1.8.2.jar
        226 kB
        Uri Boness
      2. SOLR-1668.patch
        46 kB
        Uri Boness
      3. SOLR-1668.patch
        36 kB
        Uri Boness

        Activity

        Uri Boness added a comment -

        This patch provides Java Bean configuration for all MapInitializedPlugins. To showcase this functionality, I changed the TokenizerFactory to implement the MapInitializedPlugin interface and changed the PatternTokenizerFactory to use the new Java Bean configuration. This implementation depends on the commons-beanutils library which should be added to the lib directory.
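A minimal sketch of the bean-style initialization described in this comment, using plain reflection rather than commons-beanutils. The names `SketchPatternTokenizerFactory` and `BeanInitializer` are hypothetical and are not taken from the patch; the converter covers only the handful of types needed here.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical plugin written as a plain bean; not the class from the patch.
class SketchPatternTokenizerFactory {
    private Pattern pattern;
    private int group = -1;

    public void setPattern(Pattern pattern) { this.pattern = pattern; }
    public Pattern getPattern() { return pattern; }

    public void setGroup(int group) { this.group = group; }
    public int getGroup() { return group; }
}

// Hypothetical initializer: copies string arguments onto matching setters.
class BeanInitializer {
    static void init(Object plugin, Map<String, String> args) {
        for (Method m : plugin.getClass().getMethods()) {
            String n = m.getName();
            if (!n.startsWith("set") || n.length() < 4 || m.getParameterCount() != 1) continue;
            // Derive the argument name from the setter, e.g. setPattern -> "pattern".
            String property = Character.toLowerCase(n.charAt(3)) + n.substring(4);
            String raw = args.get(property);
            if (raw == null) continue; // argument not configured; keep the default
            try {
                m.invoke(plugin, convert(raw, m.getParameterTypes()[0]));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot set " + property, e);
            }
        }
    }

    // Only the few target types this sketch needs.
    static Object convert(String raw, Class<?> type) {
        if (type == String.class) return raw;
        if (type == int.class) return Integer.valueOf(raw);
        if (type == boolean.class) return Boolean.valueOf(raw);
        if (type == Pattern.class) return Pattern.compile(raw);
        throw new IllegalArgumentException("No converter for " + type.getName());
    }
}
```

With this shape, the factory itself contains no parsing code: the arguments map from the schema is applied by the initializer, and the plugin only sees typed values.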

        Erik Hatcher added a comment -

        That's getting warmer, Uri! Good stuff.

        What I'd ideally like to see, and of course this comes from my view that Ant has the answer to all Java coding roadblocks, is the plugins themselves, say MyCustomTokenizer(Factory), having metadata about their parameters through good ol' setters (or ctor introspection, but setters allow more flexibility). For example, our great friend the <java> task, documented here - http://ant.apache.org/manual/CoreTasks/java.html and its underlying source code - http://svn.apache.org/repos/asf/ant/core/trunk/src/main/org/apache/tools/ant/taskdefs/Java.java

        Notice setters/creators/adders (setX, createY, addZ named methods). Metadata about the type of the parameter is natural... so for example MyCustomTokenFilter that adjusted token text could have a setKeepOriginal(boolean b) method.

        With this sort of magic, we could generate not only documentation but admin GUIs that could introspect arbitrary plugins and create checkboxes, or drop-down pickers, or file picker, etc.

        So, again, getting warmer, but it ain't Ant yet
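The introspection idea in this comment can be sketched roughly as follows. This is only an illustration under assumptions: `MyCustomTokenFilterFactory` and `ParameterDescriber` are hypothetical names, and the widget labels are made up for the example. Only `setKeepOriginal(boolean)` comes from the comment itself.

```java
import java.io.File;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical plugin; setKeepOriginal(boolean) is the example from the comment.
class MyCustomTokenFilterFactory {
    public void setKeepOriginal(boolean keepOriginal) { }
    public void setStopWordFile(File file) { }
    public void setLanguage(String language) { }
}

// Hypothetical tool-side helper: derives parameter names and input widgets from setters.
class ParameterDescriber {
    /** Maps each setter-derived parameter name to a suggested input widget. */
    static Map<String, String> describe(Class<?> pluginClass) {
        Map<String, String> widgets = new TreeMap<>();
        for (Method m : pluginClass.getMethods()) {
            String n = m.getName();
            if (!n.startsWith("set") || n.length() < 4 || m.getParameterCount() != 1) continue;
            String property = Character.toLowerCase(n.charAt(3)) + n.substring(4);
            widgets.put(property, widgetFor(m.getParameterTypes()[0]));
        }
        return widgets;
    }

    // The parameter type alone is enough to pick a sensible widget.
    static String widgetFor(Class<?> type) {
        if (type == boolean.class || type == Boolean.class) return "checkbox";
        if (type.isEnum()) return "drop-down";
        if (type == File.class) return "file picker";
        return "text field";
    }
}
```

A documentation generator or admin GUI could walk the result of `ParameterDescriber.describe(...)` for any plugin class without knowing the plugin in advance.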

        Uri Boness added a comment -

        Thanks! Well... no, it's not Ant yet, or Spring, but it's a start that can already help with Tokenizers & Filters. The current patch is actually based on setters, but adding annotations on top of them can add even more metadata. For example, marking a property as required, or associating a different configuration name with it, perhaps to differentiate user-friendly naming from code-friendly naming (how does Ant deal with this stuff?).

        Erik Hatcher added a comment -

        Yeah, for sure annotations make sense to leverage here for part of it.

        As for user vs. code friendly - I'm of the opinion that they can be one and the same basically. setStopWordFile(SolrFile f) has a lot of metadata in it. Why not simply File? I just figured we might want to abstract that one step from file system directness.

        @Required makes sense for mandatory ones, indeed. This is (with my dated knowledge of Ant internals) where Ant does the runtime kinda validation in the execute() method for a Task. Maybe they've gone a step further with annotations now?

        And having a mechanism to override the parameter name or key, sure - but as much as possible should be inferred from the method signature, making it a rich descriptive interface.

        Erik Hatcher added a comment - edited

        Also note that Ant's configuration mechanism isn't just with setters. A <java> task for example can take any number of <sysproperty> sub-elements, and they get "injected" via addSysproperty(Environment.Variable sysp).

        Spring isn't even that clever, is it? (probably is and I'm just making myself look foolish, huh?)

        Erik Hatcher added a comment - edited

        In looking at your patch in more detail, we're actually not far from agreeing. It's the specifying of the converter class in the annotation that I don't like. It can be more implicit than that, like "magic". public void setPattern(Pattern pattern) - perfect, we agree 100% on that!

        Sure, there's always some String -> Object converter in the process, as this config will come from strings almost always. But no need to clutter the plugin itself with converters. Make sense?
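One way to keep converters out of the plugin, as argued in this comment, is a registry keyed by target type that the framework consults when it sees a setter's parameter type. The sketch below is an assumption about how that could look; `ConverterRegistry` and its methods are hypothetical and do not come from the patch, and only wrapper and reference types are handled.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Pattern;

// Hypothetical registry: String -> Object conversion chosen by target type.
class ConverterRegistry {
    private final Map<Class<?>, Function<String, ?>> converters = new HashMap<>();

    ConverterRegistry() {
        // Defaults pre-registered for the types built-in plugins would need.
        register(String.class, s -> s);
        register(Integer.class, Integer::valueOf);
        register(Boolean.class, Boolean::valueOf);
        register(Pattern.class, Pattern::compile);
    }

    // Extension point for user plugins with their own parameter types.
    <T> void register(Class<T> type, Function<String, ? extends T> converter) {
        converters.put(type, converter);
    }

    <T> T convert(String raw, Class<T> type) {
        Function<String, ?> converter = converters.get(type);
        if (converter == null) {
            throw new IllegalArgumentException("No converter registered for " + type.getName());
        }
        return type.cast(converter.apply(raw));
    }
}
```

With such a registry, `public void setPattern(Pattern pattern)` needs no annotation at all: the framework looks up `Pattern.class` and applies the registered converter before calling the setter.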

        Erik Hatcher added a comment -

        Another thought here is to make these configurations later-bound, if that makes sense. Suppose I want something like: stopWordFile="${company.code}_stopwords.txt"

        Maybe a bit of a stretch of an example, with the fabricated idea that you want to have a single Solr configuration (schema, etc) and be able to launch multiple solr instances (can you do this with per-core params too? maybe so) that use the same config, but use a different stop word list.

        We'd have setStopWordList(SolrFile f), and we'd only call that setter after the system properties were in the mix.

        Maybe this is neither here nor there as far as this issue is concerned, as the property substitution is at a previous step no matter what, just wanted to make sure this use case is kept in mind too.
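The "previous step" of property substitution mentioned in this comment could look roughly like the sketch below. It is an illustration only: `PropertySubstitution` is a hypothetical helper, the properties are passed in as a plain map (in practice they might come from system properties), and failing on an undefined property is an assumption of this sketch.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: resolves ${name} placeholders before any setter is called.
class PropertySubstitution {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    static String resolve(String raw, Map<String, String> props) {
        Matcher m = PLACEHOLDER.matcher(raw);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = props.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("Undefined property: " + m.group(1));
            }
            // quoteReplacement keeps '$' and '\' in the value from being treated specially.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

For instance, resolving `${company.code}_stopwords.txt` with `company.code` set to `acme` yields `acme_stopwords.txt`, and only that resolved string would then be converted and handed to the setter.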

        Noble Paul added a comment -

        This can be very useful for building tools around Solr where this meta data can be used to assist users in configuring solr.

        is there anyone building it?

        What else is the value add other than exposing metadata to tools?

        public void setPattern(Pattern pattern) - perfect, we agree 100% on that!

        Why do we need this magic in String-> Object conversion at all?

        Uri Boness added a comment -

        @Erik

        Also note that Ant's configuration mechanism isn't just with setters. A <java> task for example can take any number of <sysproperty> sub-elements, and they get "injected" via addSysproperty(Environment.Variable sysp).

        System properties can be supported in 2 ways:
        1. On the configuration level using an expression language (a la Spring... yes, Spring supports it). This means that in the schema you'll be able to configure properties like: stopWordFile="${conf.dir}/stopwords.txt". The "conf.dir" parameter can be replaced either from system properties, a properties file, or another source. Eventually these properties
        2. Using another annotation (say, @SystemProperty) which indicates the value should first be taken from the system properties and then converted to the required data type.

        It's the specifying of the converter class in the annotation that I don't like. It can be more implicit than that, like "magic"

        The @Converter annotation is aimed mainly at user extensions. Indeed, none of the out-of-the-box plugins need it, as default converters can be pre-registered to handle all the data types we need at the moment. For users who want to provide their own plugins, we need to provide a simple mechanism to register converters, and I found the @Converter annotation to be the simplest one.

        We'd have setStopWordList(SolrFile f), and we'd only call that setter after the system properties were in the mix.

        As you said, I believe once we have system properties supported this will be a no-brainer, and indeed I believe this belongs to an earlier "properties substitution" phase (as mentioned above).

        @Noble

        is there anyone building it?

        Oh yes, but beyond that, this will open up opportunities to develop Solr plugins for IDEs and text editors... even just for better support in writing the schema files with auto-completion, validation, etc...

        Why do we need this magic in String-> Object conversion at all?

        Well, my obvious response is: because of the nature of Solr configuration, which is text based, while at runtime you're dealing with other data types. Of course you can just create String setters and do the conversion yourself, but why do that if you can have it done automatically and keep your classes clean? Just to be clear, the magic is not really "magic": we can be very clear about which converters are supported out of the box, and (as I mentioned above) with the @Converter annotation users can be more explicit about how they want the conversion to take place. Bottom line, at the end of the day you want to be able to write the plugins as POJOs using properties of the correct data types, and focus on the plugin's logic rather than also on configuration logic.

        Uri Boness added a comment -

        In this patch I removed the need for the @InitProperty annotation. Instead any setter in the class will be considered as an initialization property. You can use the @Required annotation to mark properties as mandatory and the @ArgumentName to customize the name of the argument used to initialize it.
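A rough sketch of how the @Required and @ArgumentName annotations described in this comment might be consumed. The annotation definitions, `SketchStopFilterFactory`, and `AnnotatedInitializer` below are hypothetical stand-ins written for this illustration, not the code from the patch; to stay short, the sketch handles String setters only and leaves type conversion out.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;

// Hypothetical annotation definitions, assumed for this sketch.
@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD)
@interface Required { }

@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD)
@interface ArgumentName { String value(); }

// Hypothetical plugin: every setter is an init property, annotations refine it.
class SketchStopFilterFactory {
    private String words;
    private String format = "wordset";

    @Required
    @ArgumentName("words")
    public void setStopWordFile(String words) { this.words = words; }
    public String getStopWordFile() { return words; }

    public void setFormat(String format) { this.format = format; }
    public String getFormat() { return format; }
}

class AnnotatedInitializer {
    static void init(Object plugin, Map<String, String> args) {
        for (Method m : plugin.getClass().getMethods()) {
            String n = m.getName();
            if (!n.startsWith("set") || n.length() < 4 || m.getParameterCount() != 1
                    || m.getParameterTypes()[0] != String.class) continue;
            // @ArgumentName overrides the name derived from the setter.
            ArgumentName custom = m.getAnnotation(ArgumentName.class);
            String name = custom != null
                    ? custom.value()
                    : Character.toLowerCase(n.charAt(3)) + n.substring(4);
            String raw = args.get(name);
            if (raw == null) {
                if (m.isAnnotationPresent(Required.class)) {
                    throw new IllegalArgumentException("Missing required argument: " + name);
                }
                continue; // optional argument; keep the default
            }
            try {
                m.invoke(plugin, raw);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot set " + name, e);
            }
        }
    }
}
```

The same annotations double as metadata for tools: a schema editor could flag a missing `words` attribute before Solr is ever started.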

        Hoss Man added a comment -

        Bulk updating 240 Solr issues to set the Fix Version to "next" per the process outlined in this email...

        http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3Calpine.DEB.1.10.1005251052040.24672@radix.cryptio.net%3E

        Selection criteria was "Unresolved" with a Fix Version of 1.5, 1.6, 3.1, or 4.0. email notifications were suppressed.

        A unique token for finding these 240 issues in the future: hossversioncleanup20100527

        Robert Muir added a comment -

        Bulk move 3.2 -> 3.3

        Robert Muir added a comment -

        3.4 -> 3.5

        Hoss Man added a comment -

        Bulk of fixVersion=3.6 -> fixVersion=4.0 for issues that have no assignee and have not been updated recently.

        email notification suppressed to prevent mass-spam
        pseudo-unique token identifying these issues: hoss20120321nofix36

        Steve Rowe added a comment -

        Bulk move 4.4 issues to 4.5 and 5.0

        Uwe Schindler added a comment -

        Move issue to Solr 4.9.


          People

          • Assignee: Unassigned
          • Reporter: Uri Boness
          • Votes: 0
          • Watchers: 1
