Solr
  1. Solr
  2. SOLR-68

Custom ClassLoader for "plugins"

    Details

    • Type: New Feature New Feature
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.1.0
    • Component/s: None
    • Labels:
      None

      Description

      After beating my head against my desk for a few hours yesterday trying to document how to load custom plugins (ie: Analyzers, RequestHandlers, Similarities, etc...) in the various Servlet Containers – only to discover that it is aparently impossible unless you use Resin, it occured to me in the wee hours of last night that since the only time we ever need to load "pluggable" classes is when explicitly lookup the class by name, we could make out own ClassLoader and use it ... so i whiped together a little patch to Config.java that would load JARs out of $solr.home}/lib and was seriously suprised to discover that it seemed to work.

      In the clod light of day, I am again suprised that I still think this might be a good idea, but i'm not very familiar with ClassLoader semantics, so i'm not sure if what i've done is an abomination or not – or if the idea is sound, but the implimentation is crap.

      I'm also not sure if it works in all cases: more testing of various Containers would be good, as well as testing more complex sitautions (ie: what if a class explicitly named as a plugin and loaded by this new classloader then uses reflection to load another class from the same Jar using Thread.currentThread().getContextClassLoader() ... will that fail?)

      So far I've quick and dirty testing with my apachecon JAR under apache-tomcat-5.5.17, the jetty start.jar we use for the example, resin-3.0.21 and jettyplus-5.1.11-- all of which seemed to work fine except for jettyplus-5.1.11 – but that may have been because of some other configuration problem I had.

        Activity

        Uwe Schindler made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Hoss Man made changes -
        Fix Version/s 1.1.0 [ 12312234 ]
        Hide
        Hoss Man added a comment -

        This bug was modified as part of a bulk update using the criteria...

        • Marked ("Resolved" or "Closed") and "Fixed"
        • Had no "Fix Version" versions
        • Was listed in the CHANGES.txt for 1.1

        The Fix Version for all 38 issues found was set to 1.1, email notification
        was suppressed to prevent excessive email.

        For a list of all the issues modified, search jira comments for this
        (hopefully) unique string: 20080415hossman3

        Show
        Hoss Man added a comment - This bug was modified as part of a bulk update using the criteria... Marked ("Resolved" or "Closed") and "Fixed" Had no "Fix Version" versions Was listed in the CHANGES.txt for 1.1 The Fix Version for all 38 issues found was set to 1.1, email notification was suppressed to prevent excessive email. For a list of all the issues modified, search jira comments for this (hopefully) unique string: 20080415hossman3
        Hoss Man made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Hide
        Hoss Man added a comment -

        I did some more testing with tomcat and some contrib jars from Lucene Java ... didn't have any problems so i added a bit of javadoc and commited.

        Show
        Hoss Man added a comment - I did some more testing with tomcat and some contrib jars from Lucene Java ... didn't have any problems so i added a bit of javadoc and commited.
        Hide
        Hoss Man added a comment -

        FYI: the same simple test with jetty-6.0.1 worked. ... still haven't had a chance to try out any more involved tests with multiple jars

        Show
        Hoss Man added a comment - FYI: the same simple test with jetty-6.0.1 worked. ... still haven't had a chance to try out any more involved tests with multiple jars
        Hoss Man made changes -
        Assignee Hoss Man [ hossman ]
        Hide
        Mike Klaas added a comment -

        [[ Old comment, sent from unregistered email on Wed, 8 Nov 2006 14:40:07 -0800 ]]

        Just wanted to comment that I'm +1 on the idea--getting this workign
        with jetty was an annoyance. I'm not really qualified to comment on
        the approach, but it sounds reasonable.

        Show
        Mike Klaas added a comment - [[ Old comment, sent from unregistered email on Wed, 8 Nov 2006 14:40:07 -0800 ]] Just wanted to comment that I'm +1 on the idea--getting this workign with jetty was an annoyance. I'm not really qualified to comment on the approach, but it sounds reasonable.
        Hide
        Otis Gospodnetic added a comment -

        Hoss, Mortbay guys releases Jetty 6.0 and even 6.1 (or 6.0.1, not sure any more) a few weeks ago. Jetty 5.* is getting obsoleted, to it would be best to try Jetty 6.* with your class loader.

        Show
        Otis Gospodnetic added a comment - Hoss, Mortbay guys releases Jetty 6.0 and even 6.1 (or 6.0.1, not sure any more) a few weeks ago. Jetty 5.* is getting obsoleted, to it would be best to try Jetty 6.* with your class loader.
        Hide
        Hoss Man added a comment -

        FYI: I just did a fresh install of jetty-5.1.11 and confirmed that (for my simple test case of some self contained request handlers in a single JAR) it works with jetty and jettyplus.

        I'd like to do some more testing with just dropping lucene contrib JARs in, and having multiple JARs ... but assuming no suprises pop up, I think i may just not worry about some of the more extreme situtions i brought up on the mailing list since:
        1) those situations will hopefully be rare
        2) i'm not even sure if those situations will cause a problem
        3) if one of those situations does cause a problem, we can allways re-address the specifics of the ClassLoader independent of the overall "API" of having a $

        {solr.home}

        /lib directory
        4) if someone relaly does have a situation we can't address because of some specific quirks in the ClassLoader of their servlet container, nothing in this approach procludes then from unwrapping the WAR and ading their classes directly (which is the only option at the moment)

        Show
        Hoss Man added a comment - FYI: I just did a fresh install of jetty-5.1.11 and confirmed that (for my simple test case of some self contained request handlers in a single JAR) it works with jetty and jettyplus. I'd like to do some more testing with just dropping lucene contrib JARs in, and having multiple JARs ... but assuming no suprises pop up, I think i may just not worry about some of the more extreme situtions i brought up on the mailing list since: 1) those situations will hopefully be rare 2) i'm not even sure if those situations will cause a problem 3) if one of those situations does cause a problem, we can allways re-address the specifics of the ClassLoader independent of the overall "API" of having a $ {solr.home} /lib directory 4) if someone relaly does have a situation we can't address because of some specific quirks in the ClassLoader of their servlet container, nothing in this approach procludes then from unwrapping the WAR and ading their classes directly (which is the only option at the moment)
        Hide
        Hoss Man added a comment -

        >> the most 'naive' solution could be separate shared solr.jar in the same classpath/classloader as plugins...

        Yeah, I considered that briefly, but I had two big reservations about attempting that approach...

        1) Complicates installation in the "simple" case. right now users that wnat an "out of hte box" experience just need to drop a WAR file in place and create two config files. seperating the classes into an external JAR would complicate that by requiring them to put that JAR in their containers shared/common classpath.

        2) I'm 90% certain that would eliminate the ability to run multiple solr instances in the same servlet container, because of the way the SolrCore and much of hte COnfig information is implimented via Singletons ... if those classes were loaded by a common classloader, there would be only one instance per container – not one instance per webapp.

        Show
        Hoss Man added a comment - >> the most 'naive' solution could be separate shared solr.jar in the same classpath/classloader as plugins... Yeah, I considered that briefly, but I had two big reservations about attempting that approach... 1) Complicates installation in the "simple" case. right now users that wnat an "out of hte box" experience just need to drop a WAR file in place and create two config files. seperating the classes into an external JAR would complicate that by requiring them to put that JAR in their containers shared/common classpath. 2) I'm 90% certain that would eliminate the ability to run multiple solr instances in the same servlet container, because of the way the SolrCore and much of hte COnfig information is implimented via Singletons ... if those classes were loaded by a common classloader, there would be only one instance per container – not one instance per webapp.
        Hide
        Fuad Efendi added a comment -

        understood, thanks...
        As I understood (functional requirement): we need to be able to easily add plugins without making changes to solr.war, and this patch fixes that. (plugin depends on solr classes AND plugin is physically outside of solr.war); the most 'naive' solution could be separate shared solr.jar in the same classpath/classloader as plugins...

        Show
        Fuad Efendi added a comment - understood, thanks... As I understood (functional requirement): we need to be able to easily add plugins without making changes to solr.war, and this patch fixes that. (plugin depends on solr classes AND plugin is physically outside of solr.war); the most 'naive' solution could be separate shared solr.jar in the same classpath/classloader as plugins...
        Hide
        Hoss Man added a comment -

        > and we don't need an explicit instance of a ClassLoader at all... just put JARs in a classpath... and it works in
        > all containers, because we use 'parent' classloader with inherited permissions instead of touching the
        > 'container-managed' Thread...

        ...it's not that easy. If you need to load a class which refrences another class defined in the webapp itself, your approach breaks because the ClassLoader won't wlk back down the ClassLoader hierarchy to try and find it.

        Ie: if you create a "CustomRequestHandler implements SolrRequestHandler" and put it in a JAR which you then put in the "shared" lib dir for Tomcat, you'll get a class not found error for something like SolrQueryRequest when CustomRequestHandler is loaded, because SolrQueryRequest is defined in the WAR and "shared" class loader higher doesn't have access to those classes...

        http://tomcat.apache.org/tomcat-5.5-doc/class-loader-howto.html

        ...hence my idea to create a new ClassLoader instance which was a "child" of the class loader being used for the webapp itself, so when it delegates on classes it doesn't recognize, it delegates "up" to the solr.war class loader.

        Regarding your previous comment about the Nutch/Eclipse plugin model ... i'm not sure if that configuration syntax really fits for us ... it assumes a specific set of extension points, such that a single plugin might map to multiple points, and (as far as i can tell) each extension point is either bound to a single plugin, or hte plugins "chain" themselves.

        This doesn't really fit our use case, where you might want dozens of differnet analyzers used in differnet fields – ideally someone should be able to pull any jar out of the lucene/contrib directory containing an analyzer or a Similarity class they want to use, drop that jar into a specified location, and put the name of the schema where they want to use it.

        As for how Nutch does this ... they seem to be doing roughly the same thing as I'm trying for in this patch, except the more specific meaning of "plugin" allows them to have a seperate ClassLoader per plugin, that loads the exact jars enumerated in the plugins "manifest" (an xml config file; not a jar MANIFEST) .. but again i'm not sure if/how they deal with the possibility of a plugin loaded class using reflection to try and load another class later on.

        Show
        Hoss Man added a comment - > and we don't need an explicit instance of a ClassLoader at all... just put JARs in a classpath... and it works in > all containers, because we use 'parent' classloader with inherited permissions instead of touching the > 'container-managed' Thread... ...it's not that easy. If you need to load a class which refrences another class defined in the webapp itself, your approach breaks because the ClassLoader won't wlk back down the ClassLoader hierarchy to try and find it. Ie: if you create a "CustomRequestHandler implements SolrRequestHandler" and put it in a JAR which you then put in the "shared" lib dir for Tomcat, you'll get a class not found error for something like SolrQueryRequest when CustomRequestHandler is loaded, because SolrQueryRequest is defined in the WAR and "shared" class loader higher doesn't have access to those classes... http://tomcat.apache.org/tomcat-5.5-doc/class-loader-howto.html ...hence my idea to create a new ClassLoader instance which was a "child" of the class loader being used for the webapp itself, so when it delegates on classes it doesn't recognize, it delegates "up" to the solr.war class loader. Regarding your previous comment about the Nutch/Eclipse plugin model ... i'm not sure if that configuration syntax really fits for us ... it assumes a specific set of extension points, such that a single plugin might map to multiple points, and (as far as i can tell) each extension point is either bound to a single plugin, or hte plugins "chain" themselves. This doesn't really fit our use case, where you might want dozens of differnet analyzers used in differnet fields – ideally someone should be able to pull any jar out of the lucene/contrib directory containing an analyzer or a Similarity class they want to use, drop that jar into a specified location, and put the name of the schema where they want to use it. As for how Nutch does this ... they seem to be doing roughly the same thing as I'm trying for in this patch, except the more specific meaning of "plugin" allows them to have a seperate ClassLoader per plugin, that loads the exact jars enumerated in the plugins "manifest" (an xml config file; not a jar MANIFEST) .. but again i'm not sure if/how they deal with the possibility of a plugin loaded class using reflection to try and load another class later on.
        Hide
        Fuad Efendi added a comment -

        why do you need URLClassLoader?

        1) public class URLClassLoader extends SecureClassLoader
        2) "The classes that are loaded are by default granted permission only to access the URLs specified when the URLClassLoader was created."

        we need to create an instance defined in XML... just call Class.forName("org.zzz.Foo").newInstance()... it will throw an exception ClassNotFoundException if a class can't be found in a path, or NoClassDefFoundError in case of unresolved nested dependencies (am I right?)...

        Show
        Fuad Efendi added a comment - why do you need URLClassLoader? 1) public class URLClassLoader extends SecureClassLoader 2) "The classes that are loaded are by default granted permission only to access the URLs specified when the URLClassLoader was created." we need to create an instance defined in XML... just call Class.forName("org.zzz.Foo").newInstance()... it will throw an exception ClassNotFoundException if a class can't be found in a path, or NoClassDefFoundError in case of unresolved nested dependencies (am I right?)...
        Hide
        Fuad Efendi added a comment -

        >...is when explicitly lookup the class by name, we could make out own ClassLoader and use it...

        The simplest way is to create a Class object is:
        Class fooClass = Class.forName("Foo")
        which is equivalent to:
        Class.forName("Foo", true, this.getClass().getClassLoader())

        Then,
        Foo fooObject = fooClass.newInstance()

        and we don't need an explicit instance of a ClassLoader at all... just put JARs in a classpath... and it works in all containers, because we use 'parent' classloader with inherited permissions instead of touching the 'container-managed' Thread...

        Sorry, still trying to understand why we would need that method:
        public static Class findClass(String cname, String... subpackages)
        even here, we can loop in [subpackage+"."+cname] using Class.forName()...

        Show
        Fuad Efendi added a comment - >...is when explicitly lookup the class by name, we could make out own ClassLoader and use it... The simplest way is to create a Class object is: Class fooClass = Class.forName("Foo") which is equivalent to: Class.forName("Foo", true, this.getClass().getClassLoader()) Then, Foo fooObject = fooClass.newInstance() and we don't need an explicit instance of a ClassLoader at all... just put JARs in a classpath... and it works in all containers, because we use 'parent' classloader with inherited permissions instead of touching the 'container-managed' Thread... Sorry, still trying to understand why we would need that method: public static Class findClass(String cname, String... subpackages) even here, we can loop in [subpackage+"."+cname] using Class.forName()...
        Hide
        Fuad Efendi added a comment -

        It was done in Eclipse, for instance. Nutch project also has huge 'plugins'-supporting codebase which are automatically loaded and 'wired' together without explicit XML definitions, they did it Eclipse-way...

        I mean if you use explicit XML like
        <filter class="org.zzz.my.solr.filters.LowerCaseFilter"/>

        • you don't need to design smth like Eclipse or Nutch... A lot of headache just to replace strings like "org.apache.nutch.parse.html.HtmlParser" by
          <name>plugin.includes</name>
          <value>protocol-http|urlfilter-regex|parse-(text|html|js)|index-basic|query-(basic|site|url)|summary-basic|scoring-opic</value>

        Here, "parse-(text|html|js)" will look for a jar file with ID "parse-html" in plugin.xml file (inside the jar file):
        <plugin
        id="parse-html"
        name="Html Parse Plug-in"
        version="1.0.0"
        provider-name="nutch.org">
        ...

        The main reason for such design: to avoid explicit names "org.apache.nutch.parse.html.HtmlParser" in main configuration XML, to allow third-parties to develop compatible plugins (which could have very different complicated extension points and implementations with specific settings, not defined in main Solr schema.xml file)...

        I don't think it would work in WebLogic without some explicit settings; and JEE does not easily allow direct access to a file system, especially to load a class 'from a different context' (security considerations)...

        Show
        Fuad Efendi added a comment - It was done in Eclipse, for instance. Nutch project also has huge 'plugins'-supporting codebase which are automatically loaded and 'wired' together without explicit XML definitions, they did it Eclipse-way... I mean if you use explicit XML like <filter class="org.zzz.my.solr.filters.LowerCaseFilter"/> you don't need to design smth like Eclipse or Nutch... A lot of headache just to replace strings like "org.apache.nutch.parse.html.HtmlParser" by <name>plugin.includes</name> <value>protocol-http|urlfilter-regex|parse-(text|html|js)|index-basic|query-(basic|site|url)|summary-basic|scoring-opic</value> Here, "parse-(text|html|js)" will look for a jar file with ID "parse-html" in plugin.xml file (inside the jar file): <plugin id="parse-html" name="Html Parse Plug-in" version="1.0.0" provider-name="nutch.org"> ... The main reason for such design: to avoid explicit names "org.apache.nutch.parse.html.HtmlParser" in main configuration XML, to allow third-parties to develop compatible plugins (which could have very different complicated extension points and implementations with specific settings, not defined in main Solr schema.xml file)... I don't think it would work in WebLogic without some explicit settings; and JEE does not easily allow direct access to a file system, especially to load a class 'from a different context' (security considerations)...
        Hoss Man made changes -
        Field Original Value New Value
        Attachment classloader.patch [ 12344503 ]
        Hide
        Hoss Man added a comment -

        patch that creates a new classloader containing any JARs found in $

        {solr.home}

        /lib and uses that class loader anytime class names read from a config file.

        Show
        Hoss Man added a comment - patch that creates a new classloader containing any JARs found in $ {solr.home} /lib and uses that class loader anytime class names read from a config file.
        Hoss Man created issue -

          People

          • Assignee:
            Hoss Man
            Reporter:
            Hoss Man
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development