Solr
  1. Solr
  2. SOLR-921

SolrResourceLoader must cache short name vs fully qualified name

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4
    • Component/s: None
    • Labels:
      None

      Description

      every class that is loaded through SolrResourceLoader does a Class.forName() and when if it is not found a ClassNotFoundExcepton is thrown

      Then , it looks up with the various packages and finds the right class if the name starts with solr. Considering the fact that we usually use this solr.<classname> format we pay too much of a price for this. After every lookup the result can be cached in a static Map<String, String> with short name as keys and fully qualified name as values and can be shared across all the cores and this Map can be stored at the CoreContainer level.

      1. SOLR-921.patch
        3 kB
        Noble Paul
      2. SOLR-921.patch
        3 kB
        Noble Paul
      3. SOLR-921.patch
        3 kB
        Noble Paul
      4. SOLR-921.patch
        3 kB
        Noble Paul

        Issue Links

          Activity

          Hide
          Grant Ingersoll added a comment -

          Bulk close for Solr 1.4

          Show
          Grant Ingersoll added a comment - Bulk close for Solr 1.4
          Hide
          Shalin Shekhar Mangar added a comment -

          Committed revision 745394.

          Thanks Noble and Hoss!

          Show
          Shalin Shekhar Mangar added a comment - Committed revision 745394. Thanks Noble and Hoss!
          Hide
          Noble Paul added a comment -

          with javadocs and the code is simplified

          Show
          Noble Paul added a comment - with javadocs and the code is simplified
          Hide
          Noble Paul added a comment -

          Hi Hoss, will do , all three

          Show
          Noble Paul added a comment - Hi Hoss, will do , all three
          Hide
          Hoss Man added a comment -

          I'd like to commit this new patch tomorrow.

          after sleeping on it last night I realized caching the String mappings from alias->FQN is a good idea, for the same reasons you mentioned – but let me ask you for three favors before you commit:

          1) let's add some comments in the javadocs to SolrResourceLoader.findClass noting that it does this caching, and go into some specifics ... particularly the fact that the cache is static.

          2) ditto for elaborating on the SolrReesourceLoader.loadClass javadocs - the whole this.getClass().getClassLoader thing is going to be fairly non-intuitive, and right now it just refers to "base classloader" ...let's make sure future developers looking at the docs understand why it does what it does

          3) let's change the warning if a cached class name triggers a ClassNotFound to be a severe error and log the stack trace, stuff that should never happen is the stuff you really want to know about if somehow it magically does.

          Show
          Hoss Man added a comment - I'd like to commit this new patch tomorrow. after sleeping on it last night I realized caching the String mappings from alias->FQN is a good idea, for the same reasons you mentioned – but let me ask you for three favors before you commit: 1) let's add some comments in the javadocs to SolrResourceLoader.findClass noting that it does this caching, and go into some specifics ... particularly the fact that the cache is static. 2) ditto for elaborating on the SolrReesourceLoader.loadClass javadocs - the whole this.getClass().getClassLoader thing is going to be fairly non-intuitive, and right now it just refers to "base classloader" ...let's make sure future developers looking at the docs understand why it does what it does 3) let's change the warning if a cached class name triggers a ClassNotFound to be a severe error and log the stack trace, stuff that should never happen is the stuff you really want to know about if somehow it magically does.
          Hide
          Shalin Shekhar Mangar added a comment -

          Hoss, my concerns are:

          1. People may not even know that using the short name has a cost and they will come to accept it, unless they check the wiki.
          2. Even if someone wants to change his schema.xml and solrconfig.xml to use FQN, I see that as a time consuming non-trivial effort.
          3. The classloader caches the FQN vs Class. It is the negative lookups which are costly because firstly, they are not cached and secondly, Solr's retry logic is based on ClassNotFoundException being thrown.

          I'm still in favor of just documenting that people who want to create cores quickly should use the FQNs - especially since it keeps the code simpler - but i'm not opposed to caching the String->String mappings if you guys really think that's better.

          I really think this is a good solution. No more mucking around with the classloader. It solves the original problem well (i.e. subsequent instantiation can use the FQN which is fast) and the code is simpler to understand too. If there are no objections, I'd like to commit this new patch tomorrow.

          Show
          Shalin Shekhar Mangar added a comment - Hoss, my concerns are: People may not even know that using the short name has a cost and they will come to accept it, unless they check the wiki. Even if someone wants to change his schema.xml and solrconfig.xml to use FQN, I see that as a time consuming non-trivial effort. The classloader caches the FQN vs Class. It is the negative lookups which are costly because firstly, they are not cached and secondly, Solr's retry logic is based on ClassNotFoundException being thrown. I'm still in favor of just documenting that people who want to create cores quickly should use the FQNs - especially since it keeps the code simpler - but i'm not opposed to caching the String->String mappings if you guys really think that's better. I really think this is a good solution. No more mucking around with the classloader. It solves the original problem well (i.e. subsequent instantiation can use the FQN which is fast) and the code is simpler to understand too. If there are no objections, I'd like to commit this new patch tomorrow.
          Hide
          Hoss Man added a comment -

          But my question is , are we saying this fix is too complex and hard to understand? Or is it the "fear of the unknown" that is holding us back?

          My main concerns are:
          1) that caching classes adds complexity to solve a performance problem that could/should easily be avoidable by config change (using FQN instead of aliases)
          2) I am in fact scared that caching classes may have unexpected adverse behavior in atypical JVMs/servlet containers that we aren't familiar with ... mucking with class loading is something that's notoriously problematic, so i'd prefer we get involved in it only as a matter of last resort

          The package names of all the classes we use in solrconfig/schema is never advertised.

          First off: i don't think that really true. the example configs may never refer to the full name, but every place we document classes on the wiki we link to their javadcos with full names, and we do have a comment in the example configs explaining what full packages the "solr.*" prefix is an alias for.

          The aliasing feature was introduced early on to help the readability of the configs, and initially had essentially 0 cost (because the aliasing resolution was only done once at server startup). Documenting that people who deal with a high rate of core creation should avoid using the aliases (on the CoreAdmin page perhaps) and better advertising the full names of every class seems like a much simpler, risk free, solution.

          So we can just cache short-name -> FQN and we are still fine.

          Eh.

          I'm still in favor of just documenting that people who want to create cores quickly should use the FQNs – especially since it keeps the code simpler – but i'm not opposed to caching the String->String mappings if you guys really think that's better.

          Show
          Hoss Man added a comment - But my question is , are we saying this fix is too complex and hard to understand? Or is it the "fear of the unknown" that is holding us back? My main concerns are: 1) that caching classes adds complexity to solve a performance problem that could/should easily be avoidable by config change (using FQN instead of aliases) 2) I am in fact scared that caching classes may have unexpected adverse behavior in atypical JVMs/servlet containers that we aren't familiar with ... mucking with class loading is something that's notoriously problematic, so i'd prefer we get involved in it only as a matter of last resort The package names of all the classes we use in solrconfig/schema is never advertised. First off: i don't think that really true. the example configs may never refer to the full name, but every place we document classes on the wiki we link to their javadcos with full names, and we do have a comment in the example configs explaining what full packages the "solr.*" prefix is an alias for. The aliasing feature was introduced early on to help the readability of the configs, and initially had essentially 0 cost (because the aliasing resolution was only done once at server startup). Documenting that people who deal with a high rate of core creation should avoid using the aliases (on the CoreAdmin page perhaps) and better advertising the full names of every class seems like a much simpler, risk free, solution. So we can just cache short-name -> FQN and we are still fine. Eh. I'm still in favor of just documenting that people who want to create cores quickly should use the FQNs – especially since it keeps the code simpler – but i'm not opposed to caching the String->String mappings if you guys really think that's better.
          Hide
          Shalin Shekhar Mangar added a comment -

          afterall , we may not need to cache the classes themselves. Loading classes with FQN is as fast as a HashMap lookup after the class is loaded . So we can just cache short-name -> FQN and we are still fine.

          I think this is a much better approach. No danger of leaking class references. The correct one will always be loaded.

          Hoss, what do you think about this?

          Show
          Shalin Shekhar Mangar added a comment - afterall , we may not need to cache the classes themselves. Loading classes with FQN is as fast as a HashMap lookup after the class is loaded . So we can just cache short-name -> FQN and we are still fine. I think this is a much better approach. No danger of leaking class references. The correct one will always be loaded. Hoss, what do you think about this?
          Hide
          Noble Paul added a comment -

          afterall , we may not need to cache the classes themselves. Loading classes with FQN is as fast as a HashMap lookup after the class is loaded . So we can just cache short-name -> FQN and we are still fine.

          Show
          Noble Paul added a comment - afterall , we may not need to cache the classes themselves. Loading classes with FQN is as fast as a HashMap lookup after the class is loaded . So we can just cache short-name -> FQN and we are still fine.
          Hide
          Noble Paul added a comment - - edited

          here are the numbers

          before optimization
          -----------------------------
          core load x 13, 1313 calls, 442ms 
          
          Show
          Noble Paul added a comment - - edited here are the numbers before optimization ----------------------------- core load x 13, 1313 calls, 442ms
          Hide
          Noble Paul added a comment -

          We have a usecase where we load/unload tens of 1000's (yes tens of 1000s) of cores at runtime (and we have 10's of millions of cores spread across a farm of servers). We have identified this in profiling that this cost is indeed significant (among other things SOLR-919 , SOLR-920 ).

          As you said using FQN may obviate most of the problems.

          • But my question is , are we saying this fix is too complex and hard to understand? Or is it the "fear of the unknown" that is holding us back?
          • What are the failure cases?
          • The package names of all the classes we use in solrconfig/schema is never advertised. Though the users can figure it out, with some work , isn't it better to just solve it once and for all if it is a correct fix.
          Show
          Noble Paul added a comment - We have a usecase where we load/unload tens of 1000's (yes tens of 1000s) of cores at runtime (and we have 10's of millions of cores spread across a farm of servers). We have identified this in profiling that this cost is indeed significant (among other things SOLR-919 , SOLR-920 ). As you said using FQN may obviate most of the problems. But my question is , are we saying this fix is too complex and hard to understand? Or is it the "fear of the unknown" that is holding us back? What are the failure cases? The package names of all the classes we use in solrconfig/schema is never advertised. Though the users can figure it out, with some work , isn't it better to just solve it once and for all if it is a correct fix.
          Hide
          Hoss Man added a comment -

          Looking at the latest patch, I now see the restriction that the cache is only used if the packages being searched are the default list. I also see the new addition that only stores in the cache if the classloader used is the same as that of the SolrResourceLoader (to prevent bleed over from one core to another).

          But honestly: the whole idea of tricks like this seems overly risky to me considering how easy it is to get into "classloader hell" with java.

          Does anyone have any profiling info they can share showing that this patch actually improves performance in any particular usecases? (without adversely affecting it in the common use cases)

          Even if the package searching behavior of SolrResourceLoader really is that expensive, then I'd rather encourage people to stop using the solr.* aliasing and start explicitly using FQN for classes. I can't imagine Class.findClass on an already initialized classname is measurably slower then pulling a class instance out of this cache.

          (The short names were designed just to make the example configs easier to read, but if people in high load environments where cores are added/removed all the time find the package alias resolution to be prohibitively expensive on core initialization, then just don't use that that syntactic feature)

          If someone has numbers showing that the cache really is faster even when specifying FQNs in the configs, then I'd be convinced, but otherwise ..... ugh, ... it just seems like a bad idea to go down this road.

          Show
          Hoss Man added a comment - Looking at the latest patch, I now see the restriction that the cache is only used if the packages being searched are the default list. I also see the new addition that only stores in the cache if the classloader used is the same as that of the SolrResourceLoader (to prevent bleed over from one core to another). But honestly: the whole idea of tricks like this seems overly risky to me considering how easy it is to get into "classloader hell" with java. Does anyone have any profiling info they can share showing that this patch actually improves performance in any particular usecases? (without adversely affecting it in the common use cases) Even if the package searching behavior of SolrResourceLoader really is that expensive, then I'd rather encourage people to stop using the solr.* aliasing and start explicitly using FQN for classes. I can't imagine Class.findClass on an already initialized classname is measurably slower then pulling a class instance out of this cache. (The short names were designed just to make the example configs easier to read, but if people in high load environments where cores are added/removed all the time find the package alias resolution to be prohibitively expensive on core initialization, then just don't use that that syntactic feature) If someone has numbers showing that the cache really is faster even when specifying FQNs in the configs, then I'd be convinced, but otherwise ..... ugh, ... it just seems like a bad idea to go down this road.
          Hide
          Shalin Shekhar Mangar added a comment -

          Hoss, the latest patch fixes the issue I noted. Do you still see any problems with the idea/patch?

          Show
          Shalin Shekhar Mangar added a comment - Hoss, the latest patch fixes the issue I noted. Do you still see any problems with the idea/patch?
          Hide
          Noble Paul added a comment -

          do a
          Class c = Class.forName("cname",this.classLoader)
          //and if the
          classloader == SolrResourceLoader.class.getClassLoader()
          //then cache it

          Show
          Noble Paul added a comment - do a Class c = Class.forName("cname",this.classLoader) //and if the classloader == SolrResourceLoader.class.getClassLoader() //then cache it
          Hide
          Shalin Shekhar Mangar added a comment -

          Hoss, the use-case is for a server with very large number of cores with cores being loaded/unloaded all the time.

          For your concern #1 – The code does not cache if the package list passed to the method is different from the default list of packages (which are always loaded by the webapp classloader. So these can be shared by all cores.

          On #2 – When you put custom classes in $solr_home/lib, they have different packages from Solr's own packages. So one would most likely put the fully qualified class name. In that case the caching won't happen.

          The only problem right now is that if you want to override a class supplied by Solr by adding a jar to the $solr_home/lib, it won't take precedence. This can be fixed easily before we commit.

          Show
          Shalin Shekhar Mangar added a comment - Hoss, the use-case is for a server with very large number of cores with cores being loaded/unloaded all the time. For your concern #1 – The code does not cache if the package list passed to the method is different from the default list of packages (which are always loaded by the webapp classloader. So these can be shared by all cores. On #2 – When you put custom classes in $solr_home/lib, they have different packages from Solr's own packages. So one would most likely put the fully qualified class name. In that case the caching won't happen. The only problem right now is that if you want to override a class supplied by Solr by adding a jar to the $solr_home/lib, it won't take precedence. This can be fixed easily before we commit.
          Hide
          Noble Paul added a comment - - edited

          The patch currently caches the result only if the default set of packages are used . If you pass an extra list of package names , then the result is not cached. The ideal solution is to consider the package names also in the key . I have ignored those usecases for simplicity. I have also ignored cases where classes are loaded by parent classloader. Ideally the classloader also must be considered for making the key for the cache .

          This is useful when cores are loaded/unloaded very frequently and there are a large number of cores (tens of thousands) . In other cases the perf benefits are negligible.

          When loading plugins they are rarely loaded using the solr.<cname> .If we use a fully qualified name then the ClassNotFoundExceptions are not thrown and the cost is low and not worth optimizing. So I have ignored all such cases

          Caching the classes on a SolrResopurceLoader instance level means one core cannot benefit from the 'learnings' of another core.

          Show
          Noble Paul added a comment - - edited The patch currently caches the result only if the default set of packages are used . If you pass an extra list of package names , then the result is not cached. The ideal solution is to consider the package names also in the key . I have ignored those usecases for simplicity. I have also ignored cases where classes are loaded by parent classloader. Ideally the classloader also must be considered for making the key for the cache . This is useful when cores are loaded/unloaded very frequently and there are a large number of cores (tens of thousands) . In other cases the perf benefits are negligible. When loading plugins they are rarely loaded using the solr.<cname> .If we use a fully qualified name then the ClassNotFoundExceptions are not thrown and the cost is low and not worth optimizing. So I have ignored all such cases Caching the classes on a SolrResopurceLoader instance level means one core cannot benefit from the 'learnings' of another core.
          Hide
          Hoss Man added a comment -

          if i understand correctly, the goal here is to reduce the number of Class.forName calls on the same "cname" once we've already retrieved an instance of a class using that cname, and the patch does this by caching in a static map from cname=>Class

          two things concern me about this...

          1) the first time findClass("solr.FooBarBaz", "yak", "wak") is called the patched code could make at most three Class.forName calls ("solr.FooBarBaz", "yak.FooBarBaz", "wak.FooBarBaz"). assume "wak.FooBarBaz" winds up being the true class name that acutally get's loaded. the next time someone calls findClass("solr.FooBarBaz", "yak", "wak") no calls to Class.forName are made because "solr.FooBarBaz" is found in the cache. The problem comes up when a call is made to findClass("solr.FooBarBaz", "xxxxx", "yyyyy") ... "wak.FooBarBaz" is returned from the cache, even though that package wasn't even on the list of packages hte client was interested in. if we're going to cache these results, the full arg list needs to be the cache key

          2) the cache is a static map in the SolrResourceLoader class. unless i'm mistaken, there's only going to be a single instance of that for the entire Solr app (SolrResourceLoader will be loaded by the main/parent loader) which means that cache will cause Class object references to bleed over from one SolrCore to another (ie: i have a FooBarBaz plugin in my core1 lib, and that's different then the FooBarBaz plugin in core2 (or a differnet version). Changing the cacche to just be an instance variable of SolrResourceLoader should solve this ... there's only one SolrResourceLoader per core (right?)

          Show
          Hoss Man added a comment - if i understand correctly, the goal here is to reduce the number of Class.forName calls on the same "cname" once we've already retrieved an instance of a class using that cname, and the patch does this by caching in a static map from cname=>Class two things concern me about this... 1) the first time findClass("solr.FooBarBaz", "yak", "wak") is called the patched code could make at most three Class.forName calls ("solr.FooBarBaz", "yak.FooBarBaz", "wak.FooBarBaz"). assume "wak.FooBarBaz" winds up being the true class name that acutally get's loaded. the next time someone calls findClass("solr.FooBarBaz", "yak", "wak") no calls to Class.forName are made because "solr.FooBarBaz" is found in the cache. The problem comes up when a call is made to findClass("solr.FooBarBaz", "xxxxx", "yyyyy") ... "wak.FooBarBaz" is returned from the cache, even though that package wasn't even on the list of packages hte client was interested in. if we're going to cache these results, the full arg list needs to be the cache key 2) the cache is a static map in the SolrResourceLoader class. unless i'm mistaken, there's only going to be a single instance of that for the entire Solr app (SolrResourceLoader will be loaded by the main/parent loader) which means that cache will cause Class object references to bleed over from one SolrCore to another (ie: i have a FooBarBaz plugin in my core1 lib, and that's different then the FooBarBaz plugin in core2 (or a differnet version). Changing the cacche to just be an instance variable of SolrResourceLoader should solve this ... there's only one SolrResourceLoader per core (right?)
          Hide
          Noble Paul added a comment -

          SolrResourceLoader is modified to load classes and cache them if they are not loaded by the $

          {solr.home}/lib classloader

          We need the caching where we use Solr specific classes which are not found in ${solr.home}

          /lib and we access them using solr.<classname> usually

          Show
          Noble Paul added a comment - SolrResourceLoader is modified to load classes and cache them if they are not loaded by the $ {solr.home}/lib classloader We need the caching where we use Solr specific classes which are not found in ${solr.home} /lib and we access them using solr.<classname> usually

            People

            • Assignee:
              Shalin Shekhar Mangar
              Reporter:
              Noble Paul
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development