Solr
  1. Solr
  2. SOLR-2493

SolrQueryParser constantly parse luceneMatchVersion in solrconfig. Large performance hit.

    Details

      Description

      I' m putting this as blocker as I think this is a serious issue that should be adressed asap with a release. With the current code this is no way near suitable for production use.

      For each instance created SolrQueryParser calls

      getSchema().getSolrConfig().getLuceneVersion("luceneMatchVersion", Version.LUCENE_24)

      instead of using

      getSchema().getSolrConfig().luceneMatchVersion

      This creates a massive performance hit. For each request, there is generally 3 query parsers created and each of them will parse the xml node in config which involve creating an instance of XPath and behind the scene the usual factory finder pattern quicks in within the xml parser and does a loadClass.

      The stack is typically:

      at org.mortbay.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:363)
      at com.sun.org.apache.xml.internal.dtm.ObjectFactory.findProviderClass(ObjectFactory.java:506)
      at com.sun.org.apache.xml.internal.dtm.ObjectFactory.lookUpFactoryClass(ObjectFactory.java:217)
      at com.sun.org.apache.xml.internal.dtm.ObjectFactory.createObject(ObjectFactory.java:131)
      at com.sun.org.apache.xml.internal.dtm.ObjectFactory.createObject(ObjectFactory.java:101)
      at com.sun.org.apache.xml.internal.dtm.DTMManager.newInstance(DTMManager.java:135)
      at com.sun.org.apache.xpath.internal.XPathContext.<init>(XPathContext.java:100)
      at com.sun.org.apache.xpath.internal.jaxp.XPathImpl.eval(XPathImpl.java:201)
      at com.sun.org.apache.xpath.internal.jaxp.XPathImpl.evaluate(XPathImpl.java:275)
      at org.apache.solr.core.Config.getNode(Config.java:230)
      at org.apache.solr.core.Config.getVal(Config.java:256)
      at org.apache.solr.core.Config.getLuceneVersion(Config.java:325)
      at org.apache.solr.search.SolrQueryParser.<init>(SolrQueryParser.java:76)
      at org.apache.solr.schema.IndexSchema.getSolrQueryParser(IndexSchema.java:277)

      With the current 3.1 code, I do barely 250 qps with 16 concurrent users with a near empty index.

      Switching SolrQueryParser to use getSchema().getSolrConfig().luceneMatchVersion and doing a quick bench test, performance become reasonable beyond 2000 qps.

      1. SOLR-2493.patch
        0.7 kB
        Uwe Schindler
      2. SOLR-2493-3.x.patch
        1 kB
        Uwe Schindler

        Issue Links

          Activity

          Hide
          Sami Siren added a comment -

          Trunk version of Solr has this same problem too, I just timed comparable difference in req/sec when caching the Version vs the current implementation.

          Show
          Sami Siren added a comment - Trunk version of Solr has this same problem too, I just timed comparable difference in req/sec when caching the Version vs the current implementation.
          Hide
          Uwe Schindler added a comment -

          This was broken by the commit of LUCENE-2458. Indeed, code should use the already parsed luceneMatchVersion from config (and that was also intended to be used like that from my original luceneMatchVersion patch, see SOLR-1677, but at this time QueryParser was not yet versionified and added later through LUCENE-2458).

          I think we should fix this by using the public field from SolrConfig, the current code is incorrect. I did not review that when that commit was made.

          To prevent such errors, can we make the XML unaccessible after parsing the config initially, so code trying to access XPath later fails?

          The fix is indeed simple, I will attach patch!

          Show
          Uwe Schindler added a comment - This was broken by the commit of LUCENE-2458 . Indeed, code should use the already parsed luceneMatchVersion from config (and that was also intended to be used like that from my original luceneMatchVersion patch, see SOLR-1677 , but at this time QueryParser was not yet versionified and added later through LUCENE-2458 ). I think we should fix this by using the public field from SolrConfig, the current code is incorrect. I did not review that when that commit was made. To prevent such errors, can we make the XML unaccessible after parsing the config initially, so code trying to access XPath later fails? The fix is indeed simple, I will attach patch!
          Hide
          Uwe Schindler added a comment -

          Patch for trunk, 3.x/3.1 is similar, will attach after merge.

          Show
          Uwe Schindler added a comment - Patch for trunk, 3.x/3.1 is similar, will attach after merge.
          Hide
          Uwe Schindler added a comment -

          Patch for 3.x and 3.1 branch.

          Show
          Uwe Schindler added a comment - Patch for 3.x and 3.1 branch.
          Hide
          Uwe Schindler added a comment -

          I also reviewed other places where luceneMatchVersion is used, all other places are correct (SpellChecker...).

          Show
          Uwe Schindler added a comment - I also reviewed other places where luceneMatchVersion is used, all other places are correct (SpellChecker...).
          Hide
          Uwe Schindler added a comment -

          Committed trunk revision: 1099340
          Merged 3.x revision: 1099347
          Merged 3.1 branch revision: 1099349

          You can fix this in you local installation by using the latest 3.1 stable branch, if you can't wait for 3.1.1

          Show
          Uwe Schindler added a comment - Committed trunk revision: 1099340 Merged 3.x revision: 1099347 Merged 3.1 branch revision: 1099349 You can fix this in you local installation by using the latest 3.1 stable branch, if you can't wait for 3.1.1
          Hide
          Uwe Schindler added a comment -

          Here the final 3.x patch (the prev one was incomplete)

          Show
          Uwe Schindler added a comment - Here the final 3.x patch (the prev one was incomplete)
          Hide
          Robert Muir added a comment -

          this wasn't broken by the lucene commit.

          this is solr's fault by having a getter that does some heavy duty xml shit.... I don't think the issue is fixed until these "getters" that parse xml are removed!!!!!

          Show
          Robert Muir added a comment - this wasn't broken by the lucene commit. this is solr's fault by having a getter that does some heavy duty xml shit.... I don't think the issue is fixed until these "getters" that parse xml are removed!!!!!
          Hide
          Uwe Schindler added a comment -

          ...as I said before

          Show
          Uwe Schindler added a comment - ...as I said before
          Hide
          Uwe Schindler added a comment -

          In my opinion, the correct way to solve this is to make all methods in o.a.solr.core.Config protected as they should only be called by subclasses doing the actual parsing.

          Uwe

          Show
          Uwe Schindler added a comment - In my opinion, the correct way to solve this is to make all methods in o.a.solr.core.Config protected as they should only be called by subclasses doing the actual parsing. Uwe
          Hide
          Robert Muir added a comment -

          In my opinion, the correct way to solve this is to make all methods in o.a.solr.core.Config protected as they should only be called by subclasses doing the actual parsing

          +1

          Show
          Robert Muir added a comment - In my opinion, the correct way to solve this is to make all methods in o.a.solr.core.Config protected as they should only be called by subclasses doing the actual parsing +1
          Hide
          Chris Male added a comment -

          In my opinion, the correct way to solve this is to make all methods in o.a.solr.core.Config protected as they should only be called by subclasses doing the actual parsing.

          +1

          We dont need getters doing parsing available to every component.

          Show
          Chris Male added a comment - In my opinion, the correct way to solve this is to make all methods in o.a.solr.core.Config protected as they should only be called by subclasses doing the actual parsing. +1 We dont need getters doing parsing available to every component.
          Hide
          Hoss Man added a comment -

          this is solr's fault by having a getter that does some heavy duty xml shit.

          that sounds like some serious buck passing.

          All of the "get" methods on the Config class take in xpath expressions – it should be obvious to any one who uses them that they are going to do xpath parsing.

          By the looks of it, the SolrConfig constructor was already creating a public final "luceneMatchVersion" variable in it's constructor (using the xml parsing based COnfig method) it just wasn't getting used by the query parser.

          In my opinion, the correct way to solve this is to make all methods in o.a.solr.core.Config protected as they should only be called by subclasses doing the actual parsing.

          I don't see how that would inherently protect us from this kind of mistake.

          The cause of the problem came from needing public access to a "getLuceneVersion" type method on SolrConfig (which is a subclass of Config)

          even if all the methods in COnfig were protected, that could have very easily wound up being implemented like so ...

            public Version getLuceneVersion() { return super.inefficientProtectedMethod(...) }
          

          ...and we would have had the same problem.

          Bottom line: we just need to be careful about how/when the Config XML parsing methods are used (protected or otherwise)

          Show
          Hoss Man added a comment - this is solr's fault by having a getter that does some heavy duty xml shit. that sounds like some serious buck passing. All of the "get" methods on the Config class take in xpath expressions – it should be obvious to any one who uses them that they are going to do xpath parsing. By the looks of it, the SolrConfig constructor was already creating a public final "luceneMatchVersion" variable in it's constructor (using the xml parsing based COnfig method) it just wasn't getting used by the query parser. In my opinion, the correct way to solve this is to make all methods in o.a.solr.core.Config protected as they should only be called by subclasses doing the actual parsing. I don't see how that would inherently protect us from this kind of mistake. The cause of the problem came from needing public access to a "getLuceneVersion" type method on SolrConfig (which is a subclass of Config) even if all the methods in COnfig were protected, that could have very easily wound up being implemented like so ... public Version getLuceneVersion() { return super .inefficientProtectedMethod(...) } ...and we would have had the same problem. Bottom line: we just need to be careful about how/when the Config XML parsing methods are used (protected or otherwise)
          Hide
          Robert Muir added a comment -

          All of the "get" methods on the Config class take in xpath expressions – it should be obvious to any one who uses them that they are going to do xpath parsing.

          How is that obvious? There's definitely no javadoc saying this. In general, if you have an api that contains XYZ and you add a getXYZ() with absolutely no javadocs that behaves as more than a getter, thats a trap.

          So I still agree with Uwe, it should be protected to prevent problems, also it would be nice if these methods were called *parse*XYZ() instead of *get*XYZ().

          Otherwise this is going to continue to happen!

          Show
          Robert Muir added a comment - All of the "get" methods on the Config class take in xpath expressions – it should be obvious to any one who uses them that they are going to do xpath parsing. How is that obvious? There's definitely no javadoc saying this. In general, if you have an api that contains XYZ and you add a getXYZ() with absolutely no javadocs that behaves as more than a getter, thats a trap. So I still agree with Uwe, it should be protected to prevent problems, also it would be nice if these methods were called *parse*XYZ() instead of *get*XYZ(). Otherwise this is going to continue to happen!
          Hide
          Stephane Bailliez added a comment -

          The problem is hardly about naming here, it is about correctly using classes when offered the choice. Mistake was made. That's it. We expect committers to be sufficiently knowledgeable about the codebase when committing code. That's true anywhere.

          You can hardly expect a service ItemService to have methods such as:

          getItemFromDatabase() or getItemFromServerOnTheOtherSideOfThePlanet() or getItemFromFile() or getItemFromMemory() if there are 4 different implementations of it., you have getItem() and the 4 different implementation do something different internally.

          I rather actually wonder why the config is not parsed entirely at startup rather than have nodes lying around and cherry-picked.

          Show
          Stephane Bailliez added a comment - The problem is hardly about naming here, it is about correctly using classes when offered the choice. Mistake was made. That's it. We expect committers to be sufficiently knowledgeable about the codebase when committing code. That's true anywhere. You can hardly expect a service ItemService to have methods such as: getItemFromDatabase() or getItemFromServerOnTheOtherSideOfThePlanet() or getItemFromFile() or getItemFromMemory() if there are 4 different implementations of it., you have getItem() and the 4 different implementation do something different internally. I rather actually wonder why the config is not parsed entirely at startup rather than have nodes lying around and cherry-picked.
          Hide
          Uwe Schindler added a comment - - edited

          The cause of the problem came from needing public access to a "getLuceneVersion" type method on SolrConfig (which is a subclass of Config)

          This is not true. getLuceneVersion is in Config not SolrConfig and its public like all the other getXxx() methods. Version is just a datatype like int/float/String. Thats all. It does not need to be public (like all other getters in Config class).

          In general the bad thing about the whole config stuff in solr is mixing parsing and value holder. This should theoretically separate classes. So SolrConfig has no parse methods at all. In its ctor it would simply instantiate the ConfigParser (name the class like that) and use it to set the values in SolrConfig. That would be cotrrect design.

          The good thing with this design: One could instantiate a SolrConfig and populate it programmatically or via a JSON parser or whatever.

          Show
          Uwe Schindler added a comment - - edited The cause of the problem came from needing public access to a "getLuceneVersion" type method on SolrConfig (which is a subclass of Config) This is not true. getLuceneVersion is in Config not SolrConfig and its public like all the other getXxx() methods. Version is just a datatype like int/float/String. Thats all. It does not need to be public (like all other getters in Config class). In general the bad thing about the whole config stuff in solr is mixing parsing and value holder. This should theoretically separate classes. So SolrConfig has no parse methods at all. In its ctor it would simply instantiate the ConfigParser (name the class like that) and use it to set the values in SolrConfig. That would be cotrrect design. The good thing with this design: One could instantiate a SolrConfig and populate it programmatically or via a JSON parser or whatever.
          Hide
          Uwe Schindler added a comment -

          I rather actually wonder why the config is not parsed entirely at startup rather than have nodes lying around and cherry-picked.

          It is mostly and should. The problem here is as noted before: SolrConfig subclasses Config which is only for parsing. SolrConfig should simply sublass Object and simply instantiate a parser on ctor to parse and store all parsed content in itsself. After that the parser is useless and can be freed. This would even free the DTM/DOM staying alive until Solr shuts down.

          Show
          Uwe Schindler added a comment - I rather actually wonder why the config is not parsed entirely at startup rather than have nodes lying around and cherry-picked. It is mostly and should. The problem here is as noted before: SolrConfig subclasses Config which is only for parsing. SolrConfig should simply sublass Object and simply instantiate a parser on ctor to parse and store all parsed content in itsself. After that the parser is useless and can be freed. This would even free the DTM/DOM staying alive until Solr shuts down.
          Hide
          Yonik Seeley added a comment -

          How is that obvious?

          The signature of these methods might be a tip-off:
          public double getDouble(String path, double def)

          One can't be passing a String and have no idea what the string is used for

          Show
          Yonik Seeley added a comment - How is that obvious? The signature of these methods might be a tip-off: public double getDouble(String path, double def) One can't be passing a String and have no idea what the string is used for
          Hide
          Uwe Schindler added a comment -

          We should not again fight here against each other. The problem is fixed, we could release 3.1.1 if we fixed the last slowdown in MultiPhraseQuery.

          The discussion here is just about how to prevent this. For me as a non-Solr comitter, when I did this code with Robert last year, I was also really confused about the design of Config (and in my opinion this is a wrong design). We should maybe open another issue and separate parsing and value-holding in two spearate classes (SolrConfig and ConfigParser). If we would do this all is solved (see above).

          Show
          Uwe Schindler added a comment - We should not again fight here against each other. The problem is fixed, we could release 3.1.1 if we fixed the last slowdown in MultiPhraseQuery. The discussion here is just about how to prevent this. For me as a non-Solr comitter, when I did this code with Robert last year, I was also really confused about the design of Config (and in my opinion this is a wrong design). We should maybe open another issue and separate parsing and value-holding in two spearate classes (SolrConfig and ConfigParser). If we would do this all is solved (see above).
          Hide
          Ryan McKinley added a comment -

          I was also really confused about the design of Config (and in my opinion this is a wrong design)

          Like many things in solr/lucene the current design is the product many incremental back-compatible changes – not a top down view of what it should be. I would love to use 4.0 as a chance to revisit configs and their relationship to xml/validation etc, but that is a load of work with very little glory...

          Show
          Ryan McKinley added a comment - I was also really confused about the design of Config (and in my opinion this is a wrong design) Like many things in solr/lucene the current design is the product many incremental back-compatible changes – not a top down view of what it should be. I would love to use 4.0 as a chance to revisit configs and their relationship to xml/validation etc, but that is a load of work with very little glory...
          Hide
          Uwe Schindler added a comment -

          Ryan: I agree, this is why I always bring this up. With 4.0 we can reimplement APIs.

          On the other hand: I thought Solr's backwards policy is about public HTTP-REST-APIs, why care on implementation details behind, why do we need to keep backwards? This is just a dumb question I never understood. As long as Solr behaves identical to the outside who cares if we change method signatures/class names?

          Show
          Uwe Schindler added a comment - Ryan: I agree, this is why I always bring this up. With 4.0 we can reimplement APIs. On the other hand: I thought Solr's backwards policy is about public HTTP-REST-APIs, why care on implementation details behind, why do we need to keep backwards? This is just a dumb question I never understood. As long as Solr behaves identical to the outside who cares if we change method signatures/class names?
          Hide
          Ryan McKinley added a comment -

          I thought Solr's backwards policy is about public HTTP-REST-APIs, why care on implementation details behind, why do we need to keep backwards? This is just a dumb question I never understood. As long as Solr behaves identical to the outside who cares if we change method signatures/class names?

          Yes, the compatibility priorities are for HTTP-REST APIs, next is probably the external config files – since changing them may be a hassel.

          In 4.0, we should be able to change the internal method signatures/class names pretty easily. If changing getDouble() to readDouble() makes things more clear, I'm +1 – but it still feels like a band-aid, one more incremental improvement.

          Long term, i would love to see the custom config system we have replaced with something standard... like spring, or simly POJOs that are loaded (and saved!) via XStream. This is the bigger pile of work I was referring to.

          Show
          Ryan McKinley added a comment - I thought Solr's backwards policy is about public HTTP-REST-APIs, why care on implementation details behind, why do we need to keep backwards? This is just a dumb question I never understood. As long as Solr behaves identical to the outside who cares if we change method signatures/class names? Yes, the compatibility priorities are for HTTP-REST APIs, next is probably the external config files – since changing them may be a hassel. In 4.0, we should be able to change the internal method signatures/class names pretty easily. If changing getDouble() to readDouble() makes things more clear, I'm +1 – but it still feels like a band-aid, one more incremental improvement. Long term, i would love to see the custom config system we have replaced with something standard... like spring, or simly POJOs that are loaded (and saved!) via XStream. This is the bigger pile of work I was referring to.
          Hide
          Yonik Seeley added a comment -

          Ryan: yep... +1 to all that.

          Show
          Yonik Seeley added a comment - Ryan: yep... +1 to all that.
          Hide
          Michael McCandless added a comment -

          Long term, i would love to see the custom config system we have replaced with something standard... like spring, or simly POJOs that are loaded (and saved!) via XStream. This is the bigger pile of work I was referring to.

          +1

          I think XML is an poor configuration language. It's great for one computer to talk to another, but for files that humans may edit, it's bad – too much stuff to type for the computer's benefit, too easy to make a silly mistake.

          I think something like Yaml is a better choice... this is what ElasticSearch uses, for example.

          And, while we're at it, I think we should make Solr's error checking brittle on startup: if anything is "off" about the configuration, the server refuses to start (see http://markmail.org/thread/ywkfmxjboyixkrjc).

          Show
          Michael McCandless added a comment - Long term, i would love to see the custom config system we have replaced with something standard... like spring, or simly POJOs that are loaded (and saved!) via XStream. This is the bigger pile of work I was referring to. +1 I think XML is an poor configuration language. It's great for one computer to talk to another, but for files that humans may edit, it's bad – too much stuff to type for the computer's benefit, too easy to make a silly mistake. I think something like Yaml is a better choice... this is what ElasticSearch uses, for example. And, while we're at it, I think we should make Solr's error checking brittle on startup: if anything is "off" about the configuration, the server refuses to start (see http://markmail.org/thread/ywkfmxjboyixkrjc ).
          Hide
          Jan Høydahl added a comment -

          +1

          @Michael, agree on this. But instead of relying on a monolithic solrconfig.xml file or .yml file, isn't it better to re-design configuration to fit a path/node concept more fine-grained (like ZK nodes)? It doesn't feel quite right to store solrconfig.xml and schema.xml as a huge string in the SolrCloud ZK schema. It would be better to have stuff like /solr/configs/configA/general/abortOnConfigurationError=false as a separate config node. Likewise /solr/configs/configA/schema/types/text_en to define fieldType text_en. The config concept won't need to be bound to ZK either. There could be pluggable backend implementations, where one could read/write the existing XML formats.

          Show
          Jan Høydahl added a comment - +1 @Michael, agree on this. But instead of relying on a monolithic solrconfig.xml file or .yml file, isn't it better to re-design configuration to fit a path/node concept more fine-grained (like ZK nodes)? It doesn't feel quite right to store solrconfig.xml and schema.xml as a huge string in the SolrCloud ZK schema. It would be better to have stuff like /solr/configs/configA/general/abortOnConfigurationError=false as a separate config node. Likewise /solr/configs/configA/schema/types/text_en to define fieldType text_en. The config concept won't need to be bound to ZK either. There could be pluggable backend implementations, where one could read/write the existing XML formats.
          Hide
          Michael McCandless added a comment -

          Jan, I don't have any experience with ZooKeeper, but that sounds neat

          Show
          Michael McCandless added a comment - Jan, I don't have any experience with ZooKeeper, but that sounds neat
          Hide
          Robert Muir added a comment -

          Bulk close for 3.2

          Show
          Robert Muir added a comment - Bulk close for 3.2

            People

            • Assignee:
              Uwe Schindler
              Reporter:
              Stephane Bailliez
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development