Solr / SOLR-3446

PatternSyntaxException Crash from Unvalidated Regular Expression Usage

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.5
    • Fix Version/s: 4.0-ALPHA
    • Component/s: None
    • Labels: None

      Description

      Solr fails to start, leaving only an unhelpful stack trace, when PatternTokenizerFactory's "pattern" attribute is set to an invalid regular expression. Pattern.compile throws a PatternSyntaxException, which is logged as-is and does not tell users which part of the configuration is wrong. It would be better to report a detailed error message. The attached patch makes this change.

      Note that the patch adds a small RegexUtil class with helper methods to determine whether a String is a valid regular expression and to generate error messages for invalid regular expressions. I feel that these helper methods are more readable than catching the PatternSyntaxException. Furthermore, they can be re-used if more bugs like this one are found.
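
      For illustration, the two helpers might look roughly like the sketch below. The method names isRegex and regexError are taken from the review comment on this issue; the signatures, return conventions, and class layout are assumptions, since the patch itself is not reproduced here.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical sketch: the real RegexUtil in SOLR-3446.patch is not shown in this report.
final class RegexUtil {
    private RegexUtil() {}

    /** Returns true if {@code s} compiles as a java.util.regex pattern. */
    static boolean isRegex(String s) {
        try {
            Pattern.compile(s);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    /** Returns the syntax error message for {@code s}, or null if {@code s} is valid. */
    static String regexError(String s) {
        try {
            Pattern.compile(s);
            return null;
        } catch (PatternSyntaxException e) {
            return e.getMessage();
        }
    }
}
```

      A caller that checks isRegex and then asks for regexError compiles the pattern twice and receives only the message string, not the exception object itself.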

      Steps to reproduce:

      1. Apply bug.patch
        • Note that this sets PatternTokenizerFactory's pattern attribute to an invalid regular expression.
      2. Run 'ant run-example' from the solr folder
      3. See exception in console output on startup:
      Apr 3, 2012 2:07:29 PM org.apache.solr.common.SolrException log
      SEVERE: java.util.regex.PatternSyntaxException: Unclosed group near index 1
      (
       ^
             at java.util.regex.Pattern.error(Pattern.java:1713)
             at java.util.regex.Pattern.accept(Pattern.java:1571)
             at java.util.regex.Pattern.group0(Pattern.java:2533)
             at java.util.regex.Pattern.sequence(Pattern.java:1806)
             at java.util.regex.Pattern.expr(Pattern.java:1752)
             at java.util.regex.Pattern.compile(Pattern.java:1460)
             at java.util.regex.Pattern.<init>(Pattern.java:1133)
             at java.util.regex.Pattern.compile(Pattern.java:847)
             at org.apache.solr.analysis.PatternTokenizerFactory.init(PatternTokenizerFactory.java:90)
             at org.apache.solr.schema.IndexSchema$5.init(IndexSchema.java:901)
             at org.apache.solr.schema.IndexSchema$5.init(IndexSchema.java:890)
             at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:148)
             at org.apache.solr.schema.IndexSchema.readAnalyzer(IndexSchema.java:910)
             at org.apache.solr.schema.IndexSchema.access$100(IndexSchema.java:62)
             at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:450)
             at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:435)
             at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:140)
             at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:480)
             at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:125)
             at org.apache.solr.core.CoreContainer.create(CoreContainer.java:461)
             at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316)
             at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207)
             at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:130)
             at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:94)
             at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:97)
             at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
             at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:713)
             at org.mortbay.jetty.servlet.Context.startContext(Context.java:140)
             at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1282)
             at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:518)
             at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:499)
             at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
             at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
             at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
             at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
             at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
             at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
             at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
             at org.mortbay.jetty.Server.doStart(Server.java:224)
             at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
             at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:985)
             at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
             at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
             at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
             at java.lang.reflect.Method.invoke(Method.java:597)
             at org.mortbay.start.Main.invokeMain(Main.java:194)
             at org.mortbay.start.Main.start(Main.java:534)
             at org.mortbay.start.Main.start(Main.java:441)
             at org.mortbay.start.Main.main(Main.java:119)
      

      4. Visit http://localhost:8983/solr/admin/ and see a similar message:

      HTTP ERROR 500
      
      Problem accessing /solr/admin/. Reason:
      
          Severe errors in solr configuration.
      
      Check your log files for more detailed information on what may be wrong.
      
      If you want solr to continue after configuration errors, change: 
      
       <abortOnConfigurationError>false</abortOnConfigurationError>
      
      in solr.xml
      
      ...
      
      -------------------------------------------------------------
      java.util.regex.PatternSyntaxException: Unclosed group near index 1
      (
       ^
             at java.util.regex.Pattern.error(Pattern.java:1713)
             at java.util.regex.Pattern.accept(Pattern.java:1571)
             at java.util.regex.Pattern.group0(Pattern.java:2533)
             at java.util.regex.Pattern.sequence(Pattern.java:1806)
             at java.util.regex.Pattern.expr(Pattern.java:1752)
             at java.util.regex.Pattern.compile(Pattern.java:1460)
             at java.util.regex.Pattern.<init>(Pattern.java:1133)
             at java.util.regex.Pattern.compile(Pattern.java:847)
             at org.apache.solr.analysis.PatternTokenizerFactory.init(PatternTokenizerFactory.java:90)
             at org.apache.solr.schema.IndexSchema$5.init(IndexSchema.java:901)
             at org.apache.solr.schema.IndexSchema$5.init(IndexSchema.java:890)
             at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:148)
      ...
      
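      The failure in step 3 can be reproduced outside Solr with a single Pattern.compile call on the string "(". The sketch below is standalone and uses only java.util.regex; the class and method names are made up for illustration. It also shows that PatternSyntaxException exposes the description, error index, and offending pattern as separate fields.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical standalone reproduction; not part of Solr or of either attached patch.
final class UnclosedGroupRepro {
    /** Compiles the regex and returns a one-line summary of the outcome. */
    static String describe(String regex) {
        try {
            Pattern.compile(regex);
            return "ok";
        } catch (PatternSyntaxException e) {
            // The exception carries its parts separately from getMessage().
            return e.getDescription() + " | index=" + e.getIndex() + " | pattern=" + e.getPattern();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("("));      // the invalid pattern from the log above
        System.out.println(describe("(a|b)"));  // a valid pattern for comparison
    }
}
```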

      After applying the patch, the following is printed to the console:

      SEVERE: org.apache.solr.common.SolrException: invalid "pattern" regular expression for PatternTokenizerFactory: Unclosed group near index 1
      (
       ^
          at org.apache.solr.analysis.PatternTokenizerFactory.init(PatternTokenizerFactory.java:90)
          at org.apache.solr.schema.IndexSchema$5.init(IndexSchema.java:901)
          at org.apache.solr.schema.IndexSchema$5.init(IndexSchema.java:890)
          at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:148)
      ...
      

      And the following similar message is shown when visiting http://localhost:8983/solr/admin/ :

      HTTP ERROR 500
      
      Problem accessing /solr/admin/. Reason:
      
          Severe errors in solr configuration.
      
      Check your log files for more detailed information on what may be wrong.
      
      If you want solr to continue after configuration errors, change: 
      
       <abortOnConfigurationError>false</abortOnConfigurationError>
      
      in solr.xml
      
      ...
      
      -------------------------------------------------------------
      org.apache.solr.common.SolrException: invalid "pattern" regular expression for PatternTokenizerFactory: Unclosed group near index 1
      (
       ^
             at org.apache.solr.analysis.PatternTokenizerFactory.init(PatternTokenizerFactory.java:90)
             at org.apache.solr.schema.IndexSchema$5.init(IndexSchema.java:901)
             at org.apache.solr.schema.IndexSchema$5.init(IndexSchema.java:890)
             at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:148)
      ...
      
      Attachments

      1. bug.patch
        0.7 kB
        Eric Spishak
      2. SOLR-3446.patch
        2 kB
        Eric Spishak

        Activity

        Hoss Man added a comment -

        Eric: I'm not really a fan of the "isRegex" and "regexError" utilities you proposed in your patch, because they throw away info about the details of the exception, and require redundant calls to Pattern.compile in situations where it's known that Pattern.compile is going to fail.

        But the crux of your bug report is spot on: PatternTokenizerFactory was dealing with the pattern init param poorly, and should have been following the model used in PatternReplaceCharFilterFactory and PatternReplaceFilterFactory. However, even with those, the general plugin loading mechanism wasn't doing a very good job of drawing attention to where exactly the problem was.

        So I've committed a fix that:

        • refactors those three factories to use a common "getPattern" method in the base class
        • improves AbstractPluginLoader to include the name of the plugin that had a problem (if known)

        Committed revision 1342489.

        Thanks for opening the bug and drawing attention to this!
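
        As a rough illustration of the two changes listed above (not the code committed in revision 1342489), the sketch below compiles the pattern once in a shared base-class helper, keeps the original PatternSyntaxException as the cause, and has the loader add the plugin name to any init failure. Only the method name getPattern comes from the comment; ConfigException is a stand-in for Solr's SolrException, and every other class name, signature, and message format here is an assumption.

```java
import java.util.Map;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Stand-in for org.apache.solr.common.SolrException so the sketch compiles without Solr.
class ConfigException extends RuntimeException {
    ConfigException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical base class: only the method name getPattern comes from the comment above.
abstract class PatternFactorySketch {
    private Map<String, String> args;

    void init(Map<String, String> args) {
        this.args = args;
    }

    /** Compiles the named init param once, keeping the original exception as the cause. */
    protected Pattern getPattern(String name) {
        String value = args.get(name);
        if (value == null) {
            throw new ConfigException("missing required init param '" + name + "'", null);
        }
        try {
            return Pattern.compile(value);
        } catch (PatternSyntaxException e) {
            throw new ConfigException("invalid '" + name + "' regular expression for "
                    + getClass().getSimpleName() + ": " + e.getDescription(), e);
        }
    }
}

// Hypothetical concrete factory that reads its "pattern" param during init.
class TokenizerFactorySketch extends PatternFactorySketch {
    Pattern pattern;

    @Override
    void init(Map<String, String> args) {
        super.init(args);
        pattern = getPattern("pattern");
    }
}

// Hypothetical loader step: adds the plugin's configured name to any init failure.
final class PluginLoaderSketch {
    static void initPlugin(String pluginName, PatternFactorySketch plugin, Map<String, String> args) {
        try {
            plugin.init(args);
        } catch (RuntimeException e) {
            throw new ConfigException("plugin init failure for '" + pluginName + "': " + e.getMessage(), e);
        }
    }
}
```

        Because the cause chain is preserved, a caller can still reach the original PatternSyntaxException, and the pattern is compiled only once.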


          People

          • Assignee: Hoss Man
          • Reporter: Eric Spishak