Solr / SOLR-3737

StempelPolishStemFilterFactory can't find resource '/org/apache/lucene/analysis/pl/stemmer_20000.tbl'

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.0-BETA
    • Fix Version/s: 4.0, 5.0
    • Component/s: Schema and Analysis
    • Labels:
      None

      Description

      The Stempel stemmer appears to be broken under Solr in v4.0.0-BETA, very likely related to LUCENE-2510 / LUCENE-4044.

      When I add the following to the example, I get the below-listed exception on start-up:

      solrconfig.xml
      <lib dir="../../../contrib/analysis-extras/lucene-libs/" regex=".*\.jar"/>
      
      schema.xml
      <fields>
        <field name="content" type="text_pl" indexed="true" stored="false" multiValued="true"/>
      [...]
      <types>
        <fieldType name="text_pl" class="solr.TextField" positionIncrementGap="100">
          <analyzer>
            <tokenizer class="solr.StandardTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.StempelPolishStemFilterFactory"/>
          </analyzer>
        </fieldType>
      
      Solr console output
      [...]
      INFO: Adding 'file:/C:/temp/apache-solr-4.0.0-BETA/contrib/analysis-extras/lucene-libs/lucene-analyzers-stempel-4.0.0-BETA.jar' to classloader
      [...]
      SEVERE: null:java.lang.RuntimeException: java.io.IOException: Can't find resource '/org/apache/lucene/analysis/pl/stemmer_20000.tbl' in classpath or 'solr\collection1\conf/', cwd=C:\temp\apache-solr-4.0.0-BETA\example
              at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:116)
              at org.apache.solr.core.CoreContainer.create(CoreContainer.java:850)
              at org.apache.solr.core.CoreContainer.load(CoreContainer.java:539)
              at org.apache.solr.core.CoreContainer.load(CoreContainer.java:360)
              at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:309)
              at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:106)
      [...]
      Caused by: java.io.IOException: Can't find resource '/org/apache/lucene/analysis/pl/stemmer_20000.tbl' in classpath or 'solr\collection1\conf/', cwd=C:\temp\apache-solr-4.0.0-BETA\example
              at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:314)
              at org.apache.lucene.analysis.stempel.StempelPolishStemFilterFactory.inform(StempelPolishStemFilterFactory.java:42)
              at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:613)
              at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:114)
              ... 44 more
      

        Activity

        Uwe Schindler added a comment -

        Closed after release.

        Robert Muir added a comment -

        Thanks, nice catch!

        Uwe Schindler added a comment -

        +1

        Steve Rowe added a comment -

        This bug was initially reported by sausarkar on the solr-user mailing list: http://search-lucene.com/m/ZzUr2X927G?subj=Solr4+0+BETA+Error+when+StempelPolishStemFilterFactory

        Steve Rowe added a comment -

        Here's a patch

        +1

        Tests pass, and Solr startup doesn't trigger the resource loading exception.

        Uwe Schindler added a comment -

        I agree, using the data structure multiple times is wrong. It should load its .tbl file via Class.getResourceAsStream(relativepath).

        Uwe Schindler added a comment -

        Sorry, Solr is correct. When passed directly to the ResourceLoader, the path is correct. The problem is how ClasspathResourceLoader handles it: it uses Class.getResource() to load, and that's wrong here, because that method treats a path as absolute only if it starts with "/". We have to fix this, and maybe fix some tests that don't give absolute paths.
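
The slash handling discussed in this comment can be sketched in a few lines. The following is an illustration only, not code from the patch: `ResourceNameDemo` and `resolveForClass` are hypothetical names, and the method merely mirrors the name-resolution rule documented for `Class.getResource(String)` (a leading "/" means absolute; otherwise the class's package path is prepended). `ClassLoader.getResource(String)` applies no such rule and expects the slash-separated name without a leading "/".

```java
final class ResourceNameDemo {

    /**
     * Mirrors the documented rule of Class.getResource(String): returns the
     * name that is handed on to the class loader for a given anchor class.
     * (Hypothetical helper for illustration; arrays and primitives are ignored.)
     */
    static String resolveForClass(Class<?> anchor, String name) {
        if (name.startsWith("/")) {
            // Absolute name: only the leading slash is dropped.
            return name.substring(1);
        }
        String className = anchor.getName();
        int lastDot = className.lastIndexOf('.');
        if (lastDot < 0) {
            // Default package: nothing to prepend.
            return name;
        }
        // Relative name: prefix with the anchor's package as a path.
        return className.substring(0, lastDot).replace('.', '/') + "/" + name;
    }

    public static void main(String[] args) {
        // Absolute and relative spellings that end up at the same loader name.
        System.out.println(resolveForClass(String.class, "/java/lang/Object.class"));
        System.out.println(resolveForClass(String.class, "Object.class"));
        // A package-qualified name WITHOUT the leading slash is treated as relative.
        System.out.println(resolveForClass(String.class, "java/lang/Object.class"));

        // The real lookups, for comparison on your own JVM.
        System.out.println(String.class.getResource("Object.class"));
        System.out.println(ClassLoader.getSystemResource("java/lang/Object.class"));
    }
}
```

Under this rule, a constant such as "/org/apache/lucene/analysis/pl/stemmer_20000.tbl" suits a `Class`-based lookup, while the same string without the leading slash suits a `ClassLoader`-based lookup, which would be consistent with the two opposite failures reported in this issue.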

        Robert Muir added a comment -

        I agree with Uwe: something seems off with the resource loader.

        But let's just not use it at all, and not load multiple copies of this data structure in RAM.

        Here's a patch
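
The patch itself is not reproduced on this page. As a rough illustration of the idea in this comment (load the table once and let every factory share it, bypassing the resource loader), here is a minimal sketch. All names in it (`SharedTableDemo`, `StemTable`, `StemFilterFactory`, `getDefaultTable`) are hypothetical stand-ins, and the actual loading step is replaced by a placeholder so the example stays self-contained.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class SharedTableDemo {

    /** Counts how often the (expensive) load actually runs. */
    static final AtomicInteger LOAD_COUNT = new AtomicInteger();

    /** Hypothetical stand-in for the stemmer table. */
    static final class StemTable {
        final String source;
        StemTable(String source) { this.source = source; }
    }

    /** Initialization-on-demand holder: the JVM runs load() once, on first access. */
    private static final class DefaultsHolder {
        static final StemTable DEFAULT_TABLE = load();
    }

    private static StemTable load() {
        LOAD_COUNT.incrementAndGet();
        // Placeholder: a real implementation would read the table here,
        // for example from a stream obtained via Class.getResourceAsStream(...).
        return new StemTable("stemmer_20000.tbl");
    }

    /** Every caller gets the same shared instance. */
    static StemTable getDefaultTable() {
        return DefaultsHolder.DEFAULT_TABLE;
    }

    /** Hypothetical factory that keeps no private copy of the table. */
    static final class StemFilterFactory {
        StemTable table() { return getDefaultTable(); }
    }

    public static void main(String[] args) {
        StemTable a = new StemFilterFactory().table();
        StemTable b = new StemFilterFactory().table();
        System.out.println("same instance: " + (a == b));
        System.out.println("loads: " + LOAD_COUNT.get());
    }
}
```

The holder idiom is one of several ways to get lazy, thread-safe, one-time initialization; an eagerly initialized static field would serve equally well if lazy loading is not needed.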

        Uwe Schindler added a comment -

        I think that's a bug in SolrResourceLoader, which handles the slash incorrectly. If you want to get a resource from the ClassLoader, it must start with "/". I'll look into it!

        Steve Rowe added a comment -

        When I remove the initial slash from the resource path:

        StempelPolishStemFilterFactory.java
        private static final String STEMTABLE = "org/apache/lucene/analysis/pl/stemmer_20000.tbl";
        

        and then jar it up and substitute it for the stempel jar in the unpacked Solr distribution, the failure goes away.

        One problem: removing the initial slash causes TestStempelPolishStemFilterFactory.testBasics() to fail:

        java.io.IOException: Resource not found: org/apache/lucene/analysis/pl/stemmer_20000.tbl
        	at __randomizedtesting.SeedInfo.seed([D7338607BCC7C31C:EAEB282B84299D6C]:0)
        	at org.apache.lucene.analysis.util.ClasspathResourceLoader.openResource(ClasspathResourceLoader.java:67)
        	at org.apache.lucene.analysis.stempel.StempelPolishStemFilterFactory.inform(StempelPolishStemFilterFactory.java:42)
        	at org.apache.lucene.analysis.stempel.TestStempelPolishStemFilterFactory.testBasics(TestStempelPolishStemFilterFactory.java:34)
        	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        

          People

          • Assignee: Unassigned
          • Reporter: Steve Rowe
          • Votes: 0
          • Watchers: 2