SOLR-4007

Morfologik dictionaries not available in Solr field type

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 4.1
    • Fix Version/s: 4.1
    • Component/s: Schema and Analysis
    • Labels:
      None

      Description

      The Polish Morfologik type does not find its dictionaries when used in Solr. To demonstrate:

      1) Add this to example/solr/collection1/conf/schema.xml:

          <!-- Polish -->
          <fieldType name="text_pl" class="solr.TextField" positionIncrementGap="100">
            <analyzer>
              <tokenizer class="solr.StandardTokenizerFactory"/>
              <filter class="solr.MorfologikFilterFactory" dictionary="MORFOLOGIK" />
            </analyzer>
          </fieldType>
      

      2) Add this to example/solr/collection1/conf/solrconfig.xml:

        <lib dir="../../../../lucene/build/analysis/morfologik/" regex=".*\.jar" />
        <lib dir="../../../contrib/analysis-extras/lib" regex=".*\.jar" />
        <lib dir="../../../dist/" regex="apache-solr-analysis-extras-\d.*\.jar" />
      

      3) Test 'text_pl' in the analysis page. You will get an exception.

      Oct 28, 2012 8:27:19 PM org.apache.solr.core.SolrCore execute
      INFO: [collection1] webapp=/solr path=/analysis/field params={analysis.showmatch=true&analysis.query=&wt=json&analysis.fieldvalue=blah+blah&analysis.fieldtype=text_pl} status=500 QTime=26 
      Oct 28, 2012 8:27:19 PM org.apache.solr.common.SolrException log
      SEVERE: null:java.lang.RuntimeException: Default dictionary resource for language 'plnot found.
      	at morfologik.stemming.Dictionary.getForLanguage(Dictionary.java:163)
      	at morfologik.stemming.PolishStemmer.<init>(PolishStemmer.java:64)
      	at org.apache.lucene.analysis.morfologik.MorfologikFilter.<init>(MorfologikFilter.java:70)
      	at org.apache.lucene.analysis.morfologik.MorfologikFilterFactory.create(MorfologikFilterFactory.java:63)
      	at org.apache.solr.handler.AnalysisRequestHandlerBase.analyzeValue(AnalysisRequestHandlerBase.java:125)
      	at org.apache.solr.handler.FieldAnalysisRequestHandler.analyzeValues(FieldAnalysisRequestHandler.java:220)
      	at org.apache.solr.handler.FieldAnalysisRequestHandler.handleAnalysisRequest(FieldAnalysisRequestHandler.java:181)
      	at org.apache.solr.handler.FieldAnalysisRequestHandler.doAnalysis(FieldAnalysisRequestHandler.java:100)
      	at 
      
      [...........]
      
      Caused by: java.io.IOException: Could not locate resource: morfologik/dictionaries/pl.dict
      	at morfologik.util.ResourceUtils.openInputStream(ResourceUtils.java:56)
      	at morfologik.stemming.Dictionary.getForLanguage(Dictionary.java:156)
      	... 38 more
      
      

      Note that morfologik-polish-1.5.3.jar does contain morfologik/dictionaries/pl.dict, so the resource exists but is not being found.
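One way to narrow down a failure like this is to ask each class loader directly whether it can see the resource. The following is a minimal sketch, not part of the original report: the class name `ResourceVisibility` and its methods are hypothetical helpers, and only the resource path is taken from the stack trace above. It compares the thread context class loader with the loader of an "anchor" class (in this scenario one would presumably pass `morfologik.stemming.PolishStemmer` as the first argument).

```java
import java.net.URL;

/** Hypothetical diagnostic helper: reports which class loaders can see a classpath resource. */
final class ResourceVisibility {

    /** Returns the resource URL as seen by the given loader, or null if it is not visible. */
    static URL find(ClassLoader loader, String resource) {
        // A null loader stands for the bootstrap loader; use the system lookup in that case.
        return loader == null
                ? ClassLoader.getSystemResource(resource)
                : loader.getResource(resource);
    }

    /** Describes visibility from the thread context loader and from the anchor class's loader. */
    static String describe(Class<?> anchor, String resource) {
        URL viaContext = find(Thread.currentThread().getContextClassLoader(), resource);
        URL viaAnchor = find(anchor.getClassLoader(), resource);
        return "context loader: " + viaContext + System.lineSeparator()
                + "anchor loader:  " + viaAnchor;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // args[0]: fully qualified anchor class name (optional)
        // args[1]: resource path (optional, defaults to the path from the stack trace)
        Class<?> anchor = args.length > 0 ? Class.forName(args[0]) : ResourceVisibility.class;
        String resource = args.length > 1 ? args[1] : "morfologik/dictionaries/pl.dict";
        System.out.println(describe(anchor, resource));
    }
}
```

If the anchor loader reports a URL while the context loader reports `null`, the jar is present but the lookup is going through the wrong loader.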

          Activity

          Dawid Weiss added a comment -

          I'll take a look; this looks like a classloader lookup order issue.

          Dawid Weiss added a comment -

          Fixed, thanks Lance!

          Lance Norskog added a comment -

          What is the change? I would like to change my OpenNLP patch to work in the same directory/jar structure.

          Dawid Weiss added a comment -

          Solr doesn't set the context class loader, and Morfologik uses it internally to look up classes. If you look at my commit you'll see that the fix is to temporarily set the context class loader to the one that loaded PolishStemmer (where the dictionaries reside). I don't know how it applies to your patch/code.
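The pattern described in this comment, temporarily installing a different context class loader and restoring the old one afterwards, can be sketched as follows. This is an illustration of the general technique only, not the code from the actual commit; `ContextClassLoaders` and `callWith` are hypothetical names.

```java
import java.util.function.Supplier;

/** Hypothetical helper: runs an action with a temporarily swapped context class loader. */
final class ContextClassLoaders {

    /**
     * Installs the given loader as the current thread's context class loader,
     * runs the action, and restores the previous loader even if the action throws.
     */
    static <T> T callWith(ClassLoader loader, Supplier<T> action) {
        Thread thread = Thread.currentThread();
        ClassLoader previous = thread.getContextClassLoader();
        thread.setContextClassLoader(loader);
        try {
            return action.get();
        } finally {
            // Always restore, so the swap never leaks to unrelated code on this thread.
            thread.setContextClassLoader(previous);
        }
    }
}
```

In the scenario from this issue, the loader argument would presumably be `PolishStemmer.class.getClassLoader()` and the action would construct the stemmer, so that the dictionary lookup happens while the loader that holds the dictionaries is in place. The `try`/`finally` matters because Solr reuses request threads, and a leaked context loader could affect later requests.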

          Markus Jelsma added a comment -

          Although changes.txt mentions this for trunk, I'm still getting it despite having the three morfologik jars in the lib dir and both analysis extras jars.

          Dawid Weiss added a comment -

          Hi Markus. I believe the fix was all right – maybe it's something else. Can you provide a repeatable scenario of this failure?

          Markus Jelsma added a comment -

          Dawid, the fix is indeed all right; it appears I had a stale jar hanging around.
          Thanks
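A stale jar of this kind can be hard to spot from the file system alone. As a rough sketch (the helper `JarLocator` is hypothetical and not something used in this issue), the JVM can be asked where a given class was actually loaded from:

```java
import java.security.CodeSource;

/** Hypothetical diagnostic helper: reports the jar or directory a class was loaded from. */
final class JarLocator {

    /** Returns the code source location of the class, or a marker if none is recorded. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // Classes loaded by the JDK's bootstrap loader typically have no code source.
        if (source == null || source.getLocation() == null) {
            return "<no code source>";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Each argument is a fully qualified class name to look up.
        for (String name : args) {
            System.out.println(name + " -> " + locationOf(Class.forName(name)));
        }
    }
}
```

Run inside the same class loader as the analysis chain, passing a name such as `org.apache.lucene.analysis.morfologik.MorfologikFilter` (taken from the stack trace above) would show whether an old copy of the jar is shadowing the fixed one.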

          Commit Tag Bot added a comment -

          [branch_4x commit] Dawid Weiss
          http://svn.apache.org/viewvc?view=revision&revision=1404044

          SOLR-4007: Morfologik dictionaries not available in Solr field type
          due to class loader lookup problems. (Lance Norskog, Dawid Weiss)


            People

            • Assignee: Dawid Weiss
            • Reporter: Lance Norskog
            • Votes: 0
            • Watchers: 3
