Solr / SOLR-744

Patch to make ShingleFilter.outputUnigramsIfNoShingles (LUCENE-1370) available in Solr schema files

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1, 4.0-ALPHA
    • Fix Version/s: 3.1, 4.0-ALPHA
    • Component/s: Schema and Analysis
    • Labels:
      None

      Description

      1. SOLR-744.patch
        3 kB
        Steve Rowe
      2. SOLR-744.patch
        1 kB
        Chris Harris

          Activity

          Tom Burton-West added a comment -

          I applied both this and LUCENE-1370 and there seems to be some problem with passing arguments from the ShingleFilterFactory to the ShingleFilter. The admin analyzer says that outputUnigramIfNoNgram=true

          org.apache.solr.analysis.ShingleFilterFactory

          {outputUnigrams=false, outputUnigramIfNoNgram=true}

          However, this does not seem to be getting set within the ShingleFilter: the admin analyzer shows nothing coming out of the ShingleFilterFactory when analyzing a query with a single word.
          When I use the admin interface to query a single word, I also get no results.

          If I hack the patch by always setting outputUnigramsIfNoNgrams to true, everything works fine.
          (see below)

          If I am missing something or obviously doing something wrong, please let me know. In the meantime I will try to write a unit test and track down the problem. Is there an already existing unit test I could use as a model?

          Tom Burton-West
          ------------------------------------------------------

          Hack

          public void init(Map<String, String> args) {
            super.init(args);
            maxShingleSize = getInt("maxShingleSize", ShingleFilter.DEFAULT_MAX_SHINGLE_SIZE);
            outputUnigrams = getBoolean("outputUnigrams", true);
            outputUnigramIfNoNgrams = true;
            /** tbw lets always set it to true above
             * comment out the original code below
             getBoolean("outputUnigramIfNoNgram", false);
             **/
          }
          Chris Harris added a comment -

          Tom,

          The Lucene half of this patch pair adds unit tests to src/test/org/apache/lucene/analysis/shingle/ShingleFilterTest.java. Do those tests pass when you run them on your custom lucene build, after applying LUCENE-1370? (cd to the top-level of lucene and then run "ant test -Dtestcase=ShingleFilterTest".) I didn't add any tests for the Solr half of the patch pair, but I also don't know how you would test it in a productive manner.

          Tom Burton-West added a comment -

          Hi Chris,

          Thanks for your kind reply. The lucene unit tests passed. It turns out that we had a configuration error that left an unpatched version of ShingleFilter on the classpath when Solr started up. Once we made sure that the patched version was loading, everything has been working just fine.

          Tom
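
          A stale class on the classpath, as described in this comment, can be confirmed by asking the JVM where a class was actually loaded from. The following is a minimal sketch using only standard JDK calls; the class name WhichJar and the helper locationOf are hypothetical and are not part of Solr or Lucene. Run inside the same JVM and classloader as Solr, it could be pointed at org.apache.lucene.analysis.shingle.ShingleFilter.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper; not part of Solr or Lucene.
public class WhichJar {

    // Returns the jar or directory a class was loaded from, or a
    // placeholder when the JVM does not expose a code source
    // (for example, for core JDK classes).
    public static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        return src == null ? "(bootstrap or unknown)" : String.valueOf(src.getLocation());
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name; defaults to a JDK class
        // so the sketch runs without Lucene on the classpath.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(name + " loaded from " + locationOf(Class.forName(name)));
    }
}
```

          If the printed location is not the patched jar, an older copy is probably shadowing it.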

          Steve Rowe added a comment -

          Updated patch to reflect changed option name from LUCENE-1370 (outputUnigramIfNoNgram -> outputUnigramsIfNoShingles). Added a simple test to TestShingleFilterFactory.java for the single input token case. Added a solr/CHANGES.txt entry.

          Unless there are objections, I will commit this in a couple of days, after LUCENE-1370 has been committed.
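
          For readers unfamiliar with the option, the single input token case can be illustrated with a small stand-alone model. This is a sketch only: ShingleSketch and its shingles method are hypothetical, operate on plain strings, and do not reproduce Lucene's ShingleFilter implementation (no token positions, offsets, or configurable separator). It assumes that outputUnigramsIfNoShingles only matters when outputUnigrams is false and the input is too short to form any shingle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical plain-Java model of the option's effect; not Lucene code.
public class ShingleSketch {

    public static List<String> shingles(List<String> tokens, int maxShingleSize,
                                        boolean outputUnigrams,
                                        boolean outputUnigramsIfNoShingles) {
        List<String> out = new ArrayList<>();
        boolean anyShingle = false;
        for (int start = 0; start < tokens.size(); start++) {
            if (outputUnigrams) {
                out.add(tokens.get(start));
            }
            // Build shingles of size 2..maxShingleSize starting at this token.
            StringBuilder sb = new StringBuilder(tokens.get(start));
            for (int size = 2; size <= maxShingleSize && start + size <= tokens.size(); size++) {
                sb.append(' ').append(tokens.get(start + size - 1));
                out.add(sb.toString());
                anyShingle = true;
            }
        }
        // Fallback: unigrams were suppressed and no shingle could be formed.
        if (!outputUnigrams && !anyShingle && outputUnigramsIfNoShingles) {
            out.addAll(tokens);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> single = Arrays.asList("solr");
        System.out.println(shingles(single, 2, false, false)); // nothing emitted
        System.out.println(shingles(single, 2, false, true));  // the lone token is emitted
        System.out.println(shingles(Arrays.asList("please", "divide"), 2, false, true));
    }
}
```

          In this model, a one-word query with outputUnigrams=false yields no tokens unless the fallback flag is set, which matches the empty analyzer output reported earlier in this issue.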

          Steve Rowe added a comment -

          Committed: trunk revision 1006191, branch_3x revision 1006199

          Grant Ingersoll added a comment -

          Bulk close for 3.1.0 release


            People

            • Assignee:
              Steve Rowe
              Reporter:
              Chris Harris
            • Votes:
              2
              Watchers:
              3
