Details

    • Type: New Feature New Feature
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.7, 6.0
    • Component/s: search
    • Labels:
      None

      Description

      The idea is to add a new Suggester Component that will eventually replace the Suggester support through the SpellCheck Component.
      This will enable Solr to fully utilize the Lucene suggester module (along with supporting most of the existing features) in the following ways:

      • Dictionary pluggability (give users the option to choose the dictionary implementation to use for their suggesters to consume)
      • Map the suggester options/ suggester result format (e.g. support for payload)
      • The new Component will also allow us to have "beefier" Lookup support instead of resorting to collation and such. (Move computation from query time to index time) with more freedom

      In addition to this, this suggester version should also have support for distributed support, which was awkward at best with the previous implementation due to SpellCheck requirements.

      Config (index time) options:

      • name - name of suggester
      • sourceLocation - external file location (for file-based suggesters)
      • lookupImpl - type of lookup to use [default JaspellLookupFactory]
      • dictionaryImpl - type of dictionary to use (lookup input) [default
        (sourceLocation == null ? HighFrequencyDictionaryFactory : FileDictionaryFactory)]
      • storeDir - location to store in-memory data structure in disk
      • buildOnCommit - command to build suggester for every commit
      • buildOnOptimize - command to build suggester for every optimize

      Query time options:

      • suggest.dictionary - name of suggester to use
      • suggest.count - number of suggestions to return
      • suggest.q - query to use for lookup
      • suggest.build - command to build the suggester
      • suggest.reload - command to reload the suggester

      Example query:

      http://localhost:8983/solr/suggest?suggest.dictionary=mySuggester&suggest=true&suggest.build=true&suggest.q=elec
      

      Distributed query:

      http://localhost:7574/solr/suggest?suggest.dictionary=mySuggester&suggest=true&suggest.build=true&suggest.q=elec&shards=localhost:8983/solr,localhost:7574/solr&shards.qt=/suggest
      

      Example Response:

      <response>
        <lst name="responseHeader">
          <int name="status">0</int>
          <int name="QTime">28</int>
        </lst>
        <str name="command">build</str>
        <result name="response" numFound="0" start="0" maxScore="0.0"/>
        <lst name="suggest">
          <lst name="suggestions">
            <lst name="e">
              <int name="numFound">2</int>
              <lst name="suggestion">
                <str name="term">electronics and computer1</str>
                <long name="weight">2199</long>
                <str name="payload"/>
              </lst>
              <lst name="suggestion">
                <str name="term">electronics</str>
                <long name="weight">649</long>
                <str name="payload"/>
              </lst>
            </lst>
          </lst>
        </lst>
      </response>
      

      Example config file:

      • Using DocumentDictionary and FuzzySuggester
        • Suggestion on product_name sorted by popularity with the additional product_id in the payload
        
        <searchComponent class="solr.SuggestComponent" name="suggest">
            <lst name="suggester">
              <str name="name">suggest_fuzzy_doc_dict</str>
              <str name="lookupImpl">FuzzyLookupFactory</str>
              <str name="dictionaryImpl">DocumentDictionaryFactory</str>
              <str name="field">product_name</str>
              <str name="weightField">popularity</str>
              <str name="payloadField">product_id</str>
              <str name="storeDir">suggest_fuzzy_doc_dict_payload</str>
              <str name="suggestAnalyzerFieldType">text</str>
            </lst>
          </searchComponent>
      
      • Using DocumentExpressionDictionary and FuzzySuggester
        • Suggestion on product_name sorted by the expression "((price * 2) + ln(popularity))" (where both price and popularity are fields in the document)
          <searchComponent class="solr.SuggestComponent" name="suggest">
            <lst name="suggester">
              <str name="name">suggest_fuzzy_doc_expr_dict</str>
              <str name="dictionaryImpl">DocumentExpressionDictionaryFactory</str>
              <str name="lookupImpl">FuzzyLookupFactory</str>
              <str name="field">product_name</str>
              <str name="weightExpression">((price * 2) + ln(popularity))</str>
              <str name="sortField">weight</str>
              <str name="sortField">price</str>
              <str name="strtoreDir">suggest_fuzzy_doc_expr_dict</str>
              <str name="suggestAnalyzerFieldType">text</str>
            </lst>
          </searchComponent>
      
      1. SOLR-5378.patch
        88 kB
        Shalin Shekhar Mangar
      2. SOLR-5378.patch
        86 kB
        Areek Zillur
      3. SOLR-5378.patch
        90 kB
        Varun Thacker
      4. SOLR-5378.patch
        87 kB
        Areek Zillur
      5. SOLR-5378.patch
        87 kB
        Areek Zillur
      6. SOLR-5378.patch
        86 kB
        Areek Zillur
      7. SOLR-5378.patch
        85 kB
        Areek Zillur
      8. SOLR-5378.patch
        85 kB
        Areek Zillur
      9. SOLR-5378.patch
        84 kB
        Areek Zillur
      10. SOLR-5378.patch
        85 kB
        Areek Zillur
      11. SOLR-5378.patch
        77 kB
        Areek Zillur
      12. SOLR-5378.patch
        63 kB
        Areek Zillur
      13. SOLR-5378.patch
        67 kB
        Areek Zillur
      14. SOLR-5378-maven-fix.patch
        1 kB
        Shalin Shekhar Mangar

        Issue Links

          Activity

          Hide
          Areek Zillur added a comment -

          Initial Patch:

          • implements SuggestComponent and SolrSuggester (along with classes for results, params, etc)

          TODO:

          • add more tests
          • fix hard coded defaults
          • add documentation
          Show
          Areek Zillur added a comment - Initial Patch: implements SuggestComponent and SolrSuggester (along with classes for results, params, etc) TODO: add more tests fix hard coded defaults add documentation
          Hide
          Areek Zillur added a comment -

          Note: the dictionary implementations still have spellcheck parameters in the tests, will fix in next patch.

          Show
          Areek Zillur added a comment - Note: the dictionary implementations still have spellcheck parameters in the tests, will fix in next patch.
          Hide
          Areek Zillur added a comment -

          Patch:

          • nuked irreverent Dictionary Factories for suggesters (PlainTextDictionaryFactory and LuceneDictionaryFactory (replaced by more general DocumentDictionaryFactory)
          • fixed tests for Dictionary Factories
          • added Documentation
          • fixed hard-coded strings and defaults

          TODO:

          • add more documentation (SuggesterParams)
          • add examples
          • add tests for the SolrSuggester and SuggesterComponent
          Show
          Areek Zillur added a comment - Patch: nuked irreverent Dictionary Factories for suggesters (PlainTextDictionaryFactory and LuceneDictionaryFactory (replaced by more general DocumentDictionaryFactory) fixed tests for Dictionary Factories added Documentation fixed hard-coded strings and defaults TODO: add more documentation (SuggesterParams) add examples add tests for the SolrSuggester and SuggesterComponent
          Hide
          Areek Zillur added a comment - - edited

          Uploaded Patch:

          • added documentation
          • added expressions dependency to Solr [needed by DocumentExpressionDictionaryFactory]
          • added SuggesterComponent tests
          • minor fixes all over

          TODO:

          • add DistributedSuggesterComponent tests

          Was wondering whether its worth it to have the onlyMorePopular option for the suggester. (default is false), when true the option sorts the suggestions alphabetically at the moment. It seems to me that this might not be very useful for suggestions (as there is no relation between lexicographic ordering and suggestion relevance). It makes sense to have the suggestions sorted by weights provided instead (default behavior). It is also to be noted that a lot of the new Lucene suggester rejects or ignores this option anyway, defaulting to sorting by weight instead (Analyzing, FuzzyAnalyzing rejects it and AnalyzingInfix ignores it).

          Show
          Areek Zillur added a comment - - edited Uploaded Patch: added documentation added expressions dependency to Solr [needed by DocumentExpressionDictionaryFactory] added SuggesterComponent tests minor fixes all over TODO: add DistributedSuggesterComponent tests Was wondering whether its worth it to have the onlyMorePopular option for the suggester. (default is false), when true the option sorts the suggestions alphabetically at the moment. It seems to me that this might not be very useful for suggestions (as there is no relation between lexicographic ordering and suggestion relevance). It makes sense to have the suggestions sorted by weights provided instead (default behavior). It is also to be noted that a lot of the new Lucene suggester rejects or ignores this option anyway, defaulting to sorting by weight instead (Analyzing, FuzzyAnalyzing rejects it and AnalyzingInfix ignores it).
          Hide
          Areek Zillur added a comment -

          Updated Patch:

          • Added distributed tests [completed all functional requirements]
          • minor changes all over
          • Nuked onlyMorePopular [always defaults to false]
          • Added statistics to component to be visible by admin page

          TODO:

          Show
          Areek Zillur added a comment - Updated Patch: Added distributed tests [completed all functional requirements] minor changes all over Nuked onlyMorePopular [always defaults to false] Added statistics to component to be visible by admin page TODO: polish test cases comment in sizeInBytes for SuggestComponent ( https://issues.apache.org/jira/browse/LUCENE-5323 )
          Hide
          Areek Zillur added a comment -

          Updated Patch:

          • added sizeInBytes for individual suggester and the suggester component
          • minor refactoring and cleanup
          • added documentation
          Show
          Areek Zillur added a comment - Updated Patch: added sizeInBytes for individual suggester and the suggester component minor refactoring and cleanup added documentation
          Hide
          Alexey Serba added a comment -

          In addition to this, this suggester version should also have support for distributed support, which was awkward at best with the previous implementation due to SpellCheck requirements.

          It seems that it is required to have query component registered in a request handler for suggester component to work correctly in distributed mode. Try to change last-components=suggest to components=suggest in solrconfig and I believe it won't work.

          I'm wondering what are the performance implications of having query (and others) components in a suggest request handler?

          Show
          Alexey Serba added a comment - In addition to this, this suggester version should also have support for distributed support, which was awkward at best with the previous implementation due to SpellCheck requirements. It seems that it is required to have query component registered in a request handler for suggester component to work correctly in distributed mode. Try to change last-components=suggest to components=suggest in solrconfig and I believe it won't work. I'm wondering what are the performance implications of having query (and others) components in a suggest request handler?
          Hide
          Areek Zillur added a comment -

          Updated Patch:

          • Added (independent) Distributed support for suggester
          • new test cases
          Show
          Areek Zillur added a comment - Updated Patch: Added (independent) Distributed support for suggester new test cases
          Hide
          Areek Zillur added a comment -

          Alexey Serba: Thanks for the insightful comment!
          The suggester should work independently (without any other component) in the distributed case. I added a new patch which will allow the suggester component to do so.

          Show
          Areek Zillur added a comment - Alexey Serba: Thanks for the insightful comment! The suggester should work independently (without any other component) in the distributed case. I added a new patch which will allow the suggester component to do so.
          Hide
          Shalin Shekhar Mangar added a comment -

          This is looking awesome Areek! Thanks for contributing this. I'm going to spend some time this week going over this.

          Show
          Shalin Shekhar Mangar added a comment - This is looking awesome Areek! Thanks for contributing this. I'm going to spend some time this week going over this.
          Hide
          Areek Zillur added a comment -

          Thanks for taking the time to look through this, Shalin! Let me know if the current patch can be improved in anyway.

          Show
          Areek Zillur added a comment - Thanks for taking the time to look through this, Shalin! Let me know if the current patch can be improved in anyway.
          Hide
          Areek Zillur added a comment -

          Updated Patch:

          • added fileDelimiter parameter (LUCENE-5337)
          • added null checks
          • minor fixes

          Hey Shalin, was wondering if you got a chance to look at the patch? I did some minor updates to the patch to add the newly-added lucene functionality

          Show
          Areek Zillur added a comment - Updated Patch: added fileDelimiter parameter ( LUCENE-5337 ) added null checks minor fixes Hey Shalin, was wondering if you got a chance to look at the patch? I did some minor updates to the patch to add the newly-added lucene functionality
          Hide
          Shalin Shekhar Mangar added a comment -

          Hi Areek, yes, I'm reviewing your patch. It is just taking more time than I thought it would. I really like what I am seeing so far. This is quality work.

          Show
          Shalin Shekhar Mangar added a comment - Hi Areek, yes, I'm reviewing your patch. It is just taking more time than I thought it would. I really like what I am seeing so far. This is quality work.
          Hide
          Areek Zillur added a comment -

          Thanks! thats awesome. Let me know if you think the patch can be improved in any way!

          Show
          Areek Zillur added a comment - Thanks! thats awesome. Let me know if you think the patch can be improved in any way!
          Hide
          Areek Zillur added a comment -

          Regarding the backward compatibility stuff for lookups (in SolrSuggester.init). Do you think it make sense to keep it? I am thinking because its a new component altogether, can we just get rid of it and set a sane default?

          Show
          Areek Zillur added a comment - Regarding the backward compatibility stuff for lookups (in SolrSuggester.init). Do you think it make sense to keep it? I am thinking because its a new component altogether, can we just get rid of it and set a sane default?
          Hide
          Shalin Shekhar Mangar added a comment -

          Regarding the backward compatibility stuff for lookups (in SolrSuggester.init). Do you think it make sense to keep it? I am thinking because its a new component altogether, can we just get rid of it and set a sane default?

          I agree. Let's not carry over cruft to a new component. A sane default is fine.

          Show
          Shalin Shekhar Mangar added a comment - Regarding the backward compatibility stuff for lookups (in SolrSuggester.init). Do you think it make sense to keep it? I am thinking because its a new component altogether, can we just get rid of it and set a sane default? I agree. Let's not carry over cruft to a new component. A sane default is fine.
          Hide
          Shalin Shekhar Mangar added a comment -

          Minor change: I had to change sortFieldTypeStr.toLowerCase() in DocumentExpressionDictionaryFactory to sortFieldTypeStr.toLowerCase(Locale.ROOT) to make the forbidden-api check pass.

          Show
          Shalin Shekhar Mangar added a comment - Minor change: I had to change sortFieldTypeStr.toLowerCase() in DocumentExpressionDictionaryFactory to sortFieldTypeStr.toLowerCase(Locale.ROOT) to make the forbidden-api check pass.
          Hide
          Areek Zillur added a comment -

          Updated Patch:

          • adding sane Lookup default
          • added minor change to DocExprDictFactory (as suggested)
          Show
          Areek Zillur added a comment - Updated Patch: adding sane Lookup default added minor change to DocExprDictFactory (as suggested)
          Hide
          Areek Zillur added a comment -

          Added docs for LookupFactory (in previous patch). Ensured ant validate && ant documentation-lint succeds.

          Show
          Areek Zillur added a comment - Added docs for LookupFactory (in previous patch). Ensured ant validate && ant documentation-lint succeds.
          Hide
          Varun Thacker added a comment -

          I was going through the patch and in DocumentExpressionDictionaryFactory the user has to enter the field type along with the field name. For example,

                <lst name="sortField">
                	<str name="name">weight</str>
                	<str name="type">float</str>
                </lst>
          

          Is asking for field type necessary?
          We could do something like this instead

          String sortFieldTypeStr = core.getLatestSchema().getField(sortFieldName).getType().getTypeName();
          Show
          Varun Thacker added a comment - I was going through the patch and in DocumentExpressionDictionaryFactory the user has to enter the field type along with the field name. For example, <lst name="sortField"> <str name="name">weight</str> <str name="type">float</str> </lst> Is asking for field type necessary? We could do something like this instead String sortFieldTypeStr = core.getLatestSchema().getField(sortFieldName).getType().getTypeName();
          Hide
          Varun Thacker added a comment -

          Also I hit a NPE with

          ant test  -Dtestcase=TestLuceneMatchVersion -Dtests.method=testStandardTokenizerVersions
          

          This is because schema-luceneMatchVersion.xml doesn't not contain "text" fieldType and FuzzyLookupFactory.create assumes that "suggestAnalyzerFieldType" field type exists.

          Show
          Varun Thacker added a comment - Also I hit a NPE with ant test -Dtestcase=TestLuceneMatchVersion -Dtests.method=testStandardTokenizerVersions This is because schema-luceneMatchVersion.xml doesn't not contain "text" fieldType and FuzzyLookupFactory.create assumes that "suggestAnalyzerFieldType" field type exists.
          Hide
          Areek Zillur added a comment - - edited

          Updated Patch:

          • added checks in *AnalyzingLookupFactories to ensure that the fieldType referred by "suggestAnalyzerFieldType" is defined in the schema [also affects the suggester in spellchecker]
          • use solrconfig-suggestercomponent for distributed and normal tests, rather than polluting the solrconfig file used throughout. [makes sure no arbitrary tests fail because of using the config with another schema]
          Show
          Areek Zillur added a comment - - edited Updated Patch: added checks in *AnalyzingLookupFactories to ensure that the fieldType referred by "suggestAnalyzerFieldType" is defined in the schema [also affects the suggester in spellchecker] use solrconfig-suggestercomponent for distributed and normal tests, rather than polluting the solrconfig file used throughout. [makes sure no arbitrary tests fail because of using the config with another schema]
          Hide
          Areek Zillur added a comment -

          Thanks Varun for your comments!

          I think its better to be explicit in the case of DocExprDictFactory and let the users specify the fieldtype for all the bindings in the expressions they use. I was aware of the suggested method of pulling the fieldTypeName but the problem is if a user configures it such that the name of the fieldType (expl: double field) as myDouble, then there would be no easy way of figuring out what field type to use in the sortFieldType in (DocExprDict) [myDouble != double].

          Regarding the NPE, thanks for catching that! I have updated the patch such that no such NPEs occur in "unrelated" tests.

          Show
          Areek Zillur added a comment - Thanks Varun for your comments! I think its better to be explicit in the case of DocExprDictFactory and let the users specify the fieldtype for all the bindings in the expressions they use. I was aware of the suggested method of pulling the fieldTypeName but the problem is if a user configures it such that the name of the fieldType (expl: double field) as myDouble, then there would be no easy way of figuring out what field type to use in the sortFieldType in (DocExprDict) [myDouble != double] . Regarding the NPE, thanks for catching that! I have updated the patch such that no such NPEs occur in "unrelated" tests.
          Hide
          Areek Zillur added a comment -

          Updated patch (minor):

          • added configuration error message for HighFrequencyDictionaryFactory
          Show
          Areek Zillur added a comment - Updated patch (minor): added configuration error message for HighFrequencyDictionaryFactory
          Hide
          Varun Thacker added a comment -

          Hi Areek,

          Maybe I wasn't clear with how we can avoid the user to put in the "type" of the fieldName required for sort when using DocumentExpressionDictionaryFactory

          I have uploaded a patch where providing "type" is not required.

          This should be okay right?

          Show
          Varun Thacker added a comment - Hi Areek, Maybe I wasn't clear with how we can avoid the user to put in the "type" of the fieldName required for sort when using DocumentExpressionDictionaryFactory I have uploaded a patch where providing "type" is not required. This should be okay right?
          Hide
          Varun Thacker added a comment -

          First time attempt at a patch using git.
          I committed all the changes and did a

          git format-patch -1 <sha>

          Please let me know if it does not apply cleanly.

          Show
          Varun Thacker added a comment - First time attempt at a patch using git. I committed all the changes and did a git format-patch -1 <sha> Please let me know if it does not apply cleanly.
          Hide
          Areek Zillur added a comment -

          Updated Patch:

          • Incorporated Varun's changes (Thanks Varun)
          • Simplified DocExpDict config
          • Minor refactoring
          Show
          Areek Zillur added a comment - Updated Patch: Incorporated Varun's changes (Thanks Varun) Simplified DocExpDict config Minor refactoring
          Hide
          Areek Zillur added a comment -

          Thanks Varun for the improvement! It makes total sense [for now will leave out the 'score' type for sortFields].
          FYI: I usually do

          git diff --no-prefix trunk > SOLR-5378.patch

          from the git branch I work in (that way its friendlier to svn)

          Updated Description with the new DocumentExpressionDictionaryConfig.

          Show
          Areek Zillur added a comment - Thanks Varun for the improvement! It makes total sense [for now will leave out the 'score' type for sortFields] . FYI: I usually do git diff --no-prefix trunk > SOLR-5378.patch from the git branch I work in (that way its friendlier to svn) Updated Description with the new DocumentExpressionDictionaryConfig.
          Hide
          Areek Zillur added a comment -

          NOTE: There is a silent IllegalArgumentException thrown from Lucene's DocumentDictionary & DocumentExpressionDictionary when the dictionaries are fed with readers with no documents. This should not be a big issue as nothing is expected to happen anyways when a suggester is built with no documents. LUCENE-5329 will fix this issue, once its checked in.

          Show
          Areek Zillur added a comment - NOTE: There is a silent IllegalArgumentException thrown from Lucene's DocumentDictionary & DocumentExpressionDictionary when the dictionaries are fed with readers with no documents. This should not be a big issue as nothing is expected to happen anyways when a suggester is built with no documents. LUCENE-5329 will fix this issue, once its checked in.
          Hide
          Shalin Shekhar Mangar added a comment -

          This patch removes some unused imports. I'll commit this shortly.

          Show
          Shalin Shekhar Mangar added a comment - This patch removes some unused imports. I'll commit this shortly.
          Hide
          ASF subversion and git services added a comment -

          Commit 1544793 from shalin@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1544793 ]

          SOLR-5378: A new SuggestComponent that fully utilizes the Lucene suggester module and adds pluggable dictionaries, payloads and better distributed support

          Show
          ASF subversion and git services added a comment - Commit 1544793 from shalin@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1544793 ] SOLR-5378 : A new SuggestComponent that fully utilizes the Lucene suggester module and adds pluggable dictionaries, payloads and better distributed support
          Hide
          ASF subversion and git services added a comment -

          Commit 1544796 from shalin@apache.org in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1544796 ]

          SOLR-5378: A new SuggestComponent that fully utilizes the Lucene suggester module and adds pluggable dictionaries, payloads and better distributed support

          Show
          ASF subversion and git services added a comment - Commit 1544796 from shalin@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1544796 ] SOLR-5378 : A new SuggestComponent that fully utilizes the Lucene suggester module and adds pluggable dictionaries, payloads and better distributed support
          Hide
          ASF subversion and git services added a comment -

          Commit 1544815 from shalin@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1544815 ]

          SOLR-5378: Fix compile issues on Java6

          Show
          ASF subversion and git services added a comment - Commit 1544815 from shalin@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1544815 ] SOLR-5378 : Fix compile issues on Java6
          Hide
          ASF subversion and git services added a comment -

          Commit 1544816 from shalin@apache.org in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1544816 ]

          SOLR-5378: Fix compile issues on Java6

          Show
          ASF subversion and git services added a comment - Commit 1544816 from shalin@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1544816 ] SOLR-5378 : Fix compile issues on Java6
          Hide
          Areek Zillur added a comment -

          Thanks Shalin for committing this!

          Show
          Areek Zillur added a comment - Thanks Shalin for committing this!
          Hide
          Steve Rowe added a comment - - edited

          The failures yesterday of the Maven Jenkins build on trunk (https://builds.apache.org/job/Lucene-Solr-Maven-trunk/1033/) and branch_4x (https://builds.apache.org/job/Lucene-Solr-Maven-4.x/514/) are caused by missing license files for newly introduced indirect compile-time Solr dependencies on the expression module's dependencies in lucene/expressions/lib/*: antlr-runtime-3.5.jar, asm-4.1.jar, and asm-commons-4.1.jar.

          Here's where the indirect deps were introduced on trunk: https://svn.apache.org/viewvc/lucene/dev/trunk/solr/common-build.xml?r1=1544793&r2=1544792&pathrev=1544793&diff_format=f#l78.

          Here's the failure message from the Jenkins Maven trunk build log:

           [licenses] MISSING sha1 checksum file for: /home/hudson/.m2/repository/org/antlr/antlr-runtime/3.5/antlr-runtime-3.5.jar
           [licenses] EXPECTED sha1 checksum file : /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/solr/licenses/antlr-runtime-3.5.jar.sha1
           [licenses] MISSING sha1 checksum file for: /home/hudson/.m2/repository/org/ow2/asm/asm-commons/4.1/asm-commons-4.1.jar
           [licenses] EXPECTED sha1 checksum file : /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/solr/licenses/asm-commons-4.1.jar.sha1
           [licenses] MISSING sha1 checksum file for: /home/hudson/.m2/repository/org/ow2/asm/asm/4.1/asm-4.1.jar
           [licenses] EXPECTED sha1 checksum file : /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/solr/licenses/asm-4.1.jar.sha1
           [licenses] Scanned 46 JAR file(s) for licenses (in 0.55s.), 3 error(s).
          

          The Maven build problem is the assumption that compile-time dependencies in the Ant build will be Maven runtime dependencies. When I run ant prepare-release-no-sign under solr/, though, only lucene-expressions-<version>.jar is included in the distribution - none of its dependencies are.

          So, assuming that usage of the expressions module requires inclusion of its dependencies, I think there are three problems here:

          1. The expression module dependencies should be included in the Solr distribution
          2. The expression module dependencies' license, notice and checksum files should be included under solr/licenses/
          3. The check-licenses task under solr/, as currently written, assumes that all dependencies are located under solr/, so using lib/ directories' contents under lucene/ from the solr/ build blocks this check from functioning properly. I think these indirect dependencies should be made direct via inclusion in an ivy.xml file under solr/, not sure which one is appropriate.
          Show
          Steve Rowe added a comment - - edited The failures yesterday of the Maven Jenkins build on trunk ( https://builds.apache.org/job/Lucene-Solr-Maven-trunk/1033/ ) and branch_4x ( https://builds.apache.org/job/Lucene-Solr-Maven-4.x/514/ ) are caused by missing license files for newly introduced indirect compile-time Solr dependencies on the expression module's dependencies in lucene/expressions/lib/* : antlr-runtime-3.5.jar , asm-4.1.jar , and asm-commons-4.1.jar . Here's where the indirect deps were introduced on trunk: https://svn.apache.org/viewvc/lucene/dev/trunk/solr/common-build.xml?r1=1544793&r2=1544792&pathrev=1544793&diff_format=f#l78 . Here's the failure message from the Jenkins Maven trunk build log: [licenses] MISSING sha1 checksum file for: /home/hudson/.m2/repository/org/antlr/antlr-runtime/3.5/antlr-runtime-3.5.jar [licenses] EXPECTED sha1 checksum file : /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/solr/licenses/antlr-runtime-3.5.jar.sha1 [licenses] MISSING sha1 checksum file for: /home/hudson/.m2/repository/org/ow2/asm/asm-commons/4.1/asm-commons-4.1.jar [licenses] EXPECTED sha1 checksum file : /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/solr/licenses/asm-commons-4.1.jar.sha1 [licenses] MISSING sha1 checksum file for: /home/hudson/.m2/repository/org/ow2/asm/asm/4.1/asm-4.1.jar [licenses] EXPECTED sha1 checksum file : /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/solr/licenses/asm-4.1.jar.sha1 [licenses] Scanned 46 JAR file(s) for licenses (in 0.55s.), 3 error(s). The Maven build problem is the assumption that compile-time dependencies in the Ant build will be Maven runtime dependencies. When I run ant prepare-release-no-sign under solr/ , though, only lucene-expressions-<version>.jar is included in the distribution - none of its dependencies are. So, assuming that usage of the expressions module requires inclusion of its dependencies, I think there are three problems here: The expression module dependencies should be included in the Solr distribution The expression module dependencies' license, notice and checksum files should be included under solr/licenses/ The check-licenses task under solr/ , as currently written, assumes that all dependencies are located under solr/ , so using lib/ directories' contents under lucene/ from the solr/ build blocks this check from functioning properly. I think these indirect dependencies should be made direct via inclusion in an ivy.xml file under solr/ , not sure which one is appropriate.
          Hide
          Shalin Shekhar Mangar added a comment -
          1. Added dependencies of lucene expressions module viz asm, asm-commons and antlr to Solr's ivy.xml
          2. Copied over sha, license and notice files for the dependencies to solr's license directory
          Show
          Shalin Shekhar Mangar added a comment - Added dependencies of lucene expressions module viz asm, asm-commons and antlr to Solr's ivy.xml Copied over sha, license and notice files for the dependencies to solr's license directory
          Hide
          ASF subversion and git services added a comment -

          Commit 1545311 from shalin@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1545311 ]

          SOLR-5378: Add dependencies of lucene's expressions module to Solr

          Show
          ASF subversion and git services added a comment - Commit 1545311 from shalin@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1545311 ] SOLR-5378 : Add dependencies of lucene's expressions module to Solr
          Hide
          ASF subversion and git services added a comment -

          Commit 1545313 from shalin@apache.org in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1545313 ]

          SOLR-5378: Add dependencies of lucene's expressions module to Solr

          Show
          ASF subversion and git services added a comment - Commit 1545313 from shalin@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1545313 ] SOLR-5378 : Add dependencies of lucene's expressions module to Solr
          Hide
          Steve Rowe added a comment -
          1. Added dependencies of lucene expressions module viz asm, asm-commons and antlr to Solr's ivy.xml
          2. Copied over sha, license and notice files for the dependencies to solr's license directory

          +1, thanks Shalin.

          Also, the following line should be removed from the definition of solr.lucene.libs in solr/common-build.xml:

          <fileset dir="${common.dir}/expressions/lib"/>
          

          Now that these three jars are placed in solr/core/lib/ via resolve, the Solr war will pick them up, and so they'll be included in the Solr distribution.

          Show
          Steve Rowe added a comment - Added dependencies of lucene expressions module viz asm, asm-commons and antlr to Solr's ivy.xml Copied over sha, license and notice files for the dependencies to solr's license directory +1, thanks Shalin. Also, the following line should be removed from the definition of solr.lucene.libs in solr/common-build.xml : <fileset dir= "${common.dir}/expressions/lib" /> Now that these three jars are placed in solr/core/lib/ via resolve , the Solr war will pick them up, and so they'll be included in the Solr distribution.
          Hide
          ASF subversion and git services added a comment -

          Commit 1545318 from shalin@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1545318 ]

          SOLR-5378: lucene expressions module's lib directory is not necessary in solr.lucene.libs path

          Show
          ASF subversion and git services added a comment - Commit 1545318 from shalin@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1545318 ] SOLR-5378 : lucene expressions module's lib directory is not necessary in solr.lucene.libs path
          Hide
          ASF subversion and git services added a comment -

          Commit 1545319 from shalin@apache.org in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1545319 ]

          SOLR-5378: lucene expressions module's lib directory is not necessary in solr.lucene.libs path

          Show
          ASF subversion and git services added a comment - Commit 1545319 from shalin@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1545319 ] SOLR-5378 : lucene expressions module's lib directory is not necessary in solr.lucene.libs path
          Hide
          Shalin Shekhar Mangar added a comment -

          Thanks Steve for figuring out the build failure and suggesting the fix!

          Areek, thanks for all the work! I'll resolve this issue for now and open another to document this feature in the Solr reference guide for 4.7

          Show
          Shalin Shekhar Mangar added a comment - Thanks Steve for figuring out the build failure and suggesting the fix! Areek, thanks for all the work! I'll resolve this issue for now and open another to document this feature in the Solr reference guide for 4.7
          Hide
          Areek Zillur added a comment -

          Thanks Steve and Shalin for figuring out the dependency/licensing issues!

          Shalin, I can prepare the documentation for the Solr ref guide! I will open an issue for it and add a 'patch' for it.

          Show
          Areek Zillur added a comment - Thanks Steve and Shalin for figuring out the dependency/licensing issues! Shalin, I can prepare the documentation for the Solr ref guide! I will open an issue for it and add a 'patch' for it.
          Hide
          Shalin Shekhar Mangar added a comment -

          Sounds good, thanks Areek!

          Show
          Shalin Shekhar Mangar added a comment - Sounds good, thanks Areek!
          Hide
          Greg Harris added a comment -

          Just made a new JIRA for the fact that this suggestor doesn't support multiValued fields. It will only grab the first entry to make terms.

          JIRA is here:
          https://issues.apache.org/jira/browse/SOLR-6210

          Show
          Greg Harris added a comment - Just made a new JIRA for the fact that this suggestor doesn't support multiValued fields. It will only grab the first entry to make terms. JIRA is here: https://issues.apache.org/jira/browse/SOLR-6210
          Hide
          Mark Bennett added a comment -

          Sorry to ask such a "dumb" question, but in the intro description it lists one of the goals as "... this suggester version should also have support for distributed support, which was awkward at best with the previous implementation due to SpellCheck requirements."

          But then the distributed example still has &shards= and shirts.qt=

          That looks about the same as the old SpellCheckComponent based stuff on http://wiki.apache.org/solr/SpellCheckComponent#Distributed_Search_Support

          So a bit confused, maybe the shard stuff is now optional?

          Show
          Mark Bennett added a comment - Sorry to ask such a "dumb" question, but in the intro description it lists one of the goals as "... this suggester version should also have support for distributed support, which was awkward at best with the previous implementation due to SpellCheck requirements." But then the distributed example still has &shards= and shirts.qt= That looks about the same as the old SpellCheckComponent based stuff on http://wiki.apache.org/solr/SpellCheckComponent#Distributed_Search_Support So a bit confused, maybe the shard stuff is now optional?
          Hide
          Shalin Shekhar Mangar added a comment -

          Mark, the shards param is definitely optional but shards.qt will still be required.

          Show
          Shalin Shekhar Mangar added a comment - Mark, the shards param is definitely optional but shards.qt will still be required.

            People

            • Assignee:
              Shalin Shekhar Mangar
              Reporter:
              Areek Zillur
            • Votes:
              0 Vote for this issue
              Watchers:
              12 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development