LUCENE-7695 (Lucene - Core)

Unknown query type SynonymQuery in ComplexPhraseQueryParser

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 6.4
    • Fix Version/s: 7.0, 6.5
    • Component/s: core/queryparser
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      We sometimes receive this exception using ComplexPhraseQueryParser via Solr 6.4.0. Some terms do fine, others don't.

      This query:

      {!complexphrase}owmskern_title:"vergunning" 
      

      returns results just fine. The next one:

      {!complexphrase}owmskern_title:"vergunningen~"
      

      Gives results as well! But this one:

      {!complexphrase}owmskern_title:"vergunningen"
      

      Returns the following exception:

      IllegalArgumentException: Unknown query type "org.apache.lucene.search.SynonymQuery" found in phrase query string "algemene plaatselijke verordening"
              at org.apache.lucene.queryparser.complexPhrase.ComplexPhraseQueryParser$ComplexPhraseQuery.rewrite(ComplexPhraseQueryParser.java:313)
              at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:265)
              at org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:684)
              at org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:734)
              at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:473)
              at org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:241)
              at org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1919)
              at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1636)
              at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:611)
              at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:533)
              at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:295)
      
      Attachments

      1. LUCENE-7695.patch
        7 kB
        Mikhail Khludnev
      2. LUCENE-7695.patch
        5 kB
        Markus Jelsma
      3. LUCENE-7695.patch
        3 kB
        Markus Jelsma
      4. LUCENE-7695.patch
        2 kB
        Markus Jelsma
      5. LUCENE-7695.patch
        2 kB
        Markus Jelsma

        Activity

        arafalov Alexandre Rafalovitch added a comment -

        This would have been better discussed on the mailing list first, I feel.

        I suspect what might be happening here is that one of the terms is hitting synonym expansion and perhaps that is not supported. This is strengthened by the fact that the words in the exception do not match the word you gave triggering it.

        So, I would check the type definition, synonym file it uses and the synonyms in there. If I am right, the bigger question then is whether ComplexPhraseQueryParser is expected to support synonyms. If yes, then that would be the actual issue here.

        mkhludnev Mikhail Khludnev added a comment -

        ComplexPhraseQueryParser (CPQP) transforms only certain query types into span queries, so the failure is expected. Patches are welcome.

        markus17 Markus Jelsma added a comment -

        Hello Alexandre Rafalovitch,

        The terms I used in the examples do not have synonyms defined. In fact, the synonym file is still empty. You are right about the words not matching: I copied and pasted a different exception while looking for words and word combinations that do and do not cause trouble. Apologies for the confusion.

        Thanks,
        Markus

        markus17 Markus Jelsma added a comment -

        I cannot seem to import stuff from Lucene's analysis module into a unit test that's in Lucene's queryparser module.

        E.g.

        import org.apache.lucene.analysis.synonym.SynonymFilter;
        import org.apache.lucene.analysis.synonym.SynonymMap;
        

        doesn't work in org.apache.lucene.queryparser.complexPhrase.TestComplexPhraseQuery. Any ideas on how to test it?

        mkhludnev Mikhail Khludnev added a comment -

        You can try using org.apache.lucene.analysis.MockSynonymAnalyzer in TestComplexPhraseQuery.

        markus17 Markus Jelsma added a comment - - edited

        Patch for master. I cannot get the unit tests to run (ant keeps hanging here), so I applied a crude fix and tested it via Solr, where it works.

        When processing the SynonymQuery, I have no idea what it should do with more than one term. I think it should rewrite itself again, but I am really not sure.

        Or should it create a SpanTermQuery for each term, wrap those in a SpanOrQuery, and add that to the list of allSpanClauses?

        markus17 Markus Jelsma added a comment -

        Here's a patch where each Term in the SynonymQuery is wrapped as SpanTermQuery in a SpanOrQuery, which is then added to the allSpanClauses array.

        If there is just one term in the SynonymQuery, it is added as a SpanTermQuery directly.

        This seems more appropriate, but don't take my word for it.
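
        The conversion described in this comment can be sketched roughly as follows. This is only an illustration under stated assumptions, not the attached patch: TermQ, SynonymQ and Span are hypothetical stand-in types defined here so the example is self-contained, and they are not Lucene's TermQuery, SynonymQuery, SpanTermQuery or SpanOrQuery. The sketch also shows the kind of failure reported in this issue when a clause type is not handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in types for illustration only; these are NOT Lucene's classes.
final class SynonymToSpanSketch {

    /** Hypothetical stand-in for a single-term query. */
    static final class TermQ {
        final String term;
        TermQ(String term) { this.term = term; }
    }

    /** Hypothetical stand-in for a query holding several terms at one position. */
    static final class SynonymQ {
        final List<String> terms;
        SynonymQ(String... terms) { this.terms = Arrays.asList(terms); }
    }

    /** Hypothetical stand-in for a span: either a single term or an OR over spans. */
    static final class Span {
        final String term;       // non-null for a single-term span
        final List<Span> anyOf;  // non-null for an OR over several spans
        Span(String term) { this.term = term; this.anyOf = null; }
        Span(List<Span> anyOf) { this.term = null; this.anyOf = anyOf; }
        @Override public String toString() {
            return term != null ? "spanTerm(" + term + ")" : "spanOr(" + anyOf + ")";
        }
    }

    /** Converts one clause of the phrase into a span, rejecting unknown clause types. */
    static Span toSpan(Object query, String phrase) {
        if (query instanceof TermQ) {
            return new Span(((TermQ) query).term);
        }
        if (query instanceof SynonymQ) {
            List<String> terms = ((SynonymQ) query).terms;
            if (terms.size() == 1) {
                // A single term needs no OR wrapper.
                return new Span(terms.get(0));
            }
            // Several terms at the same position: one span per term, OR-ed together.
            List<Span> alternatives = new ArrayList<>();
            for (String t : terms) {
                alternatives.add(new Span(t));
            }
            return new Span(alternatives);
        }
        // Any clause type without a span equivalent is rejected.
        throw new IllegalArgumentException("Unknown query type \"" + query.getClass().getName()
            + "\" found in phrase query string \"" + phrase + "\"");
    }

    public static void main(String[] args) {
        System.out.println(toSpan(new SynonymQ("vergunning", "vergunningen"), "vergunningen"));
        System.out.println(toSpan(new SynonymQ("vergunning"), "vergunning"));
    }
}
```

        In this sketch the single-term case skips the OR wrapper, mirroring the behaviour described above. Whether the real parser should treat a one-term SynonymQuery the same way is a design choice for the patch author.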

        markus17 Markus Jelsma added a comment -

        It seems the top-level query can also be a SynonymQuery, at least via Solr. I updated the patch to handle that as well, but it seems I broke something in the process. It is no longer possible to embed a FuzzyQuery:

        {!complexphrase}content_nl:"vergunningen~"
        

        This no longer works. However, multiple terms on the same position now work, e.g. KeepWordFilter with stemmed terms. I need to go, but will take a look later.

        markus17 Markus Jelsma added a comment -

        New patch: every SynonymQuery is now turned into a SpanOrQuery, and it seems to work.

        The next query works fine:

        {!complexphrase}content_nl:"(emissi* OR investerin*)"~30
        

        But this one doesn't:

        {!complexphrase}content_nl:"emissi*"
        

        Prefix and fuzzy queries both return:

        null:java.lang.IllegalArgumentException: Unknown query type "org.apache.lucene.search.PrefixQuery" found in phrase query string "emissi*"
                at org.apache.lucene.queryparser.complexPhrase.ComplexPhraseQueryParser$ComplexPhraseQuery.rewrite(ComplexPhraseQueryParser.java:289)
        

        I don't yet know why this doesn't work, while wrapping it in a boolean query does.

        mkhludnev Mikhail Khludnev added a comment -

        What about LUCENE-7695.patch?

        markus17 Markus Jelsma added a comment -

        Hello Mikhail Khludnev, your patch works nicely!

        jira-bot ASF subversion and git services added a comment -

        Commit 8a5492930eff393de824450f77f27d98a204df3d in lucene-solr's branch refs/heads/master from Mikhail Khludnev
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=8a54929 ]

        LUCENE-7695: support synonyms in ComplexPhraseQueryParser

        jira-bot ASF subversion and git services added a comment -

        Commit 7087acaedc053821ee2afc2a3ebe6ba14efbcf03 in lucene-solr's branch refs/heads/branch_6x from Mikhail Khludnev
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=7087aca ]

        LUCENE-7695: support synonyms in ComplexPhraseQueryParser

        markus17 Markus Jelsma added a comment -

        Removed fix version 6.4.2.
        Thanks Mikhail!


          People

          • Assignee: Unassigned
          • Reporter: Markus Jelsma
          • Votes: 0
          • Watchers: 4