Solr / SOLR-3589

Edismax parser does not honor mm parameter if analyzer splits a token

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.6, 4.0-BETA
    • Fix Version/s: 3.6.2, 4.1, 6.0
    • Component/s: search
    • Labels: None

    Description

      With edismax and mm set to 100%, if the analyzer chain splits one of the query tokens into two tokens (e.g. "fire-fly" => fire fly), the mm parameter is ignored and the equivalent of the OR query "fire OR fly" is produced.
      This is a particular problem for languages that do not use whitespace to separate words, such as Chinese or Japanese.
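
      The difference between the reported and the expected behavior can be illustrated with a toy model. This is only a sketch and not Solr code: the analyze and matches helpers and the sample documents are hypothetical, and the regex split merely stands in for a real analyzer chain.

```python
import re


def analyze(text):
    """Toy analyzer (hypothetical): lowercase, then split on non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def matches(doc, query, require_all):
    """Return True if doc matches the analyzed query.

    require_all=True models mm=100% (every sub-token must match).
    require_all=False models the reported behavior (plain OR).
    """
    doc_terms = set(analyze(doc))
    hits = [term in doc_terms for term in analyze(query)]
    return all(hits) if require_all else any(hits)


# Hypothetical sample documents.
docs = ["a fire fly at night", "fire alarm", "fly fishing"]

# Reported behavior: "fire-fly" is split and treated as "fire OR fly".
reported = [d for d in docs if matches(d, "fire-fly", require_all=False)]

# Expected behavior with mm=100%: both "fire" and "fly" are required.
expected = [d for d in docs if matches(d, "fire-fly", require_all=True)]

print(reported)
print(expected)
```

      In this model the OR interpretation also returns documents that contain only one of the two sub-tokens, which is the loss of precision described above.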

      See these messages for more discussion:
      http://lucene.472066.n3.nabble.com/edismax-parser-ignores-mm-parameter-when-tokenizer-splits-tokens-hypenated-words-WDF-splitting-etc-tc3991911.html

      http://lucene.472066.n3.nabble.com/edismax-parser-ignores-mm-parameter-when-tokenizer-splits-tokens-i-e-CJK-tc3991438.html

      http://lucene.472066.n3.nabble.com/Why-won-t-dismax-create-multiple-DisjunctionMaxQueries-when-autoGeneratePhraseQueries-is-false-tc3992109.html

      Attachments

        1. testSolr3589.xml.gz (1 kB, Tom Burton-West)
        2. testSolr3589.xml.gz (1 kB, Tom Burton-West)
        3. SOLR-3589_test.patch (1 kB, Robert Muir)
        4. SOLR-3589.patch (3 kB, Robert Muir)
        5. SOLR-3589.patch (5 kB, Robert Muir)
        6. SOLR-3589.patch (8 kB, Robert Muir)
        7. SOLR-3589.patch (8 kB, Robert Muir)
        8. SOLR-3589.patch (10 kB, Tom Burton-West)
        9. SOLR-3589-3.6.PATCH (11 kB, Tom Burton-West)

            People

              Assignee: Robert Muir (rcmuir)
              Reporter: Tom Burton-West (tburtonwest)
              Votes: 4
              Watchers: 12
