  1. Lucene - Core
  2. LUCENE-4991

QueryParser doesn't handle synonyms correctly for Chinese

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.3.1, 6.0
    • Component/s: modules/queryparser
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      As reported multiple times on the user list:
      http://find.searchhub.org/document/eaf0e88a6a0d4d1f
      http://find.searchhub.org/document/abf28043c52b6efc
      http://find.searchhub.org/document/1313794632c90826

      The query-building logic does not form the right query structures, because it ignores the PositionIncrementAttribute from the TokenStream.

      • When the default operator is AND, the problem is easier to see, because synonyms are wrongly inserted as additional MUST terms:
        expected:<+field:中 +(field:国 field:國)>
        but was:<+field:中 +field:国 +field:國>
      • Even when the default operator is OR, the result is still wrong: the position increment (posInc) is ignored, so the coord computation is incorrect and scoring is wrong.

      This also breaks scoring and query structure for decompounding, because decompounders hit exactly this situation when they add the original compound as a synonym.
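The expected structure can be illustrated with a minimal sketch that does not depend on Lucene. It assumes each analyzed token is reduced to its text plus its position increment, and that a token with a position increment of 0 is a synonym stacked on the previous token's position. The names `Token`, `groupByPosition`, and `toAndQuery` are hypothetical and exist only for this illustration; they are not Lucene APIs, and this is not the patch that resolved the issue.

```java
import java.util.ArrayList;
import java.util.List;

public class PositionGrouping {

    /** Hypothetical stand-in for one analyzed token: its text and position increment. */
    static final class Token {
        final String term;
        final int posInc;

        Token(String term, int posInc) {
            this.term = term;
            this.posInc = posInc;
        }
    }

    /** Groups tokens so that tokens with posInc == 0 share the previous token's position. */
    static List<List<String>> groupByPosition(List<Token> tokens) {
        List<List<String>> positions = new ArrayList<>();
        for (Token t : tokens) {
            // A positive increment (or the very first token) starts a new position.
            if (t.posInc > 0 || positions.isEmpty()) {
                positions.add(new ArrayList<>());
            }
            positions.get(positions.size() - 1).add(t.term);
        }
        return positions;
    }

    /** Renders one MUST clause per position; synonyms become optional terms inside it. */
    static String toAndQuery(String field, List<Token> tokens) {
        StringBuilder sb = new StringBuilder();
        for (List<String> terms : groupByPosition(tokens)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            if (terms.size() == 1) {
                sb.append('+').append(field).append(':').append(terms.get(0));
            } else {
                sb.append("+(");
                for (int i = 0; i < terms.size(); i++) {
                    if (i > 0) {
                        sb.append(' ');
                    }
                    sb.append(field).append(':').append(terms.get(i));
                }
                sb.append(')');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 國 has posInc 0, so it is a synonym at the same position as 国.
        List<Token> tokens = List.of(
                new Token("中", 1),
                new Token("国", 1),
                new Token("國", 0));
        System.out.println(toAndQuery("field", tokens));
    }
}
```

Grouping by position before assigning occurrence flags is what keeps the synonym from becoming a third required term: the number of top-level clauses then equals the number of token positions, not the number of tokens.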

            People

            • Assignee: Unassigned
            • Reporter: Robert Muir (rcmuir)