Solr / SOLR-12532

Slop specified in query string is not preserved for certain phrase searches

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 7.4
    • Fix Version/s: None
    • Component/s: query parsers
    • Labels: None

      Description

      Note: This only affects specific WordDelimiterGraphFilter settings, as detailed below.

      When the SolrQueryParser parses a phrase search and analysis of the phrase produces a graph token stream, the resulting SpanNearQuery does not have the slop set correctly.

      Conditions

      • Slop is provided in the query string (e.g. "you can't"~2)
      • WordDelimiterGraphFilterFactory is configured at query time with preserveOriginal and generateWordParts
      • The query string includes a term that contains a word delimiter (e.g. can't)

      Example

      Field: wdf_partspreserve
      • WordDelimiterGraphFilterFactory
        • preserveOriginal="1"
        • generateWordParts="1"

      Data: you just can't
      Search: wdf_partspreserve:"you can't"~2 -> 0 Results
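
The symptom in this example can be illustrated with a toy proximity check. This is only a sketch: it splits on whole words, ignores the extra tokens WordDelimiterGraphFilter would emit, and uses a simple in-order gap count rather than Lucene's actual slop algorithm. The class and method names are hypothetical. It assumes that a dropped slop behaves like a slop of 0.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only; not Lucene code. All names here are hypothetical.
class SlopSymptomSketch {

    // Returns true if `terms` occur in order in `doc` with at most `slop`
    // intervening positions in total between the first and last term.
    static boolean matchesInOrder(List<String> doc, List<String> terms, int slop) {
        if (terms.isEmpty()) {
            return false;
        }
        for (int start = 0; start < doc.size(); start++) {
            if (!doc.get(start).equals(terms.get(0))) {
                continue;
            }
            int pos = start; // position of the most recently matched term
            int t = 1;       // index of the next term to find
            // Greedily take the earliest occurrence of each following term.
            for (int i = start + 1; i < doc.size() && t < terms.size(); i++) {
                if (doc.get(i).equals(terms.get(t))) {
                    pos = i;
                    t++;
                }
            }
            if (t == terms.size()) {
                int gaps = (pos - start) - (terms.size() - 1);
                if (gaps <= slop) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList("you", "just", "can't");
        List<String> phrase = Arrays.asList("you", "can't");
        System.out.println(matchesInOrder(doc, phrase, 2)); // slop honored
        System.out.println(matchesInOrder(doc, phrase, 0)); // slop dropped
    }
}
```

In this model the phrase needs one intervening position ("just"), so it matches with the requested slop of 2 but not with a slop of 0, which is consistent with the 0 results reported above.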

      Cause

      The slop supplied in the query string is applied in SolrQueryParserBase#getFieldQuery, which sets the slop only for PhraseQuery and MultiPhraseQuery. Because "can't" is broken into multiple tokens, analyzeGraphPhrase is triggered when the query is constructed, and it returns a SpanNearQuery instead of a (Multi)PhraseQuery, so the slop is never applied.
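
The type-based dispatch described in the cause can be sketched as follows. The classes below are hypothetical stand-ins that only share names with the Lucene query types; they use a mutable slop field for brevity, which the real classes do not. The second method shows one possible way to cover the missing case and is an assumption, not necessarily what the attached patch does.

```java
// Stand-in types for illustration only; these are NOT the Lucene classes.
class SlopDispatchSketch {

    static abstract class Query {
        int slop = 0; // default when no slop is applied
    }

    static class PhraseQuery extends Query {}
    static class MultiPhraseQuery extends Query {}
    static class SpanNearQuery extends Query {}

    // Models the reported behavior: only (Multi)PhraseQuery receives the slop.
    static Query applySlopAsReported(Query q, int slop) {
        if (q instanceof PhraseQuery || q instanceof MultiPhraseQuery) {
            q.slop = slop;
        }
        // A SpanNearQuery falls through here with its slop unchanged.
        return q;
    }

    // Hypothetical remedy: also handle the query type produced for graph phrases.
    static Query applySlopIncludingSpanNear(Query q, int slop) {
        if (q instanceof PhraseQuery
                || q instanceof MultiPhraseQuery
                || q instanceof SpanNearQuery) {
            q.slop = slop;
        }
        return q;
    }

    public static void main(String[] args) {
        System.out.println(applySlopAsReported(new PhraseQuery(), 2).slop);
        System.out.println(applySlopAsReported(new SpanNearQuery(), 2).slop);
        System.out.println(applySlopIncludingSpanNear(new SpanNearQuery(), 2).slop);
    }
}
```

In the sketch, the SpanNearQuery stand-in keeps its default slop under the first method, mirroring how the query-string slop is lost when the phrase analyzes to a graph token stream.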

        Attachments

        1. SOLR-12532.patch
          5 kB
          Brad Sumersford

              People

              • Assignee: Unassigned
              • Reporter: Brad Sumersford (bladenkerst)
              • Votes: 0
              • Watchers: 5

                  Time Tracking

                  • Original Estimate: Not Specified
                  • Remaining Estimate: 0h
                  • Time Spent: 10m