Details

    • Type: Wish
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.1.0
    • Component/s: search
    • Labels:
      None

      Description

      I recently noticed a situation in which my query analyzer was producing the same Token more than once, resulting in two equally boosted clauses for it in the resulting query. In my specific case, I was using the same synonym file for multiple fields (some stemmed, some not), and two synonyms for a word stemmed to the same root, which meant that particular word was worth twice as much as any of the other variations of the synonym. I can imagine other situations where this might come up, both at index time and at query time, particularly when using SynonymFilter in combination with the WordDelimiter filter.

      It occurred to me that a DeDupFilter would be handy. In its simplest form it would drop any Token it gets where the startOffset, endOffset, termText, and type are all identical to the previous token and the positionIncrement is 0. A more robust implementation might support init options indicating that only certain combinations of those things should be used to determine equality (i.e., just termText, just termText and positionIncrement=0, etc.), but in that case an option might also be necessary to determine which of the Tokens should be propagated (the first or the last).
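
      The simplest form described above could be sketched roughly as follows. This is only an illustration, not the attached filter: the Token class here is a hypothetical stand-in rather than the Lucene class, and DedupSketch and removeDuplicates are made-up names. It keeps the first of any run of duplicates.

```java
import java.util.ArrayList;
import java.util.List;

class DedupSketch {

    /** Hypothetical stand-in for an analyzer token; NOT the Lucene Token class. */
    static final class Token {
        final String termText;
        final String type;
        final int startOffset;
        final int endOffset;
        final int positionIncrement;

        Token(String termText, String type, int startOffset, int endOffset, int positionIncrement) {
            this.termText = termText;
            this.type = type;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
            this.positionIncrement = positionIncrement;
        }

        /** True when text, type, and both offsets match the other token. */
        boolean sameTermAndSpan(Token other) {
            return termText.equals(other.termText)
                    && type.equals(other.type)
                    && startOffset == other.startOffset
                    && endOffset == other.endOffset;
        }
    }

    /**
     * Drops a token when it has positionIncrement 0 and matches the token
     * immediately before it on termText, type, startOffset, and endOffset.
     * The first token of each duplicate run is the one that survives.
     */
    static List<Token> removeDuplicates(List<Token> input) {
        List<Token> output = new ArrayList<>();
        Token previous = null;
        for (Token current : input) {
            boolean duplicate = previous != null
                    && current.positionIncrement == 0
                    && current.sameTermAndSpan(previous);
            if (!duplicate) {
                output.add(current);
            }
            // A dropped token matches 'previous' on every compared field,
            // so advancing 'previous' here does not change later decisions.
            previous = current;
        }
        return output;
    }
}
```

      Because only the immediately preceding token is compared, two identical tokens at the same position that are separated by a different token would both pass through. Catching those would need the filter to remember every token seen at the current position.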

        Issue Links

          Activity

          Hoss Man created issue -
          Richard "Trey" Hyde made changes -
          Field Original Value New Value
          Attachment solr.analysis.RemoveDuplicateTokensFilter.java [ 12326463 ]
          Richard "Trey" Hyde made changes -
          Hoss Man made changes -
          Hoss Man made changes -
          Assignee Hoss Man [ hossman ]
          Yonik Seeley made changes -
          Attachment ArrayQueue.java [ 12336388 ]
          Hoss Man made changes -
          Resolution Fixed [ 1 ]
          Status Open [ 1 ] Resolved [ 5 ]
          Hoss Man made changes -
          Link This issue is cloned as SOLR-26 [ SOLR-26 ]
          Hoss Man made changes -
          Fix Version/s 1.1.0 [ 12312234 ]
          Uwe Schindler made changes -
          Status Resolved [ 5 ] Closed [ 6 ]

            People

            • Assignee:
              Hoss Man
              Reporter:
              Hoss Man
            • Votes:
              1
              Watchers:
              1

              Dates

              • Created:
                Updated:
                Resolved:
