Details

    • Type: New Feature
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 4.9, Trunk
    • Component/s: Schema and Analysis
    • Labels:
      None

      Description

      ApostropheTokenizer creates extra tokens during the analysis stage for fields containing apostrophes. The goal is to ensure that documents that differ only by an apostrophe receive the same relevancy score.

      For example, if a document contains the string "McDonald's", it is tokenized as "McDonald's McDonalds". This way, a search for either "McDonald's" or "McDonalds" produces a similar score.

      This code handles up to two apostrophes in a token.
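
The expansion described above can be sketched in plain Java, without any Lucene dependency. This is only an illustration, not the code in the attached patch: the class name `ApostropheExpander`, the method `expand`, and the choice to leave tokens with more than two apostrophes unchanged are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class ApostropheExpander {
    // Assumption: tokens with more than this many apostrophes are not expanded.
    private static final int MAX_APOSTROPHES = 2;

    /** Returns the original token, plus an apostrophe-free variant when one applies. */
    public static List<String> expand(String token) {
        List<String> out = new ArrayList<>();
        out.add(token); // the original token is always kept

        // Count the apostrophes in the token.
        int count = 0;
        for (int i = 0; i < token.length(); i++) {
            if (token.charAt(i) == '\'') {
                count++;
            }
        }

        // Emit the stripped variant only for one or two apostrophes.
        if (count >= 1 && count <= MAX_APOSTROPHES) {
            String stripped = token.replace("'", "");
            if (!stripped.isEmpty()) {
                out.add(stripped);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(expand("McDonald's")); // [McDonald's, McDonalds]
        System.out.println(expand("Solr"));       // [Solr]
    }
}
```

In a real token filter the extra token would presumably be emitted at the same position as the original, so that phrase queries keep working; the sketch above does not model positions.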

      To use this tokenizer, add the following filter to the index analyzer in schema.xml:

      <analyzer type="index">
        <filter class="org.apache.lucene.analysis.ApostropheTokenFactory"/>
        ...
      </analyzer>

        Activity

        Sergey Borisov created issue -
        Sergey Borisov made changes -
        Field Original Value New Value
        Attachment ApostropheTokenizer.zip [ 12413459 ]
        Sergey Borisov made changes -
        Fix Version/s 1.4 [ 12313351 ]
        Noble Paul made changes -
        Fix Version/s 1.5 [ 12313566 ]
        Fix Version/s 1.4 [ 12313351 ]
        Hoss Man made changes -
        Fix Version/s Next [ 12315093 ]
        Fix Version/s 1.5 [ 12313566 ]
        Hoss Man made changes -
        Fix Version/s 3.2 [ 12316172 ]
        Fix Version/s Next [ 12315093 ]
        Robert Muir made changes -
        Fix Version/s 3.3 [ 12316471 ]
        Fix Version/s 3.2 [ 12316172 ]
        Robert Muir made changes -
        Fix Version/s 3.4 [ 12316683 ]
        Fix Version/s 4.0 [ 12314992 ]
        Fix Version/s 3.3 [ 12316471 ]
        Robert Muir made changes -
        Fix Version/s 3.5 [ 12317876 ]
        Fix Version/s 3.4 [ 12316683 ]
        Simon Willnauer made changes -
        Fix Version/s 3.6 [ 12319065 ]
        Fix Version/s 3.5 [ 12317876 ]
        Hoss Man made changes -
        Fix Version/s 3.6 [ 12319065 ]
        Robert Muir made changes -
        Fix Version/s 4.1 [ 12321141 ]
        Fix Version/s 4.0 [ 12314992 ]
        Mark Miller made changes -
        Fix Version/s 4.2 [ 12323893 ]
        Fix Version/s 5.0 [ 12321664 ]
        Fix Version/s 4.1 [ 12321141 ]
        Robert Muir made changes -
        Fix Version/s 4.3 [ 12324128 ]
        Fix Version/s 5.0 [ 12321664 ]
        Fix Version/s 4.2 [ 12323893 ]
        Uwe Schindler made changes -
        Fix Version/s 4.4 [ 12324324 ]
        Fix Version/s 4.3 [ 12324128 ]
        Steve Rowe made changes -
        Fix Version/s 5.0 [ 12321664 ]
        Fix Version/s 4.5 [ 12324743 ]
        Fix Version/s 4.4 [ 12324324 ]
        Adrien Grand made changes -
        Fix Version/s 4.6 [ 12325000 ]
        Fix Version/s 5.0 [ 12321664 ]
        Fix Version/s 4.5 [ 12324743 ]
        Uwe Schindler made changes -
        Fix Version/s 4.7 [ 12325573 ]
        Fix Version/s 4.6 [ 12325000 ]
        David Smiley made changes -
        Fix Version/s 4.8 [ 12326254 ]
        Fix Version/s 4.7 [ 12325573 ]
        Uwe Schindler made changes -
        Fix Version/s 4.9 [ 12326731 ]
        Fix Version/s 5.0 [ 12321664 ]
        Fix Version/s 4.8 [ 12326254 ]

          People

          • Assignee:
            Unassigned
            Reporter:
            Sergey Borisov
          • Votes:
            0
            Watchers:
            1
