Lucene - Core / LUCENE-3849

position increments should be implemented by TokenStream.end()

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.6, 4.0-ALPHA
    • Fix Version/s: 4.5, Trunk
    • Component/s: None
    • Labels: None
    • Lucene Fields: New

      Description

      If you index the pages of a book as values of a multivalued field, with the default position
      increment gap from Analyzer.java (0), phrase queries won't work across pages when a page
      ends with stopword(s).
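
To make the failure concrete, here is a minimal sketch in plain Java (no Lucene dependency) of how positions end up when stopwords are dropped and the holes at the end of a value are ignored. The class `TrailingHoleDemo`, its `index` method, the sample pages, and the stop set are all hypothetical, and the position arithmetic is a simplified model rather than the indexer's actual code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy model: NOT Lucene code, only the position arithmetic.
class TrailingHoleDemo {
    // Assumed stop set, chosen for the example only.
    static final Set<String> STOP = new HashSet<>(Arrays.asList("the", "of"));

    // Assigns a position to every kept token across all pages of one field,
    // with a position increment gap of 0 between pages. Holes left by
    // stopwords inside a page are counted; holes at the END of a page are lost.
    static List<String> index(String... pages) {
        List<String> postings = new ArrayList<>();
        int position = -1;
        for (String page : pages) {
            int skipped = 0;
            for (String token : page.split(" ")) {
                if (STOP.contains(token)) {
                    skipped++; // stopword removed, remember the hole
                } else {
                    position += 1 + skipped;
                    postings.add(token + "@" + position);
                    skipped = 0;
                }
            }
            // 'skipped' may still be > 0 here, but it is simply discarded.
        }
        return postings;
    }

    public static void main(String[] args) {
        // prints [fox@0, jumped@1, over@2, fence@3, today@4]
        System.out.println(index("fox jumped over the", "fence today"));
    }
}
```

In this model `fence` lands at position 3, directly after `over` at position 2. Assuming the query side keeps the hole for "the", a phrase query for "over the fence" expects `fence` two positions after `over`, so it does not match across the page break.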

      This is because the 'trailing holes' are not taken into account in end(). So I think that in
      TokenStream.end(), subclasses of FilteringTokenFilter (e.g. StopFilter) should do:

      super.end();
      posIncAtt.setPositionIncrement(posIncAtt.getPositionIncrement() + skippedPositions);

      One problem is that these filters need to 'add' to the position increment, but currently
      nothing clears the attributes for end() [they are dirty, except the offset, which is set by
      the tokenizer].

      Also, the indexer should be changed to pull posIncAtt from end().
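
The three pieces of the proposal (adding skipped positions in end(), starting end() from a clean value, and having the indexer read the increment after end()) can be sketched together in plain Java. Everything here is hypothetical: `EndPosIncSketch`, `ToyStopFilter`, the public fields standing in for attributes, and the sample data are illustration only and are not the Lucene API or the committed patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical toy classes that imitate the proposal; NOT the Lucene API.
class EndPosIncSketch {
    // Assumed stop set, chosen for the example only.
    static final Set<String> STOP = new HashSet<>(Arrays.asList("the", "of"));

    // Stand-in for a stop filter over one field value.
    static class ToyStopFilter {
        private final Iterator<String> input;
        private int skippedPositions;
        String term; // stand-in for the term attribute
        int posInc;  // stand-in for the position increment attribute

        ToyStopFilter(String text) {
            this.input = Arrays.asList(text.split(" ")).iterator();
        }

        boolean incrementToken() {
            skippedPositions = 0;
            while (input.hasNext()) {
                String t = input.next();
                if (!STOP.contains(t)) {
                    term = t;
                    posInc = 1 + skippedPositions;
                    return true;
                }
                skippedPositions++; // dropped token leaves a hole
            }
            return false; // skippedPositions now holds the trailing holes
        }

        void end() {
            // Reset first: posInc is still "dirty" from the last token, and
            // adding to that stale value would over-count by its increment.
            posInc = 0;
            posInc += skippedPositions;
        }
    }

    // Stand-in for the indexer: after each value it calls end() and adds
    // the reported increment before moving on (position increment gap 0).
    static List<String> index(String... pages) {
        List<String> postings = new ArrayList<>();
        int position = -1;
        for (String page : pages) {
            ToyStopFilter ts = new ToyStopFilter(page);
            while (ts.incrementToken()) {
                position += ts.posInc;
                postings.add(ts.term + "@" + position);
            }
            ts.end();
            position += ts.posInc; // trailing holes pulled from end()
        }
        return postings;
    }
}
```

For the pages "fox jumped over the" and "fence today", this sketch yields `[fox@0, jumped@1, over@2, fence@4, today@5]`: the dropped trailing "the" now keeps its position, so `fence` sits two positions after `over`.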

      1. LUCENE-3849.patch (7 kB, Robert Muir)
      2. LUCENE-3849.patch (27 kB, Robert Muir)
      3. LUCENE-3849.patch (31 kB, Michael McCandless)
      4. LUCENE-3849.patch (34 kB, Michael McCandless)

          Activity

          Field: Original Value -> New Value

          Adrien Grand made changes -
            Status: Resolved [ 5 ] -> Closed [ 6 ]
          Michael McCandless made changes -
            Status: Open [ 1 ] -> Resolved [ 5 ]
            Assignee: Michael McCandless [ mikemccand ]
            Fix Version/s: 5.0 [ 12321663 ]
            Fix Version/s: 4.5 [ 12324742 ]
            Resolution: Fixed [ 1 ]
          Michael McCandless made changes -
            Link: This issue blocks LUCENE-5180 [ LUCENE-5180 ]
          Michael McCandless made changes -
            Attachment: LUCENE-3849.patch [ 12598788 ]
          Michael McCandless made changes -
            Attachment: LUCENE-3849.patch [ 12598600 ]
          Robert Muir made changes -
            Attachment: LUCENE-3849.patch [ 12541669 ]
          Robert Muir made changes -
            Attachment: LUCENE-3849.patch [ 12541653 ]
          Robert Muir created issue -

            People

            • Assignee: Michael McCandless
            • Reporter: Robert Muir
            • Votes: 1
            • Watchers: 3
