  1. Lucene - Core
  2. LUCENE-4314

The specification of DocIdSetIterator is needlessly ambiguous.

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.6.1, 4.0-BETA
    • Fix Version/s: 4.0, 6.0
    • Component/s: core/search
    • Environment: All

    • Lucene Fields: New

    Description

      Quoth Lucene at org.apache.lucene.search.DocIdSetIterator.advance:

      "Advances to the first beyond (see NOTE below) the current whose document
      number is greater than or equal to <i>target</i>. [...]
      <b>NOTE:</b> when <code> target ≤ current</code> implementations may opt
      not to advance beyond their current {@link #docID()}."

      However, the same specification contradictorily states that advance must behave as if written:

      int advance(int target) {
        int doc;
        while ((doc = nextDoc()) < target) {}
        return doc;
      }

      That is, at least one call to nextDoc() is always made, whatever the value of target.
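      The two readings can be compared side by side. The following is a minimal sketch, not Lucene code: ArrayIterator, its lenient flag, and advanceFromFirst are hypothetical names invented for illustration, and NO_MORE_DOCS is assumed to be Integer.MAX_VALUE. With lenient set to false, advance follows the "behaves as if written" loop; with lenient set to true, it takes the option offered by the NOTE.

```java
public class AdvanceAmbiguityDemo {
    // Assumed sentinel for an exhausted iterator.
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical array-backed iterator; a stand-in, not Lucene's DocIdSetIterator. */
    public static class ArrayIterator {
        private final int[] docs;      // doc IDs, sorted ascending
        private final boolean lenient; // true: use the NOTE's "may opt not to advance" reading
        private int idx = -1;          // -1 means nextDoc() has not been called yet

        public ArrayIterator(int[] docs, boolean lenient) {
            this.docs = docs;
            this.lenient = lenient;
        }

        public int docID() {
            if (idx < 0) return -1;
            return idx >= docs.length ? NO_MORE_DOCS : docs[idx];
        }

        public int nextDoc() {
            if (idx < docs.length) idx++;
            return docID();
        }

        public int advance(int target) {
            if (lenient && idx >= 0 && target <= docID()) {
                return docID(); // NOTE reading: stay on the current document
            }
            // Reference reading: nextDoc() is called at least once.
            int doc;
            while ((doc = nextDoc()) < target) {}
            return doc;
        }
    }

    /** Positions an iterator over {3, 7, 12} on doc 3, then calls advance(target). */
    public static int advanceFromFirst(boolean lenient, int target) {
        ArrayIterator it = new ArrayIterator(new int[] {3, 7, 12}, lenient);
        it.nextDoc(); // now on doc 3
        return it.advance(target);
    }

    public static void main(String[] args) {
        System.out.println(advanceFromFirst(false, 3)); // reference reading: 7
        System.out.println(advanceFromFirst(true, 3));  // NOTE reading: 3
    }
}
```

      Both behaviors are permitted by the quoted text, yet they return different documents for the same call whenever target ≤ current.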

      This ambiguity can lead to unexpected behavior. In fact, arguably every user of this interface is incorrect unless it tests, after every call, both whether the iterator is exhausted AND whether it has actually advanced.

      For example, I myself had one experimental implementation (coded against a previous Lucene release) that caused an infinite loop in PhraseScorer.java because, following the above specification, it "opted" not to move the iterator when advance(target) was called with target < current.
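      The failure mode can be reproduced in miniature. The sketch below is not the PhraseScorer code: LenientIterator, naiveWalk, guardedWalk, and the maxSteps cap are hypothetical, and NO_MORE_DOCS is assumed to be Integer.MAX_VALUE. naiveWalk assumes every advance() call makes progress, so against an iterator that takes the NOTE's option it would spin forever without the cap. guardedWalk shows one possible defensive pattern: call nextDoc() directly whenever target is not beyond the current document.

```java
public class StalledAdvanceDemo {
    // Assumed sentinel for an exhausted iterator.
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical iterator that stays put when target <= current, as the NOTE allows. */
    public static class LenientIterator {
        private final int[] docs; // doc IDs, sorted ascending
        private int idx = -1;     // -1 means nextDoc() has not been called yet

        public LenientIterator(int... docs) {
            this.docs = docs;
        }

        public int docID() {
            if (idx < 0) return -1;
            return idx >= docs.length ? NO_MORE_DOCS : docs[idx];
        }

        public int nextDoc() {
            if (idx < docs.length) idx++;
            return docID();
        }

        public int advance(int target) {
            if (idx >= 0 && target <= docID()) return docID(); // opts not to advance
            int doc;
            while ((doc = nextDoc()) < target) {}
            return doc;
        }
    }

    /**
     * Caller that trusts advance() to move forward on every call.
     * Returns the number of documents visited, or -1 if maxSteps was exceeded.
     */
    public static int naiveWalk(LenientIterator it, int floor, int maxSteps) {
        int visited = 0;
        int doc = it.nextDoc();
        while (doc != NO_MORE_DOCS) {
            if (++visited > maxSteps) return -1; // cap stands in for the infinite loop
            doc = it.advance(floor);
        }
        return visited;
    }

    /** Caller that falls back to nextDoc() when target is not beyond the current doc. */
    public static int guardedWalk(LenientIterator it, int floor) {
        int visited = 0;
        int doc = it.nextDoc();
        while (doc != NO_MORE_DOCS) {
            visited++;
            doc = (floor <= doc) ? it.nextDoc() : it.advance(floor);
        }
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(naiveWalk(new LenientIterator(3, 7, 12), 0, 100)); // -1: stalled on doc 3
        System.out.println(guardedWalk(new LenientIterator(3, 7, 12), 0));    // 3: every doc visited once
    }
}
```

      The guard gives the same result under either reading of the specification, because under the reference loop advance(target) with target ≤ current is equivalent to a single nextDoc() call.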

      Attachments

        1. DocIdSetIterator.patch (2 kB, Franco Callari)

          People

            Assignee: Michael McCandless (mikemccand)
            Reporter: Franco Callari (fgc)