Lucene - Core / LUCENE-4854

DocTermsOrd getOrdTermsEnum() buggy, lookupTerm/termsEnum is slow

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.3, 4.2.1, 6.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      While investigating a test failure in grouping/, I found that the current DV API needs help for DocTermsOrds (the facet+grouping collector calls seekExact(BytesRef) on the TermsEnum):

      • TermsEnum.seekExact is slow because the default implementation calls lookupTerm, which is itself slow. But this thing already has an optimal TermsEnum that it can just return directly (since LUCENE-4819).
      • lookupTerm is slow because the default implementation binary-searches ordinal space, calling lookupOrd and comparing the result to the target. However, lookupOrd is slow for this thing (it must binary-search ordinal space again, then call next() at most index_interval times).
      • Its getOrdTermsEnum() method is buggy: it doesn't position correctly on an initial next(). Nothing uses this today, but if we want to return this thing directly it needs to work: it's just a trivial check contained within next().
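
      To illustrate why the default lookupTerm gets expensive, here is a minimal, self-contained sketch of a binary search over ordinal space. It is not the Lucene source: the class name LookupTermSketch, the call counter, and the use of String in place of BytesRef are assumptions made for illustration. Each probe costs one lookupOrd call, so when lookupOrd is itself a binary search plus a scan, the costs multiply.

```java
import java.util.function.IntFunction;

// Hypothetical sketch, not Lucene code: models a default lookupTerm that
// binary-searches ordinals and resolves each probe through lookupOrd.
public class LookupTermSketch {
    /** Counts lookupOrd invocations so the per-lookup cost is visible. */
    static int lookupOrdCalls = 0;

    /**
     * Binary-searches ordinals [0, valueCount) for key.
     * Returns the ordinal if found, otherwise -(insertionPoint) - 1.
     */
    static int lookupTerm(IntFunction<String> lookupOrd, int valueCount, String key) {
        int low = 0;
        int high = valueCount - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            lookupOrdCalls++; // every probe pays for one lookupOrd
            int cmp = lookupOrd.apply(mid).compareTo(key);
            if (cmp < 0) {
                low = mid + 1;
            } else if (cmp > 0) {
                high = mid - 1;
            } else {
                return mid; // exact match
            }
        }
        return -(low + 1); // not found: encode the insertion point
    }

    public static void main(String[] args) {
        // Sorted terms stand in for the ordinal-to-term mapping.
        String[] terms = {"apple", "banana", "cherry", "date", "fig"};
        int ord = lookupTerm(i -> terms[i], terms.length, "date");
        System.out.println("ord=" + ord + ", lookupOrd calls=" + lookupOrdCalls);
    }
}
```

      In this sketch lookupOrd is a cheap array read. In the case described above it is not, which is the argument for returning the structure's own TermsEnum directly so that seekExact bypasses lookupTerm altogether.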

            People

            • Assignee:
              Unassigned
              Reporter:
              rcmuir Robert Muir
