Lucene - Core / LUCENE-5798

minor optimizations to MultiDocs(AndPositions)Enum.reset()

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.10, 6.0
    • Component/s: None
    • Labels: None
    • Lucene Fields: New

      Description

      Merging calls this method once per term, potentially many times, yet each invocation typically returns only a few docs (e.g. high-cardinality fields, unique id fields, or the normal Zipf distribution of full text).

      Today we create a new EnumWithSlice[] array and a new EnumWithSlice entry for each term, which generates a fair amount of unnecessary garbage. Instead, we can allocate this array once up front, with size subReaderCount, and reuse it.
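
The reuse idea described above can be sketched roughly as follows. This is a simplified, hypothetical model and not the committed patch: `SlotReuseSketch`, `Slot`, and the `reset` signature are illustrative names only, and a `String` stands in for the per-segment enum that an EnumWithSlice-style entry would hold.

```java
/** Hypothetical, simplified model of the reuse pattern; not Lucene's actual classes. */
final class SlotReuseSketch {

    /** Stand-in for an EnumWithSlice-style entry: a mutable holder, one per sub-reader. */
    static final class Slot {
        String subEnum;   // placeholder for the per-segment enum of the current term
        int readerIndex;  // index of the sub-reader that produced it
    }

    private final Slot[] slots; // allocated once, up front, sized to the sub-reader count
    private int numActive;      // number of leading slots that are valid for the current term

    SlotReuseSketch(int subReaderCount) {
        slots = new Slot[subReaderCount];
        for (int i = 0; i < subReaderCount; i++) {
            slots[i] = new Slot();
        }
    }

    /** Refills the preallocated slots for a new term; this method allocates nothing. */
    void reset(String[] subEnums, int[] readerIndexes, int count) {
        if (count > slots.length) {
            throw new IllegalArgumentException("count exceeds subReaderCount");
        }
        for (int i = 0; i < count; i++) {
            slots[i].subEnum = subEnums[i];
            slots[i].readerIndex = readerIndexes[i];
        }
        numActive = count;
    }

    int numActive() {
        return numActive;
    }

    Slot slot(int i) {
        return slots[i];
    }

    public static void main(String[] args) {
        SlotReuseSketch sketch = new SlotReuseSketch(3);

        // First term: present in sub-readers 0 and 2.
        sketch.reset(new String[] {"a0", "a2"}, new int[] {0, 2}, 2);
        Slot first = sketch.slot(0);

        // Second term: present only in sub-reader 1; the same Slot objects are reused.
        sketch.reset(new String[] {"b1"}, new int[] {1}, 1);

        System.out.println(first == sketch.slot(0)); // prints "true"
        System.out.println(sketch.numActive());      // prints "1"
    }
}
```

The trade-off in this sketch is a fixed array of subReaderCount entries held for the lifetime of the object, in exchange for no per-term array or entry allocation inside `reset`.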

        Activity

        Robert Muir added a comment -

        In my benchmarking this seems to give a fairly substantial improvement when merging terms.

        Ryan Ernst added a comment -

        LGTM!

        ASF subversion and git services added a comment -

        Commit 1607049 from Robert Muir in branch 'dev/trunk'
        [ https://svn.apache.org/r1607049 ]

        LUCENE-5798: Optimize MultiDocsEnum reuse

        ASF subversion and git services added a comment -

        Commit 1607055 from Robert Muir in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1607055 ]

        LUCENE-5798: Optimize MultiDocsEnum reuse

        Adrien Grand added a comment -

        +1


          People

          • Assignee: Unassigned
          • Reporter: Robert Muir
          • Votes: 0
          • Watchers: 3
