  1. Lucene - Core
  2. LUCENE-2369

Locale-based sort by field with low memory overhead

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 4.0
    • Fix Version/s: None
    • Component/s: core/search
    • Lucene Fields:
      New

      Description

      The current implementation of locale-based sort in Lucene uses the FieldCache, which keeps all sort terms in memory. Besides the large memory overhead, sorting requires a call to collator.compare for every term comparison, which makes searches with millions of hits fairly expensive.

      The proposed alternative creates a packed list of pre-sorted ordinals for the sort terms and a map from document IDs to entries in that list. This results in very low memory overhead and faster sorted searches, at the cost of increased startup time. Because the ordinals can be resolved to terms after the sorting has been performed, this approach supports fillFields=true.
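
      The idea above can be illustrated with a minimal sketch. It is not the code from the attached patches: the class and method names (OrdinalSortSketch, buildDocToRank, sortHits) are hypothetical, a plain int[] stands in for the packed structure, terms are held in memory rather than resolved from index ordinals, and null terms are not handled. It only shows that the collator is consulted once at startup, after which hits are ordered by integer comparison.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.IntStream;

// Hypothetical illustration only; not taken from the LUCENE-2369 patches.
class OrdinalSortSketch {

    /** Returns, for each document, the rank of its sort term in collator order. */
    static int[] buildDocToRank(String[] docTerms, Collator collator) {
        // Startup cost: each distinct term is ordered once using the collator.
        TreeMap<String, Integer> rankOf = new TreeMap<>(collator);
        for (String term : docTerms) {
            rankOf.put(term, 0);
        }
        // Walk the terms in collator order and assign consecutive ranks.
        int rank = 0;
        for (Map.Entry<String, Integer> entry : rankOf.entrySet()) {
            entry.setValue(rank++);
        }
        // Map every document ID to the rank of its term.
        int[] docToRank = new int[docTerms.length];
        for (int doc = 0; doc < docTerms.length; doc++) {
            docToRank[doc] = rankOf.get(docTerms[doc]);
        }
        return docToRank;
    }

    /** Orders hit document IDs by precomputed rank; no collator calls are made here. */
    static int[] sortHits(int[] hits, int[] docToRank) {
        return IntStream.of(hits)
                .boxed()
                .sorted(Comparator.comparingInt((Integer doc) -> docToRank[doc]))
                .mapToInt(Integer::intValue)
                .toArray();
    }

    public static void main(String[] args) {
        // Index position is the document ID; the value is that document's sort term.
        String[] docTerms = {"banana", "apple", "cherry", "apple"};
        int[] docToRank = buildDocToRank(docTerms, Collator.getInstance(Locale.ENGLISH));
        int[] sorted = sortHits(new int[] {0, 1, 2, 3}, docToRank);
        System.out.println(Arrays.toString(sorted));
    }
}
```

      In this sketch the per-search work is reduced to integer comparisons, while all collation happens in buildDocToRank, which mirrors the trade-off described above: cheaper sorted searches in exchange for a longer startup.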

      This issue is related to https://issues.apache.org/jira/browse/LUCENE-2335, which contains previous discussion on the subject.

        Attachments

        1. LUCENE-2369.patch
          637 kB
          Toke Eskildsen
        2. LUCENE-2369.patch
          523 kB
          Toke Eskildsen
        3. LUCENE-2369.patch
          458 kB
          Toke Eskildsen
        4. lucene-2369-20101011.patch
          443 kB
          Toke Eskildsen
        5. LUCENE-2369.patch
          342 kB
          Toke Eskildsen
        6. LUCENE-2369.patch
          220 kB
          Toke Eskildsen

            People

            • Assignee:
              Unassigned
              Reporter:
              Toke Eskildsen
            • Votes:
              0
              Watchers:
              3

              Dates

              • Created:
                Updated: