Lucene - Core
  1. Lucene - Core
  2. LUCENE-1304

Memory Leak when using Custom Sort (i.e., DistanceSortSource) of LocalLucene with Lucene

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Won't Fix
    • Affects Version/s: 2.3
    • Fix Version/s: 2.9
    • Component/s: core/search
    • Labels:
      None
    • Environment:

      Windows/JDK 1.6

    • Lucene Fields:
      New, Patch Available

      Description

      We had the memory leak issue when using DistanceSortSource of LocalLucene for repeated query/search. In about 450 queries, we are experiencing out of memory error. After dig in the code, we found the problem source is coming from Lucene package, the way how it handles "custom" type comparator. Lucene internally caches all created comparators. In the case of query using LocalLucene, we create new comparator for every search due to different lon/lat and query terms. This causes major memory leak as the cached comparators are also holding memory for other large objects (e.g., bit sets). The solution we came up with: ( the proposed change from Lucene is 1 and 3 below)

      1. In Lucene package, create new file SortComparatorSourceUncacheable.java:

      package org.apache.lucene.search;

      import org.apache.lucene.index.IndexReader;
      import java.io.IOException;
      import java.io.Serializable;

      public interface SortComparatorSourceUncacheable extends Serializable {
      }

      2. Have your custom sort class to implement the interface

      public class LocalSortSource extends DistanceSortSource implements SortComparatorSourceUncacheable

      { ... }

      3. Modify Lucene's FieldSorterHitQueue.java to bypass caching for custom sort comparator:

      Index: FieldSortedHitQueue.java
      ===================================================================
      — FieldSortedHitQueue.java (revision 654583)
      +++ FieldSortedHitQueue.java (working copy)
      @@ -53,7 +53,12 @@
      this.fields = new SortField[n];
      for (int i=0; i<n; ++i) {
      String fieldname = fields[i].getField();

      • comparators[i] = getCachedComparator (reader, fieldname, fields[i].getType(), fields[i].getLocale(), fields[i].getFactory());
        +
        + if(fields[i].getFactory() instanceof SortComparatorSourceUncacheable) { // no caching to avoid memory leak + comparators[i] = getComparator (reader, fieldname, fields[i].getType(), fields[i].getLocale(), fields[i].getFactory()); + }

        else

        { + comparators[i] = getCachedComparator (reader, fieldname, fields[i].getType(), fields[i].getLocale(), fields[i].getFactory()); + }

      if (comparators[i].sortType() == SortField.STRING) {
      this.fields[i] = new SortField (fieldname, fields[i].getLocale(), fields[i].getReverse());
      @@ -157,7 +162,18 @@
      SortField[] getFields()

      { return fields; }
      • +
        + static ScoreDocComparator getComparator (IndexReader reader, String field, int type, Locale locale, SortComparatorSource factory)
        + throws IOException

        { + if (type == SortField.DOC) return ScoreDocComparator.INDEXORDER; + if (type == SortField.SCORE) return ScoreDocComparator.RELEVANCE; + FieldCacheImpl.Entry entry = (factory != null) + ? new FieldCacheImpl.Entry (field, factory) + : new FieldCacheImpl.Entry (field, type, locale); + return (ScoreDocComparator)Comparators.createValue(reader, entry); + }

        +
        +

      Otis suggests that I put this in Jira. I 'll attach a patch shortly for review.

      -Ethan

        Issue Links

          Activity

          Hide
          Mark Miller added a comment -

          LUCENE-1483 and a new FieldComparator I saw going in the other day should solve this issue.

          Show
          Mark Miller added a comment - LUCENE-1483 and a new FieldComparator I saw going in the other day should solve this issue.
          Hide
          Ethan Tao added a comment -

          Thank you !!

          Show
          Ethan Tao added a comment - Thank you !!
          Hide
          Mark Miller added a comment -

          The main impact is that most of that code will be deprecated. It will still be used for old custom comparators until 3.0 though, so it might be wise to consider this for 2.9 in the interim.

          Show
          Mark Miller added a comment - The main impact is that most of that code will be deprecated. It will still be used for old custom comparators until 3.0 though, so it might be wise to consider this for 2.9 in the interim.
          Hide
          patrick o'leary added a comment -

          How will LUCENE-1483 impact this immediately?
          I'd really like to get this patch in first and refactor if and when 1483 goes in, the benefit of bypassing static comparator is
          really needed.

          Show
          patrick o'leary added a comment - How will LUCENE-1483 impact this immediately? I'd really like to get this patch in first and refactor if and when 1483 goes in, the benefit of bypassing static comparator is really needed.
          Hide
          Mark Miller added a comment -

          This issue will likely be affected by LUCENE-1483. If/when it goes in, its not likely that Comparators (custom or otherwise) will be cached anymore.

          Show
          Mark Miller added a comment - This issue will likely be affected by LUCENE-1483 . If/when it goes in, its not likely that Comparators (custom or otherwise) will be cached anymore.
          Hide
          Ethan Tao added a comment -

          This is the patch created off 2.3.2. Please feel free to make changes.

          Show
          Ethan Tao added a comment - This is the patch created off 2.3.2. Please feel free to make changes.

            People

            • Assignee:
              Unassigned
              Reporter:
              Ethan Tao
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development