Lucene - Core / LUCENE-6077

Add a filter cache

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.0
    • Component/s: None
    • Labels: None
    • Lucene Fields: New

    Description

      Lucene already has filter caching abilities through CachingWrapperFilter, but CachingWrapperFilter requires you to know which filters you want to cache up-front.

      Caching filters is not trivial. If you cache too aggressively, then you slow things down since you need to iterate over all documents that match the filter in order to load it into an in-memory cacheable DocIdSet. On the other hand, if you don't cache at all, you are potentially missing interesting speed-ups on frequently-used filters.

      It would be nice to have a generic filter cache that tracks usage of individual filters and decides whether or not to cache a filter on a given segment, based on usage statistics and various heuristics, such as:

      • the overhead of caching the filter (for instance, some filters produce DocIdSets that are already cacheable)
      • the cost of building the DocIdSet (the getDocIdSet method is very expensive on some filters, such as MultiTermQueryWrapperFilter, which potentially needs to merge lots of postings lists)
      • the segment we are searching on (freshly flushed segments will likely be merged right away, so it's probably not worth building a cache on such segments)
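
      The heuristics above could be combined roughly as in the following sketch. It is only an illustration under stated assumptions: the class name, method names, and thresholds are hypothetical and are not taken from the attached patches or from any Lucene API. The sketch keeps a bounded history of recently used filters and lowers the usage threshold when a filter is cheap to cache or costly to build.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a usage-based caching decision; not a Lucene API. */
final class UsageTrackingPolicySketch {
    private final int historySize;        // how many recent filter uses to remember
    private final int minUsesCheap;       // uses required for filters that are cheap to build
    private final int minUsesCostly;      // uses required for filters that are costly to build
    private final double minSegmentRatio; // skip segments holding less than this share of the index

    private final ArrayDeque<Object> recent = new ArrayDeque<>();
    private final Map<Object, Integer> counts = new HashMap<>();

    UsageTrackingPolicySketch(int historySize, int minUsesCheap,
                              int minUsesCostly, double minSegmentRatio) {
        if (historySize < 1) {
            throw new IllegalArgumentException("historySize must be >= 1");
        }
        this.historySize = historySize;
        this.minUsesCheap = minUsesCheap;
        this.minUsesCostly = minUsesCostly;
        this.minSegmentRatio = minSegmentRatio;
    }

    /** Records one use of a filter, forgetting the oldest use once the history is full. */
    synchronized void onUse(Object filterKey) {
        if (recent.size() == historySize) {
            Object evicted = recent.removeFirst();
            // Decrement the evicted key's count, dropping the entry when it reaches zero.
            counts.computeIfPresent(evicted, (k, c) -> c == 1 ? null : c - 1);
        }
        recent.addLast(filterKey);
        counts.merge(filterKey, 1, Integer::sum);
    }

    /** Decides whether the filter should be cached on a segment of the given size. */
    synchronized boolean shouldCache(Object filterKey, boolean costlyToBuild,
                                     boolean alreadyCacheable,
                                     int segmentDocs, int indexDocs) {
        // Small segments are likely to be merged away soon: never cache on them.
        if (indexDocs <= 0 || (double) segmentDocs / indexDocs < minSegmentRatio) {
            return false;
        }
        int uses = counts.getOrDefault(filterKey, 0);
        // If the DocIdSet is already cacheable, caching adds almost no overhead.
        if (alreadyCacheable) {
            return uses >= 1;
        }
        // Costly filters are worth caching sooner than cheap ones.
        return uses >= (costlyToBuild ? minUsesCostly : minUsesCheap);
    }
}
```

      A caller would invoke onUse each time a filter is executed and consult shouldCache per segment before building a cached DocIdSet. Using relative segment size as a stand-in for "freshly flushed" is an assumption of this sketch.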

      Attachments

        1. LUCENE-6077.patch
          41 kB
          Adrien Grand
        2. LUCENE-6077.patch
          52 kB
          Adrien Grand
        3. LUCENE-6077.patch
          54 kB
          Adrien Grand


          People

            Assignee: Adrien Grand (jpountz)
            Reporter: Adrien Grand (jpountz)
            Votes: 1
            Watchers: 8
