Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-10068

Switch to a "double barrel" HPPC cache for the taxonomy LRU cache

Details

    • Improvement
    • Status: Open
    • Minor
    • Resolution: Unresolved
    • 8.8.1
    • None
    • modules/facet
    • None
    • New

    Description

      While working on an unrelated getBulkPath API PR, Michael McCandless and I came across a nice optimization that could be made to the taxonomy cache.

      The taxonomy cache today caches frequently used ordinals and their corresponding FacetLabels. It uses the existing LRUHashMap (backed by a LinkedList) class for its implementation.

      This implementation performs sub optimally when it has a large number of threads accessing it, and consumes a large amount of RAM.

      Michael McCandless suggested the idea of a two array backed HPPC int->FacetLabel cache. The basic idea behind the cache being:

      1. We use two hashmaps primary and secondary.
      2. In case of a cache miss in the primary and a cache hit in the secondary, we add the key to the primary map as well.
      3. In case of a cache miss in both the maps, we add it to the primary map.
      4. When we reach (make this check each time we insert?) a large number of elements in say the primary cache, (say larger than the existing DEFAULT_CACHE_VALUE=4000), we dump the secondary map and copy all the values of the primary map into it.

      The idea was originally explained in this comment.

       

       

      Attachments

        Activity

          People

            Unassigned Unassigned
            gworah Gautam Worah

            Dates

              Created:
              Updated:

              Slack

                Issue deployment