Hive / HIVE-10503

Aggregate stats cache: follow up optimizations


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.2.0
    • Fix Version/s: 1.3.0
    • Component/s: Metastore
    • Labels: None

    Description

      Follow-up work items:
      1. Estimate the number of cache nodes from a memory budget. Currently the user must specify the cache size as a node count.
      2. Make the AggregateStatsCache#add method asynchronous, so that adding to the cache can happen in a separate thread.
      3. Based on performance testing, explore an alternative data structure for the node list kept per cache key.
      4. Explore finer-grained locking for the value list kept per cache key.
      5. Remove the O(n*n) loop that runs while finding a match.
      6. Use a single DB call to fetch the aggregates for all columns that are not in the cache.
      7. Organize metrics capture in a better way.
      8. Address concerns about the TTL causing stale data in the cache.
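
As a rough illustration of items 1 and 2, the sketch below derives a node limit from a memory budget and performs inserts on a background thread. It is not Hive's implementation: the class `AggrStatsCacheSketch`, its methods, the `String` key, the `Long` value, and the `assumedBytesPerNode` parameter are all hypothetical names chosen for this example, and a real cache node would hold column statistics rather than a single number.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: the class, fields, and methods below are illustrative
// and do not mirror Hive's AggregateStatsCache.
final class AggrStatsCacheSketch {
    private final Map<String, Long> cache = new ConcurrentHashMap<>();
    private final long maxNodes;

    // One daemon writer thread: all inserts are serialized, so the
    // capacity check and the put below cannot race with another insert.
    private final ExecutorService writer = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "aggr-stats-cache-writer");
        t.setDaemon(true);
        return t;
    });

    AggrStatsCacheSketch(long memoryBudgetBytes, long assumedBytesPerNode) {
        this.maxNodes = estimateMaxNodes(memoryBudgetBytes, assumedBytesPerNode);
    }

    // Item 1: turn a memory budget into a node count, given an assumed
    // (caller-supplied) average size per node.
    static long estimateMaxNodes(long memoryBudgetBytes, long assumedBytesPerNode) {
        if (memoryBudgetBytes < 0 || assumedBytesPerNode <= 0) {
            throw new IllegalArgumentException("budget must be >= 0 and node size > 0");
        }
        return memoryBudgetBytes / assumedBytesPerNode;
    }

    // Item 2: the caller returns immediately; the insert runs on the writer
    // thread. The future completes with false if the cache is full.
    CompletableFuture<Boolean> addAsync(String key, long aggregate) {
        return CompletableFuture.supplyAsync(() -> {
            if (!cache.containsKey(key) && cache.size() >= maxNodes) {
                return false;
            }
            cache.put(key, aggregate);
            return true;
        }, writer);
    }

    Long get(String key) { return cache.get(key); }

    long maxNodes() { return maxNodes; }

    void shutdown() { writer.shutdown(); }
}
```

A caller that does not need confirmation can ignore the returned future. A caller that must read its own write can call `join()` on it first. Rejecting inserts when the cache is full is a simplification; eviction is outside the scope of this sketch.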

            People

              Assignee: Unassigned
              Reporter: Vaibhav Gumashta (vgumashta)
              Votes: 0
              Watchers: 3
