Hive / HIVE-16520

Cache hive metadata in metastore

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0.0
    • Component/s: Metastore
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note: To use CachedStore, set hive.metastore.rawstore.impl to "org.apache.hadoop.hive.metastore.cache.CachedStore" in hive-site.xml.
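
For illustration, the setting described in the release note could look like the fragment below. This is a minimal sketch that assumes the usual Hadoop-style property layout of hive-site.xml; the property name and value come from the release note, and the fragment is meant to sit inside the file's existing <configuration> element.

```xml
<!-- Assumed placement: inside the <configuration> element of hive-site.xml -->
<property>
  <name>hive.metastore.rawstore.impl</name>
  <value>org.apache.hadoop.hive.metastore.cache.CachedStore</value>
</property>
```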

    Description

      During Hive 2 benchmarking, we found that Hive metastore operations take a lot of time and thus slow down Hive compilation. In some extreme cases, compilation takes much longer than the actual query run time. In particular, we found that the latency of a cloud database is very high, and 90% of total query runtime is spent waiting for metastore SQL database operations. Based on this observation, metastore performance would be greatly improved by an in-memory structure that caches database query results.
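
The idea of an in-memory structure that caches database query results can be sketched as a read-through cache. The code below is only an illustration of that general technique, not the CachedStore implementation from the attached patches. Every name in it (MetadataStore, SlowStore, CachingStore, getTable, invalidate) is hypothetical, and it leaves out concerns that the sub-tasks address, such as prewarm, background refresh, and memory limits.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class ReadThroughCacheSketch {

    /** Hypothetical stand-in for a metadata lookup backed by a SQL database. */
    interface MetadataStore {
        String getTable(String dbName, String tableName);
    }

    /** Simulates the backing database and counts how often it is queried. */
    static final class SlowStore implements MetadataStore {
        final AtomicInteger queries = new AtomicInteger();

        @Override
        public String getTable(String dbName, String tableName) {
            queries.incrementAndGet();
            return "metadata(" + dbName + "." + tableName + ")";
        }
    }

    /** Read-through cache: repeated lookups are answered from memory. */
    static final class CachingStore implements MetadataStore {
        private final MetadataStore delegate;
        private final Map<String, String> cache = new ConcurrentHashMap<>();

        CachingStore(MetadataStore delegate) {
            this.delegate = delegate;
        }

        @Override
        public String getTable(String dbName, String tableName) {
            String key = dbName + "." + tableName;
            // Only the first lookup for a key reaches the delegate.
            return cache.computeIfAbsent(key, k -> delegate.getTable(dbName, tableName));
        }

        /** Drops a cached entry so the next read goes back to the database. */
        void invalidate(String dbName, String tableName) {
            cache.remove(dbName + "." + tableName);
        }
    }

    public static void main(String[] args) {
        SlowStore db = new SlowStore();
        CachingStore store = new CachingStore(db);
        store.getTable("default", "sales");
        store.getTable("default", "sales");
        // Two reads, but only the first one hit the simulated database.
        System.out.println("database queries: " + db.queries.get()); // prints "database queries: 1"
    }
}
```

A cache like this trades freshness for latency: any write that bypasses the cache leaves a stale entry until it is invalidated or refreshed.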

      Attachments

        1. HIVE-16520-proto-2.patch
          84 kB
          Daniel Dai
        2. HIVE-16520-proto.patch
          83 kB
          Daniel Dai
        3. HIVE-16520-1.patch
          105 kB
          Daniel Dai
        4. HIVE-16520.4.patch
          107 kB
          Daniel Dai
        5. HIVE-16520.3.patch
          106 kB
          Daniel Dai
        6. HIVE-16520.2.patch
          107 kB
          Daniel Dai

        Issue Links

          1. Fix Unit test failures when CachedStore is enabled [Sub-task, Closed, Daniel Dai]
          2. CachedStore: improvements to partition col stats caching and cache column stats for unpartitioned table [Sub-task, Closed, Vaibhav Gumashta]
          3. CachedStore: make prewarm and background cache update multithreaded [Sub-task, Open, Unassigned]
          4. Fix remaining unit test failures when CachedStore is enabled [Sub-task, Open, Daniel Dai]
          5. CachedStore: Store cached partitions/col stats within the table cache and make prewarm non-blocking [Sub-task, Resolved, Vaibhav Gumashta]
          6. CachedStore: Prioritize loading of recently accessed tables during prewarm [Sub-task, Closed, Vaibhav Gumashta]
          7. CachedStore: Investigate TestCachedStore#testTableColStatsOps [Sub-task, Open, Unassigned]
          8. CachedStore: Use metastore notification log events to update cache [Sub-task, Resolved, mahesh kumar behera]
          9. CachedStore leak PersistenceManager resources [Sub-task, Closed, Daniel Dai]
          10. NPE during CachedStore refresh [Sub-task, Closed, Daniel Dai]
          11. CachedStore - wait for prewarm at use time, not init time [Sub-task, Closed, Sergey Shelukhin]
          12. CachedStore: Have a whitelist/blacklist config to allow selective caching of tables/partitions and allow read while prewarming [Sub-task, Closed, Daniel Dai]
          13. CachedStore: prewarm improvement (avoid multiple sql calls to read partition column stats), refactoring and caching some aggregate stats [Sub-task, Closed, Vaibhav Gumashta]
          14. CachedStore: Use memory estimation to limit cache size during prewarm [Sub-task, Closed, Vaibhav Gumashta]
          15. CachedStore: bug fixes for TestEmbeddedHiveMetaStore, TestRemoteHiveMetaStore, TestMiniLlapCliDriver, TestMiniTezCliDriver, TestMinimrCliDriver [Sub-task, Resolved, Vaibhav Gumashta]
          16. CachedStore: bug fixes for q file tests: TestMiniLlapCliDriver, TestMiniTezCliDriver, TestMinimrCliDriver [Sub-task, Resolved, Vaibhav Gumashta]
          17. CachedStore: Fix UT when CachedStore is enabled [Sub-task, Patch Available, Unassigned]
          18. CachedStore: Add more UT coverage (outside of .q files) [Sub-task, Resolved, Vaibhav Gumashta]
          19. CachedStore: Run a select q file tests with CachedStore enabled [Sub-task, Open, Unassigned]
          20. CachedStore: Background refresh thread bug fixes [Sub-task, Resolved, Vaibhav Gumashta]

            People

              Assignee: Daniel Dai (daijy)
              Reporter: Daniel Dai (daijy)
              Votes: 0
              Watchers: 17

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 20m