  1. Hive
  2. HIVE-19715

Consolidated and flexible API for fetching partition metadata from HMS


Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Standalone Metastore
    • Labels: None

    Description

      Currently, the HMS Thrift API exposes 17 different APIs for fetching partition-related information. This is the result of a combinatorial explosion: each API has variants with and without "auth" info, by pspecs vs. names, by filters, by exprs, etc. Keeping all of these separate APIs long term is a maintenance burden, and the number of near-duplicate calls is confusing for consumers.

      Additionally, even with all of these APIs, there is a lack of granularity in fetching only the information needed for a particular use case. For example, in some use cases it may be beneficial to only fetch the partition locations without wasting effort fetching statistics, etc.
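As a rough illustration of the granularity described above, the sketch below projects a list of partitions down to only the requested fields. The `Partition` class, the `project` helper, and the field names are hypothetical stand-ins and are not part of the HMS Thrift API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


# Hypothetical in-memory stand-in for partition metadata; not the HMS schema.
@dataclass
class Partition:
    name: str
    location: str
    parameters: Dict[str, str] = field(default_factory=dict)
    column_stats: Dict[str, Any] = field(default_factory=dict)


ALLOWED_FIELDS = {"name", "location", "parameters", "column_stats"}


def project(partitions: List[Partition], fields: List[str]) -> List[Dict[str, Any]]:
    """Return only the requested fields for each partition."""
    unknown = set(fields) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    return [{f: getattr(p, f) for f in fields} for p in partitions]


parts = [
    Partition("ds=2018-05-01", "hdfs://nn/warehouse/t/ds=2018-05-01", {"numRows": "10"}),
    Partition("ds=2018-05-02", "hdfs://nn/warehouse/t/ds=2018-05-02", {"numRows": "20"}),
]

# A caller that only needs locations never receives parameters or statistics.
locations_only = project(parts, ["name", "location"])
```

In a real server the projection would presumably be pushed down so that unrequested fields are never read from the backing store; this sketch only shows the shape of the request.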

      This JIRA proposes that we add a new "one API to rule them all" for fetching partition info. The request and response would be encapsulated in structs. Some desirable properties:

      • the request should be able to specify which pieces of information are required (eg location, properties, etc)
      • in the case of partition parameters, the request should be able to do either whitelisting or blacklisting (eg to exclude large incremental column stats HLL dumped in there by Impala)
      • the request should optionally specify auth info (to encompass the "with_auth" variants)
      • the request should be able to designate the set of partitions to access through one of several different methods (eg "all", list<name>, expr, part_vals, etc)
      • the struct should be easily evolvable so that new pieces of info can be added
      • the response should be designed in such a way as to avoid transferring redundant information for common cases (eg simple "dictionary coding" of strings like parameter names, etc)
      • the API should support some form of pagination for tables with large partition counts
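The properties listed above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the proposed design: `GetPartitionsRequest`, its fields, the glob-style include/exclude patterns, the integer page token, and the sample parameter keys are all hypothetical.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import Dict, List, Optional, Tuple


@dataclass
class GetPartitionsRequest:  # hypothetical name and fields
    db_name: str
    table_name: str
    fields: List[str] = field(default_factory=lambda: ["name", "location"])
    include_param_patterns: List[str] = field(default_factory=list)  # whitelist; empty = all
    exclude_param_patterns: List[str] = field(default_factory=list)  # blacklist
    partition_names: Optional[List[str]] = None  # None = all partitions
    page_size: int = 100
    page_token: int = 0  # offset of the first item of the requested page


def filter_params(params: Dict[str, str], include: List[str],
                  exclude: List[str]) -> Dict[str, str]:
    """Keep keys matching the whitelist (if any) and not matching the blacklist."""
    kept = {}
    for key, value in params.items():
        if include and not any(fnmatchcase(key, pat) for pat in include):
            continue
        if any(fnmatchcase(key, pat) for pat in exclude):
            continue
        kept[key] = value
    return kept


def dictionary_encode(param_maps: List[Dict[str, str]]
                      ) -> Tuple[List[str], List[Dict[int, str]]]:
    """Send each parameter name once; rows refer to names by index."""
    names: List[str] = []
    index: Dict[str, int] = {}
    encoded: List[Dict[int, str]] = []
    for params in param_maps:
        row = {}
        for key, value in params.items():
            if key not in index:
                index[key] = len(names)
                names.append(key)
            row[index[key]] = value
        encoded.append(row)
    return names, encoded


def paginate(items: list, page_token: int, page_size: int) -> Tuple[list, Optional[int]]:
    """Return one page plus the token for the next page (None when exhausted)."""
    page = items[page_token:page_token + page_size]
    next_token = page_token + page_size
    return page, (next_token if next_token < len(items) else None)


# Illustrative parameter maps for three partitions.
raw = [
    {"numRows": "10", "totalSize": "100", "impala_intermediate_stats_chunk0": "AAAA"},
    {"numRows": "20", "totalSize": "200"},
    {"numRows": "30", "totalSize": "300"},
]
req = GetPartitionsRequest("default", "t",
                           exclude_param_patterns=["impala_intermediate_stats_*"],
                           page_size=2)
filtered = [filter_params(p, req.include_param_patterns, req.exclude_param_patterns)
            for p in raw]
page, next_token = paginate(filtered, req.page_token, req.page_size)
names, encoded = dictionary_encode(page)
```

Keeping optional fields with defaults in a single request struct is what makes the API evolvable: a new piece of information becomes a new optional field rather than another method variant.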

      Attachments

        1. HIVE-19715-design-doc.pdf
          146 kB
          Vihang Karajgaonkar

            People

              Assignee: Vihang Karajgaonkar (vihangk1)
              Reporter: Todd Lipcon (tlipcon)
              Votes: 0
              Watchers: 23
