[SPARK-18572] Use the hive client method "getPartitionNames" to answer "SHOW PARTITIONS" queries on partitioned Hive tables - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 2.1.0
Component/s: SQL
Labels: None

Description

Currently, Spark answers a SHOW PARTITIONS query by fetching all of the table's partition metadata from the external catalog and constructing the partition names from it. The Hive client has a getPartitionNames method that is orders of magnitude faster, and the improvement grows with the number of partitions in the table. I believe we can use this method to great effect.

Further details are provided in the associated PR.
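
To make the current behaviour concrete, here is a minimal sketch of what "constructing partition names from partition metadata" amounts to. It is not Spark's or Hive's code: the function names, the dictionary layout, and the sample table are all hypothetical, and the character escaping that Hive applies to partition values is left out. The only thing taken as given is the usual Hive-style name format, col=value/col=value.

```python
from typing import Dict, List

# Hypothetical sample data: each dict stands in for one full partition
# metadata record (a real record also carries storage and other details).
PARTITION_COLUMNS: List[str] = ["ds", "hr"]
PARTITIONS: List[Dict] = [
    {"spec": {"ds": "2016-11-23", "hr": "00"}, "location": "/warehouse/t/ds=2016-11-23/hr=00"},
    {"spec": {"ds": "2016-11-23", "hr": "01"}, "location": "/warehouse/t/ds=2016-11-23/hr=01"},
]


def partition_name(columns: List[str], spec: Dict[str, str]) -> str:
    """Build one Hive-style partition name, e.g. 'ds=2016-11-23/hr=00'.

    Assumption: values need no escaping; real Hive escapes special characters.
    """
    # Use the table's partition column order, not the dict's key order.
    return "/".join(f"{col}={spec[col]}" for col in columns)


def names_from_metadata(columns: List[str], partitions: List[Dict]) -> List[str]:
    """Derive every partition name from full metadata records.

    The caller only needs the names, yet every full record has to be
    fetched first, which is the cost the issue describes.
    """
    return [partition_name(columns, p["spec"]) for p in partitions]


if __name__ == "__main__":
    print(names_from_metadata(PARTITION_COLUMNS, PARTITIONS))
    # prints ['ds=2016-11-23/hr=00', 'ds=2016-11-23/hr=01']
```

A names-only call such as getPartitionNames would return the same list of strings without materialising the full records, so the work saved grows with the partition count.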

Issue Links

links to

[Github] Pull Request #15998 (mallman)

People

Assignee: Michael MacFadden

Reporter: Michael MacFadden

Votes: 0

Watchers: 2

Dates

Created: 23/Nov/16 21:43

Updated: 06/Dec/16 03:34

Resolved: 06/Dec/16 03:34