[HIVE-1940] Query Optimization Using Column Statistics and Histograms - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Major
Resolution: Duplicate
Affects Version/s: None
Fix Version/s: None
Component/s: Metastore, Query Processor, Statistics
Labels: None
Tags: MetaStore

Description

Cost-based query optimization in Hive currently relies on statistics gathered at the table and partition level. The next step toward further optimization is to design and implement the collection of column-level statistics, as discussed in issue ~~HIVE-33~~. Building on that, histograms are one possible way to collect and use run-time statistics. Beyond implementing these features, a consistent storage model for the statistics must also be developed for the MetaStore.
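To illustrate how a column histogram could feed a cost-based optimizer, here is a minimal sketch in Python. It is not Hive code and does not reflect any design in the attached documents: the equi-width bucketing, the uniform-within-bucket assumption, and the names `EquiWidthHistogram`, `build_histogram`, and `estimate_selectivity_le` are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class EquiWidthHistogram:
    """Hypothetical per-column statistic: row counts in equal-width buckets."""
    lo: float          # smallest observed column value
    hi: float          # largest observed column value
    counts: List[int]  # number of rows falling into each bucket

    @property
    def total(self) -> int:
        return sum(self.counts)

    @property
    def width(self) -> float:
        return (self.hi - self.lo) / len(self.counts)


def build_histogram(values: Sequence[float], num_buckets: int) -> EquiWidthHistogram:
    """Scan the column once and count the rows per bucket."""
    if not values or num_buckets < 1:
        raise ValueError("need at least one value and one bucket")
    lo, hi = min(values), max(values)
    span = hi - lo
    counts = [0] * num_buckets
    for v in values:
        if span == 0:
            idx = 0  # all values identical: everything lands in bucket 0
        else:
            # The maximum value maps to num_buckets, so clamp it into the last bucket.
            idx = min(int((v - lo) / span * num_buckets), num_buckets - 1)
        counts[idx] += 1
    return EquiWidthHistogram(lo, hi, counts)


def estimate_selectivity_le(hist: EquiWidthHistogram, bound: float) -> float:
    """Estimate the fraction of rows with value <= bound.

    Assumes values are uniformly distributed inside each bucket.
    """
    if bound < hist.lo:
        return 0.0
    if bound >= hist.hi:
        return 1.0
    pos = (bound - hist.lo) / hist.width
    full = min(int(pos), len(hist.counts) - 1)  # buckets entirely below the bound
    frac = pos - full                            # covered share of the partial bucket
    rows = sum(hist.counts[:full]) + frac * hist.counts[full]
    return rows / hist.total
```

An optimizer could multiply such an estimate by the table's row count to predict the output size of a filter like `col <= bound` and compare alternative plans on that basis. An equi-depth histogram would usually handle skewed data better; equi-width is used here only because it is the shortest to write down.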

Attachments

- HiveMetaStore.pdf (221 kB), 15/Feb/11 20:16, Anja Gruenheid
- Agruenheid_ideas11.pdf (253 kB), 14/May/12 18:32, Carl Steinbach

Issue Links

duplicates

- HIVE-1362 Optimizer statistics on columns in tables and partitions (Closed)
- HIVE-1938 Cost Based Query optimization for Joins in Hive (Resolved)

is related to

- HIVE-33 [Hive]: Add optimizer statistics in Hive (Resolved)
- HIVE-1938 Cost Based Query optimization for Joins in Hive (Resolved)

People

Assignee: Unassigned

Reporter: Anja Gruenheid

Votes: 0

Watchers: 12

Dates

Created: 02/Feb/11 00:51

Updated: 23/Jan/13 21:05

Resolved: 12/Jun/12 00:15