[HIVE-11525] Bucket pruning - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 0.13.0, 0.13.1, 0.14.0, 1.0.0, 1.1.0, 1.2.0, 1.3.0, 2.0.0
Fix Version/s: 2.0.0
Component/s: Logical Optimizer
Labels:
- TODOC2.0

Release Note:
Tez bucket pruning

Description

Logically and functionally, bucketing and partitioning are quite similar: both provide a mechanism to segregate and separate the table's data based on its content. Because of that, significant further optimisations such as [partition] PRUNING or [bucket] MAP JOIN are possible.
The difference seems to be imposed by design: PARTITIONing is open/explicit, while BUCKETing is discrete/implicit.
Partitioning seems to be very common, if not a standard feature, in all current RDBMS, while BUCKETING seems to be specific to Hive.
In a way, BUCKETING could also be called "hashing" or simply "IMPLICIT PARTITIONING".

Although these two are recognised as separate features in Hive, nothing should prevent leveraging the same existing query/join optimisations across both.

BUCKET pruning
Enable an optimisation equivalent to partition PRUNING for queries on BUCKETED tables.

The simplest example is a query like:
"SELECT … FROM x WHERE colA=123123"
which should read only the relevant bucket file rather than all the bucket files that belong to the table.
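
To make the idea concrete, here is a minimal Java sketch of how an equality predicate could be mapped to the single bucket that needs reading. It is not the code from the attached patches. The class and method names are hypothetical, and the bucket function (non-negative hash of the value, modulo the bucket count, with an int hashing to itself) and the bucket count of 32 are assumptions made for illustration.

```java
import java.util.BitSet;

// Hypothetical helper, not part of Hive: illustrates bucket pruning for an int column.
class BucketPruningSketch {

    // Assumed bucket function: clear the sign bit of the hash, then take it modulo
    // the number of buckets. For an int key the hash is assumed to be the value itself.
    static int bucketFor(int key, int numBuckets) {
        int hash = Integer.hashCode(key); // equals key for int values
        return (hash & Integer.MAX_VALUE) % numBuckets;
    }

    // For a predicate such as "colA IN (k1, k2, ...)", collect the set of buckets
    // that can contain matching rows. All other buckets can be skipped.
    static BitSet bucketsToRead(int[] keys, int numBuckets) {
        BitSet wanted = new BitSet(numBuckets);
        for (int key : keys) {
            wanted.set(bucketFor(key, numBuckets));
        }
        return wanted;
    }

    public static void main(String[] args) {
        int numBuckets = 32; // assumed bucket count of table x

        // WHERE colA=123123 touches exactly one bucket.
        System.out.println(bucketFor(123123, numBuckets)); // prints 19

        // Several keys may share a bucket (7 and 39 both map to 7 here).
        System.out.println(bucketsToRead(new int[] {123123, 7, 39}, numBuckets)); // prints {7, 19}
    }
}
```

Under these assumptions, the query above would scan one bucket file out of 32. The pruning is only valid if the data was written with the same bucket function that the reader uses to compute the bucket.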

Attachments

- HIVE-11525.1.patch, 03/Nov/15 08:25, 114 kB, Gopal Vijayaraghavan
- HIVE-11525.2.patch, 03/Nov/15 20:22, 195 kB, Gopal Vijayaraghavan
- HIVE-11525.3.patch, 06/Nov/15 08:25, 193 kB, Gopal Vijayaraghavan
- HIVE-11525.WIP.patch, 07/Oct/15 01:47, 27 kB, Takuya Fukudome

Issue Links

blocks

HIVE-12379 FetchTask conversion for BucketPruning

Open

is a clone of

HIVE-9523 For partitioned tables same optimizations should be available as for bucketed tables and vice versa: ①[Sort Merge] PARTITION Map join and ②BUCKET pruning

Open

is duplicated by

HIVE-5831 filter input files for bucketed tables

Resolved

is related to

HIVE-16177 non Acid to acid conversion doesn't handle _copy_N files

Closed

relates to

HIVE-14199 Enable Bucket Pruning for ACID tables

Resolved

links to

ReviewBoard #39916

People

Assignee: Gopal Vijayaraghavan

Reporter: Maciek Kocon

Votes: 0

Watchers: 16

Dates

Created: 11/Aug/15 22:07

Updated: 11/Oct/17 17:20

Resolved: 13/Nov/15 02:38