Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
Description
When a query against a bucketed table filters on the bucketing column, only the buckets that can contain matching rows need to be scanned, which improves performance.
Given a table test:
CREATE TABLE test (x INT, y STRING) CLUSTERED BY ( x ) INTO 10 BUCKETS;
The following query only has to scan bucket 5:
SELECT * FROM test WHERE x=5;
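The pruning decision for the query above can be sketched as follows. This is a minimal illustration, not Hive's implementation: it assumes that an INT value hashes to itself and that the bucket is the non-negative hash modulo the bucket count. The function names `bucket_for_int` and `buckets_to_scan` are hypothetical.

```python
def bucket_for_int(value: int, num_buckets: int) -> int:
    """Return the 0-based bucket index for an INT bucketing-column value.

    Assumption: an INT hashes to its own value, and the bucket is the
    non-negative 31-bit hash modulo the number of buckets.
    """
    hash_code = value  # assumed hash of an INT column value
    return (hash_code & 0x7FFFFFFF) % num_buckets


def buckets_to_scan(predicate_values, num_buckets):
    """Return the sorted bucket indexes that an equality or IN predicate can touch."""
    return sorted({bucket_for_int(v, num_buckets) for v in predicate_values})


# WHERE x = 5 on a table clustered into 10 buckets
print(buckets_to_scan([5], 10))
```

Under these assumptions, the predicate x=5 on a table with 10 buckets maps to bucket index 5 only, so the remaining nine buckets could be skipped. A predicate such as x IN (5, 15, 7) would map to buckets 5 and 7.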
Attachments
Issue Links
- duplicates
  - HIVE-11525 Bucket pruning (Closed)
- is duplicated by
  - HIVE-4926 Queries which specify clustered-by keys as constants will still scan all buckets (Open)
- is related to
  - HIVE-1662 Add file pruning into Hive. (Patch Available)