Details
Type: Improvement
Status: Open
Priority: Normal
Resolution: Unresolved
None
Description
I've heard from several users that they would like predicate pushdown functionality for Hadoop-based services (Hive in particular).
Example use case
Table with wide partitions, one per customer
Application team has HQL they would like to run on a single customer
Currently, time to complete scales with the number of customers, since the InputFormat can't push down a primary key predicate
The current implementation requires a full table scan, since it can't recognize that a single partition was specified
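To make the cost difference concrete, here is a minimal sketch of the two read paths. It is an in-memory model only: the class `PushdownSketch` and its methods are hypothetical names made up for illustration, not the existing InputFormat or any Hive API. It only shows why a full scan touches every customer partition, while a pushed-down key predicate touches one.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model; NOT the actual Hadoop/Hive InputFormat API.
class PushdownSketch {
    // Partition key (customer id) -> rows in that customer's wide partition.
    private final Map<String, List<String>> partitions = new LinkedHashMap<>();

    // Number of partitions touched by the most recent read.
    int partitionsRead = 0;

    void insert(String customer, String row) {
        partitions.computeIfAbsent(customer, k -> new ArrayList<>()).add(row);
    }

    // Without pushdown: every partition is read, and the key filter is
    // applied afterwards, so cost grows with the number of customers.
    List<String> fullScan(String customer) {
        partitionsRead = 0;
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : partitions.entrySet()) {
            partitionsRead++;
            if (e.getKey().equals(customer)) {
                out.addAll(e.getValue());
            }
        }
        return out;
    }

    // With pushdown: the key predicate reaches the reader, which fetches
    // only the matching partition.
    List<String> pushdownScan(String customer) {
        partitionsRead = 0;
        List<String> rows = partitions.get(customer);
        if (rows == null) {
            return new ArrayList<>();
        }
        partitionsRead = 1;
        return new ArrayList<>(rows);
    }

    public static void main(String[] args) {
        PushdownSketch table = new PushdownSketch();
        table.insert("acme", "order-1");
        table.insert("acme", "order-2");
        table.insert("globex", "order-3");
        table.insert("initech", "order-4");

        List<String> scanned = table.fullScan("acme");
        System.out.println("full scan: " + scanned + ", partitions read = " + table.partitionsRead);

        List<String> pushed = table.pushdownScan("acme");
        System.out.println("pushdown:  " + pushed + ", partitions read = " + table.partitionsRead);
    }
}
```

Both paths return the same rows. Only the number of partitions read differs, and that is the part that scales with the customer count in the use case above.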