Details
- Improvement
- Status: Open
- Major
- Resolution: Unresolved
- Impala 2.9.0
- None
Description
Parquet statistics such as num_rows and parquet::Statistics can be used in several ways to speed up aggregation queries that compute min/max/count. Some of these improvements can be made at execution time only; others also require query-plan modifications. The subtasks illustrate the various optimization opportunities and dimensions, and can be tackled separately.
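To make the idea concrete, here is a minimal Python sketch of answering count(*), min and max from footer metadata alone, for a query with no predicates. It is not Impala's implementation: `RowGroupMeta`, `ColumnStats`, `count_star` and `min_max` are hypothetical names that only model the num_rows and min/max fields mentioned above, and the sketch assumes integer columns whose recorded statistics are trustworthy.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ColumnStats:
    """Hypothetical stand-in for the min/max fields of parquet::Statistics."""
    min_value: Optional[int]
    max_value: Optional[int]


@dataclass
class RowGroupMeta:
    """Hypothetical stand-in for one row group's footer metadata."""
    num_rows: int
    stats: Optional[ColumnStats]  # None when the writer recorded no statistics


def count_star(row_groups: List[RowGroupMeta]) -> int:
    # count(*) without predicates needs only num_rows; no data pages are read.
    return sum(rg.num_rows for rg in row_groups)


def min_max(row_groups: List[RowGroupMeta]) -> Optional[Tuple[int, int]]:
    """Return (min, max) from metadata alone, or None if a scan is required."""
    mins, maxs = [], []
    for rg in row_groups:
        if rg.num_rows == 0:
            continue  # an empty row group contributes nothing
        s = rg.stats
        if s is None or s.min_value is None or s.max_value is None:
            return None  # statistics missing: fall back to reading the data
        mins.append(s.min_value)
        maxs.append(s.max_value)
    if not mins:
        return None  # no usable row groups: let the regular scan path decide
    return (min(mins), max(maxs))


row_groups = [
    RowGroupMeta(num_rows=100, stats=ColumnStats(3, 40)),
    RowGroupMeta(num_rows=50, stats=ColumnStats(-7, 12)),
]
print(count_star(row_groups))  # 150
print(min_max(row_groups))     # (-7, 40)
```

The fallback to `None` reflects the conservative choice in this sketch: whenever any row group lacks statistics, the result cannot be derived from metadata and a regular scan is still needed.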
Issue Links
- is related to: IMPALA-7547 For distinct queries use dictionary encoded page instead of reading all data (Open)