Details
- Type: Sub-task
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Impala 3.1.0
- None
Description
Incremental statistics currently store all of their required data in catalogd and in every impalad coordinator, even though this data is needed only while incremental statistics are being computed. When incremental statistics are used on tables with many columns, many partitions, or both, this data can dominate the overall memory footprint. This can lead to OOMs, increased network usage, and instability.
Add an option to avoid propagating incremental stats to all coordinators and, instead, pull them on demand from the catalog only when the COMPUTE INCREMENTAL STATS statement needs them.
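The difference between the two behaviors can be sketched in a few lines of Python. This is only an illustrative model, not Impala code: the `Catalog` and `Coordinator` classes, the `pull_on_demand` field, and every method name are hypothetical, and the stats are treated as opaque byte blobs.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Hypothetical stand-in for catalogd, which always holds the stats."""
    # Maps (table, partition) -> opaque intermediate stats blob.
    incremental_stats: dict = field(default_factory=dict)
    fetch_count: int = 0

    def table_update(self, table, include_stats):
        # Metadata update sent to coordinators; stats are attached only on request.
        partitions = sorted(p for (t, p) in self.incremental_stats if t == table)
        update = {"table": table, "partitions": partitions, "stats": {}}
        if include_stats:
            update["stats"] = {p: self.incremental_stats[(table, p)] for p in partitions}
        return update

    def fetch_incremental_stats(self, table, partitions):
        # On-demand path: return stats for just the requested partitions.
        self.fetch_count += 1
        return {p: self.incremental_stats[(table, p)] for p in partitions}


@dataclass
class Coordinator:
    """Hypothetical stand-in for an impalad coordinator."""
    catalog: Catalog
    pull_on_demand: bool = True  # assumed name for the proposed option
    cache: dict = field(default_factory=dict)

    def refresh(self, table):
        # With the option on, the cached metadata carries no incremental stats.
        self.cache[table] = self.catalog.table_update(
            table, include_stats=not self.pull_on_demand)

    def cached_stats_bytes(self):
        return sum(len(blob)
                   for update in self.cache.values()
                   for blob in update["stats"].values())

    def compute_incremental_stats(self, table, partitions):
        # The only operation that needs the stats: pull them now or read the cache.
        if self.pull_on_demand:
            stats = self.catalog.fetch_incremental_stats(table, partitions)
        else:
            stats = {p: self.cache[table]["stats"][p] for p in partitions}
        return sum(len(blob) for blob in stats.values())
```

In this model, a coordinator with `pull_on_demand=True` keeps zero bytes of incremental stats after `refresh` and contacts the catalog only inside `compute_incremental_stats`, at the cost of one extra fetch per statement.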
Issue Links
- is related to: IMPALA-7535 CatalogdMetaProvider should fetch incremental stats data on-demand (Open)
- relates to: IMPALA-7614 Impala 3.1 Doc: Document the New Invalidate Options (Closed)