Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Impala 2.6.0
Description
After "compute incremental stats" has been run on a table, subsequent calls are designed to compute stats only for partitions that have no statistics yet. However, when statistics are found to be missing for a column (for example, a column added to the schema since the last computation), incremental stats are recomputed for all partitions. Impala does not currently compute statistics for complex-typed columns such as arrays and structs. As a result, stats for these columns are always reported as missing, which incorrectly triggers re-computation for all partitions on every run. Missing stats for complex columns on partitions that already have stats should be ignored when deciding whether re-computation is necessary, because re-computing stats will never fill them in.
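The proposed behavior can be illustrated with a minimal Python sketch. This is not Impala's actual code: `Column`, `Partition`, `COMPLEX_TYPES`, and `partitions_to_compute` are hypothetical names introduced only for illustration, and the set of complex type names is an assumption. The idea is to exclude complex-typed columns before checking whether any partition with existing stats is missing column stats.

```python
from dataclasses import dataclass, field

# Assumption for this sketch: type names treated as complex (no column stats).
COMPLEX_TYPES = {"ARRAY", "MAP", "STRUCT"}


@dataclass
class Column:
    name: str
    type_name: str  # e.g. "INT", "STRING", "ARRAY"

    def is_complex(self) -> bool:
        return self.type_name.upper() in COMPLEX_TYPES


@dataclass
class Partition:
    name: str
    has_stats: bool = False
    # Names of columns covered by this partition's existing incremental stats.
    cols_with_stats: set = field(default_factory=set)


def partitions_to_compute(columns, partitions):
    """Return the names of partitions whose stats should be (re)computed."""
    # Only columns that can ever receive stats count as "missing".
    eligible = {c.name for c in columns if not c.is_complex()}

    # If a partition with stats lacks stats for an eligible column
    # (e.g. a newly added scalar column), recompute every partition.
    for p in partitions:
        if p.has_stats and eligible - p.cols_with_stats:
            return [q.name for q in partitions]

    # Otherwise compute stats only for partitions that have none yet.
    return [p.name for p in partitions if not p.has_stats]
```

With a table containing a scalar column `id` and an array column `tags`, a partition that already has stats for `id` is left alone; only adding a new scalar column would trigger re-computation for all partitions.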