  1. IMPALA
  2. IMPALA-4854

COMPUTE INCREMENTAL STATS should ignore missing stats on complex columns


Details

    Description

      After "compute incremental stats" has been run on a table, subsequent runs are designed to compute stats only for partitions that have no statistics yet. However, when statistics are found to be missing for a column (e.g. a column was added to the schema since the last computation), incremental stats are recomputed for all partitions. Impala does not currently compute statistics for complex columns such as arrays and structs, so stats for these columns are always found to be missing, which incorrectly causes stats to be recomputed for all partitions on every run. Missing stats for complex columns on partitions that already have stats should be ignored when determining whether recomputation is necessary, because recomputing stats will never remedy the situation.
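
      The decision described above can be illustrated with a small sketch. This is not Impala's actual code: the `Partition` class, the `partitions_to_recompute` function, and the `ignore_complex` flag are hypothetical names invented for illustration, and per-partition column stats are modeled simply as a set of column names. It only shows why a complex column forces every partition to be recomputed under the current rule, and how skipping complex columns avoids that.

```python
from dataclasses import dataclass, field

# Complex type families; assumed here to be identified by the leading
# keyword of the type string, e.g. "ARRAY<STRING>".
COMPLEX_TYPES = {"ARRAY", "MAP", "STRUCT"}


@dataclass
class Partition:
    """Hypothetical model of a partition and its stored incremental stats."""
    name: str
    # Names of the columns for which this partition already has stats.
    cols_with_stats: set = field(default_factory=set)


def is_complex(col_type: str) -> bool:
    """Return True if the type string names an ARRAY, MAP, or STRUCT."""
    return col_type.split("<", 1)[0].strip().upper() in COMPLEX_TYPES


def partitions_to_recompute(schema, partitions, ignore_complex=True):
    """Return the names of partitions that are missing required column stats.

    schema maps column name -> type string. With ignore_complex=False,
    complex columns count as required, which models the reported behavior:
    their stats are never present, so every partition is selected.
    """
    required = {
        col for col, col_type in schema.items()
        if not (ignore_complex and is_complex(col_type))
    }
    return [p.name for p in partitions if not required <= p.cols_with_stats]


schema = {"id": "INT", "tags": "ARRAY<STRING>"}
partitions = [
    Partition("year=2016", {"id"}),  # stats computed on an earlier run
    Partition("year=2017"),          # new partition, no stats yet
]

reported = partitions_to_recompute(schema, partitions, ignore_complex=False)
proposed = partitions_to_recompute(schema, partitions, ignore_complex=True)
```

      In this sketch, `reported` contains both partitions because the `tags` column can never have stats, while `proposed` contains only `year=2017`, the one partition that genuinely lacks statistics.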


          People

            alex.behm Alexander Behm
            ngsalmon Nathan Salmon
            Votes: 1
            Watchers: 5

            Dates

              Created:
              Updated:
              Resolved: