Description
Refactor ColumnStat to be more flexible.
- Split CatalogColumnStat from ColumnStat, just as CatalogStatistics is split from Statistics. This decouples how the statistics are stored from how they are processed in the query plan. CatalogColumnStat keeps min and max as String, so it does not depend on dataType information.
- For CatalogColumnStat, parse column names from property names in the metastore (the {{KEY_VERSION}} property), not from the metastore schema. This allows the catalog to read stats into {{CatalogColumnStat}}s even if the schema itself is not in the metastore.
- Make all fields optional. min, max and histogram for columns were optional already. Making every field optional is more consistent, and gives the flexibility to, e.g., drop some of the fields through transformations if they are difficult or impossible to calculate.
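
The three changes above can be sketched roughly as follows. This is a simplified, self-contained illustration, not the actual Spark code: the {{DataType}} stand-ins, the field set, the {{"<column>.<field>"}} property layout, the value of {{KEY_VERSION}}, and the helper names {{toPlanStat}}, {{columnNames}} and {{fromMap}} are all assumptions made for the example.

```scala
// Illustrative sketch only: simplified stand-ins, not the real Spark classes.
sealed trait DataType
case object IntegerType extends DataType
case object StringType extends DataType

// Plan-side statistics: min/max are typed values; every field is optional.
case class ColumnStat(
    distinctCount: Option[BigInt] = None,
    min: Option[Any] = None,
    max: Option[Any] = None,
    nullCount: Option[BigInt] = None)

// Catalog-side statistics: min/max stay as strings, so storing them
// requires no dataType information.
case class CatalogColumnStat(
    distinctCount: Option[BigInt] = None,
    min: Option[String] = None,
    max: Option[String] = None,
    nullCount: Option[BigInt] = None) {

  // The data type is supplied only when converting for use in a query plan.
  def toPlanStat(dataType: DataType): ColumnStat = {
    def parse(s: String): Any = dataType match {
      case IntegerType => s.toInt
      case StringType  => s
    }
    ColumnStat(distinctCount, min.map(parse), max.map(parse), nullCount)
  }
}

object CatalogColumnStat {
  // Assumed property layout: "<column>.version", "<column>.min", ...
  val KEY_VERSION = "version"

  // Recover column names from the "<column>.version" keys, not from a schema.
  def columnNames(props: Map[String, String]): Set[String] =
    props.keySet.collect {
      case k if k.endsWith("." + KEY_VERSION) => k.stripSuffix("." + KEY_VERSION)
    }

  // Build the catalog-side stat for one column; absent properties become None.
  def fromMap(colName: String, props: Map[String, String]): CatalogColumnStat =
    CatalogColumnStat(
      distinctCount = props.get(s"$colName.distinctCount").map(BigInt(_)),
      min = props.get(s"$colName.min"),
      max = props.get(s"$colName.max"),
      nullCount = props.get(s"$colName.nullCount").map(BigInt(_)))
}
```

In this sketch, the catalog can call {{columnNames}} and {{fromMap}} on the stored properties without knowing the table schema, and the plan side calls {{toPlanStat}} once the column's data type is available. Fields that were never stored, or that a transformation dropped, simply stay {{None}}.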
The added flexibility makes it possible to have alternative implementations for stats, and separates stats collection from stats and estimation processing in plans.