Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
None
-
None
Description
For some tables, we might not have the column stats available. However, if the table is partitioned, we will have the stats for partition columns.
When we estimate the size of the data produced by a join operator, we end up using only the columns that are available for the calculation e.g. partition columns in this case.
However, even in these cases, we should add the data size for those columns for which we do not have stats (default size for the column type x estimated number of rows).
To reproduce, the following example can be used:
create table sample_partitioned (x int) partitioned by (y int); insert into sample_partitioned partition(y=1) values (1),(2); create temporary table sample as select * from sample_partitioned; analyze table sample compute statistics for columns; explain select sample_partitioned.x from sample_partitioned, sample where sample.y = sample_partitioned.y;