Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
The column stats for partition (dummy=1) of the table sample_pt are as follows:
hive> describe formatted sample_pt partition (dummy=1) code;
OK
# col_name  data_type  min  max  num_nulls  distinct_count  avg_col_len  max_col_len  num_trues  num_falses  comment
code        string               0          303             6.985        7                                   from deserializer
Time taken: 0.259 seconds, Fetched: 3 row(s)
But when this partition is renamed, say
alter table sample_pt partition (dummy=1) rename to partition (dummy=11);
COLUMN_STATS_ACCURATE in the partition description still reports COLUMN_STATS as true for every column, but the column stats have actually all been deleted.
hive> describe formatted sample_pt partition (dummy=11);
OK
# col_name            data_type    comment
code                  string
description           string
salary                int
total_emp             int

# Partition Information
# col_name            data_type    comment
dummy                 int

# Detailed Partition Information
Partition Value:      [11]
Database:             default
Table:                sample_pt
CreateTime:           Thu Mar 30 23:03:59 EDT 2017
LastAccessTime:       UNKNOWN
Location:             file:/user/hive/warehouse/apache/sample_pt/dummy=11
Partition Parameters:
    COLUMN_STATS_ACCURATE    {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
    numFiles                 1
    numRows                  200
    rawDataSize              10228
    totalSize                10428
    transient_lastDdlTime    1490929439

# Storage Information
SerDe Library:        org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat:          org.apache.hadoop.mapred.TextInputFormat
OutputFormat:         org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed:           No
Num Buckets:          -1
Bucket Columns:       []
Sort Columns:         []
Storage Desc Params:
    serialization.format     1
Time taken: 6.783 seconds, Fetched: 37 row(s)
===
hive> describe formatted sample_pt partition (dummy=11) code;
OK
# col_name            data_type    comment
code                  string       from deserializer
Time taken: 9.429 seconds, Fetched: 3 row(s)
The column stats should not be dropped when a partition is renamed.
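The steps above can be put together as a reproduction script. This is only a sketch. The report does not show how sample_pt was created or loaded, so the CREATE TABLE statement (built from the column list in the describe output) and the assumption that the stats came from ANALYZE ... FOR COLUMNS are guesses. The describe and rename statements are taken from the report.

```sql
-- Assumed setup: schema inferred from the describe output above;
-- storage format and data load are not shown in the report.
CREATE TABLE sample_pt (
  code        STRING,
  description STRING,
  salary      INT,
  total_emp   INT
)
PARTITIONED BY (dummy INT);

-- (load some rows into partition dummy=1 here)

-- Assumed: column stats were gathered with ANALYZE before the rename.
ANALYZE TABLE sample_pt PARTITION (dummy=1) COMPUTE STATISTICS FOR COLUMNS;

-- Column stats are present for the original partition.
DESCRIBE FORMATTED sample_pt PARTITION (dummy=1) code;

-- Rename the partition, as in the report.
ALTER TABLE sample_pt PARTITION (dummy=1) RENAME TO PARTITION (dummy=11);

-- Partition parameters still claim COLUMN_STATS are accurate ...
DESCRIBE FORMATTED sample_pt PARTITION (dummy=11);

-- ... but the per-column stats are gone.
DESCRIBE FORMATTED sample_pt PARTITION (dummy=11) code;
```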