Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Duplicate
0.13.0
None
Description
The column statistics appear to be wrong on two measures for numeric columns: min (which is always 0) and distinct count. Here is an example query, which returns a distinct count of 9 and a min of 1:
select count(distinct avgTimeOnSite), min(avgTimeOnSite) from UserVisits_web_text_none;
...
OK
9 1
Time taken: 9.747 seconds, Fetched: 1 row(s)
The statistics for the same column report a min of 0 and a distinct count of 11:
desc formatted UserVisits_web_text_none avgTimeOnSite
...
# col_name data_type min max num_nulls distinct_count avg_col_len max_col_len num_trues num_falses comment
avgTimeOnSite int 0 9 0 11 null null null
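The min symptom (statistics report 0 while the query returns 1, with max correct at 9) matches what happens when a running min/max is seeded with 0 instead of with the first observed value. The sketch below only illustrates that pattern; it is not taken from the Hive source, and the function names and sample values are hypothetical. It does not address the distinct-count mismatch.

```python
def range_zero_seeded(values):
    """Track min/max starting from 0 (the suspected faulty pattern, assumed here)."""
    low = high = 0
    for v in values:
        low = min(low, v)
        high = max(high, v)
    return low, high


def range_first_value_seeded(values):
    """Track min/max starting from the first observed value."""
    it = iter(values)
    low = high = next(it)  # raises StopIteration on empty input
    for v in it:
        low = min(low, v)
        high = max(high, v)
    return low, high


# Hypothetical all-positive column values, chosen to mirror the report.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]

zero_seeded = range_zero_seeded(sample)
first_seeded = range_first_value_seeded(sample)
print(zero_seeded, first_seeded)
```

With all-positive input the zero-seeded version keeps 0 as the minimum while the maximum is still correct; with all-negative input the maximum would be the value stuck at 0 instead.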
Issue Links
- duplicates: HIVE-4561 Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0) (Closed)
- is related to: HIVE-4561 Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0) (Closed)