Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- None
- None
Description
create table claims(claim_rec_id bigint, claim_invoice_num string, typ_c int);
alter table claims update statistics set ('numRows'='1154941534','rawDataSize'='1135307527922');
set hive.stats.estimate=false;
explain extended select count(1) from claims where typ_c=3;
set hive.stats.ndv.estimate.percent=5e-7;
explain extended select count(1) from claims where typ_c=3;
Expected the standard /2 heuristic for a single filter (1154941534 / 2 = 577470767 rows), but the plan estimates 5 rows instead.
' Map Operator Tree:'
' TableScan'
' alias: claims'
' filterExpr: (typ_c = 3) (type: boolean)'
' Statistics: Num rows: 1154941534 Data size: 4388777832 Basic stats: COMPLETE Column stats: NONE'
' GatherStats: false'
' Filter Operator'
' isSamplingPred: false'
' predicate: (typ_c = 3) (type: boolean)'
' Statistics: Num rows: 5 Data size: 19 Basic stats: COMPLETE Column stats: NONE'
The NDV estimation is still in effect, because changing hive.stats.ndv.estimate.percent changes the filter's row estimate:
' filterExpr: (typ_c = 3) (type: boolean)'
' Statistics: Num rows: 1154941534 Data size: 4388777832 Basic stats: COMPLETE Column stats: NONE'
' GatherStats: false'
' Filter Operator'
' isSamplingPred: false'
' predicate: (typ_c = 3) (type: boolean)'
' Statistics: Num rows: 230988307 Data size: 877755567 Basic stats: COMPLETE Column stats: NONE'
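The two estimates above fit a simple model, sketched here in Python as a reading aid. This is an assumption, not a description of Hive's actual code: the planner is assumed to guess the column's NDV as hive.stats.ndv.estimate.percent of numRows (rounded down) and to estimate the equality filter as numRows / NDV (rounded to nearest). The 20 percent default and the function name estimated_filter_rows are also assumptions made for illustration.

```python
import math

NUM_ROWS = 1_154_941_534  # numRows set on the claims table in the report


def estimated_filter_rows(num_rows: int, ndv_percent: float) -> int:
    """Hypothetical model of the equality-filter estimate.

    Assumes NDV = floor(num_rows * ndv_percent / 100) and
    rows = round(num_rows / NDV). Not taken from Hive's source.
    """
    ndv = max(1, math.floor(num_rows * ndv_percent / 100.0))
    return round(num_rows / ndv)


# Assumed default of 20 percent: NDV is about 231 million, so about 5 rows per value.
default_rows = estimated_filter_rows(NUM_ROWS, 20.0)

# With 5e-7 percent: NDV drops to 5, so about one fifth of the table matches.
tiny_rows = estimated_filter_rows(NUM_ROWS, 5e-7)

# The /2 heuristic that the report expected for a single filter.
half_rows = NUM_ROWS // 2

print(default_rows)  # 5
print(tiny_rows)     # 230988307
print(half_rows)     # 577470767
```

Under these assumptions the model reproduces both reported values (5 and 230988307), which supports the observation that the NDV estimate, rather than the /2 heuristic, drives the filter's row count.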
Attachments
Issue Links
- is contained by: HIVE-26751 Bug Fixes and Improvements for 3.2.0 release (Open)
- is related to: HIVE-25985 Estimate stats gives out incorrect number of columns during query planning when using predicates like c=22 (Open)
- relates to: HIVE-21793 CBO retrieves column stats even if hive.stats.fetch.column.stats is set to false (Closed)
- links to