[HIVE-12165] wrong result when hive.optimize.sampling.orderby=true with some aggregate functions - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Affects Version/s: 2.1.0
Fix Version/s: 2.0.1, 2.1.0
Component/s: None
Labels:
None
Environment:

hortonworks 2.3

Description

This simple query give wrong result , when , i use the parallel order .

select count(*) , count(distinct dummyint ) , min(dummyint),max(dummyint) from foobar_1M ;

Current wrong result :

c0	c1	c2	c3
32740	32740	0	163695
113172	113172	163700	729555
54088	54088	729560	999995

Right result :

c0	c1	c2	c3
1000000	1000000	0	999999

The sql script for my test

drop table foobar_1 ;
create table foobar_1 ( dummyint int  , dummystr string ) ;
insert into table foobar_1 select count(*),'dummy 0'  from foobar_1 ;

drop table foobar_1M ;
create table foobar_1M ( dummyint bigint  , dummystr string ) ;

insert overwrite table foobar_1M
       select val_int  , concat('dummy ',val_int) from
             ( select ((((((d_1*10)+d_2)*10+d_3)*10+d_4)*10+d_5)*10+d_6) as val_int from foobar_1
                 lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) tbl_1 as d_1
                 lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) tbl_2 as d_2
                 lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) tbl_3 as d_3
                 lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) tbl_4 as d_4
                 lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) tbl_5 as d_5
                 lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) tbl_6 as d_6  ) as f                 ;


set hive.optimize.sampling.orderby.number=10000;
set hive.optimize.sampling.orderby.percent=0.1f;
set mapreduce.job.reduces=3 ;

set hive.optimize.sampling.orderby=false;

select count(*) , count(distinct dummyint ) , min(dummyint),max(dummyint) from foobar_1M ;

set hive.optimize.sampling.orderby=true;

select count(*) , count(distinct dummyint ) , min(dummyint),max(dummyint) from foobar_1M ;

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending

HIVE-12165.patch
17/Feb/16 22:25
1 kB
Aihua Xu

Activity

People

Assignee:: Aihua Xu

Reporter:: ErwanMAS

Votes:: 0 Vote for this issue

Watchers:: 5 Start watching this issue

Dates

Created:: 13/Oct/15 21:33

Updated:: 21/Jun/16 15:52

Resolved:: 22/Feb/16 15:22