[HIVE-16600] Refactor SetSparkReducerParallelism#needSetParallelism to enable parallel order by in multi_insert cases - ASF JIRA

Details

Type: Sub-task
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 3.0.0
Component/s: None
Labels: None

Description

multi_insert_gby.case.q

set hive.exec.reducers.bytes.per.reducer=256;
set hive.optimize.sampling.orderby=true;
drop table if exists e1;
drop table if exists e2;
create table e1 (key string, value string);
create table e2 (key string);
FROM (select key, cast(key as double) as keyD, value from src order by key) a
INSERT OVERWRITE TABLE e1
    SELECT key, value
INSERT OVERWRITE TABLE e2
    SELECT key;

select * from e1;
select * from e2;

The parallelism of the Sort is 1 even though parallel order by is enabled ("hive.optimize.sampling.orderby" is set to "true"). This is not reasonable, because the parallelism should be calculated by Utilities.estimateReducers.
This happens because SetSparkReducerParallelism#needSetParallelism returns false when the RS has more than one child.
In this case, RS[2] has two children.
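
As a rough illustration of why a parallelism of 1 looks wrong here, the following sketch approximates the kind of estimate the description expects: input size divided by hive.exec.reducers.bytes.per.reducer, rounded up and clamped. It is a simplified stand-in, not the code of Utilities.estimateReducers. The class name, the method name, the 5000-byte input size, and the cap of 1009 reducers are all assumptions made for the example.

```java
public class ReducerEstimateSketch {

    // Simplified estimate (assumption, not Hive's implementation):
    // ceil(totalInputBytes / bytesPerReducer), clamped to [1, maxReducers].
    static int estimate(long totalInputBytes, long bytesPerReducer, int maxReducers) {
        long reducers = (totalInputBytes + bytesPerReducer - 1) / bytesPerReducer;
        reducers = Math.max(1, reducers);
        return (int) Math.min(maxReducers, reducers);
    }

    public static void main(String[] args) {
        // 256 is the bytes-per-reducer value set in the test case above;
        // the input size and the reducer cap are hypothetical.
        System.out.println(estimate(5000L, 256L, 1009));
    }
}
```

With a per-reducer budget as small as 256 bytes, any input larger than that budget yields more than one reducer under this kind of estimate, which is why a Sort parallelism of 1 indicates that the estimate was never applied.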

The logical plan of the case:

   TS[0]-SEL[1]-RS[2]-SEL[3]-SEL[4]-FS[5]
                            -SEL[6]-FS[7]
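
The effect of the fan-out can be sketched with a toy operator tree. This is a minimal model under stated assumptions, not Hive's code: `Op` is a hypothetical stand-in for Hive's operator classes, and `singleChainBelow` only mimics the part of the check described above, walking the operators below the ReduceSink and giving up at the first operator that has more than one child. That covers a fan-out directly under RS[2] as well as one further down the chain, as in the plan drawing.

```java
import java.util.ArrayList;
import java.util.List;

public class MultiInsertPlanSketch {

    // Hypothetical stand-in for an operator in the logical plan.
    static final class Op {
        final String name;
        final List<Op> children = new ArrayList<>();

        Op(String name) {
            this.name = name;
        }

        // Attach a child and return it, so chains can be built fluently.
        Op then(Op child) {
            children.add(child);
            return child;
        }
    }

    // Mimics the described pre-fix behavior: returns false as soon as
    // any operator below the ReduceSink has more than one child.
    static boolean singleChainBelow(Op reduceSink) {
        List<Op> children = reduceSink.children;
        while (!children.isEmpty()) {
            if (children.size() > 1) {
                return false;
            }
            children = children.get(0).children;
        }
        return true;
    }

    // Builds the plan as drawn above and returns its ReduceSink, RS[2].
    static Op buildMultiInsertPlan() {
        Op ts = new Op("TS[0]");
        Op rs = ts.then(new Op("SEL[1]")).then(new Op("RS[2]"));
        Op sel3 = rs.then(new Op("SEL[3]"));
        sel3.then(new Op("SEL[4]")).then(new Op("FS[5]"));
        sel3.then(new Op("SEL[6]")).then(new Op("FS[7]"));
        return rs;
    }

    public static void main(String[] args) {
        // The multi-insert plan branches below the ReduceSink.
        System.out.println(singleChainBelow(buildMultiInsertPlan()));
    }
}
```

In this model a single-insert plan (one unbroken chain from RS to FS) passes the check, while the multi-insert plan does not, so the reducer estimate is never applied to it.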

Attachments

HIVE-16600.1.patch | 08/May/17 03:16 | 2 kB | liyunzhang
HIVE-16600.10.patch | 06/Jun/17 06:47 | 56 kB | liyunzhang
HIVE-16600.11.patch | 12/Jun/17 08:06 | 14 kB | liyunzhang
HIVE-16600.12.patch | 13/Jun/17 02:53 | 37 kB | liyunzhang
HIVE-16600.13.patch | 14/Jun/17 07:01 | 37 kB | liyunzhang
HIVE-16600.2.patch | 10/May/17 04:59 | 32 kB | liyunzhang
HIVE-16600.3.patch | 11/May/17 08:24 | 30 kB | liyunzhang
HIVE-16600.4.patch | 15/May/17 02:38 | 30 kB | liyunzhang
HIVE-16600.5.patch | 16/May/17 05:44 | 20 kB | liyunzhang
HIVE-16600.6.patch | 18/May/17 05:07 | 43 kB | liyunzhang
HIVE-16600.7.patch | 19/May/17 08:53 | 61 kB | liyunzhang
HIVE-16600.8.patch | 24/May/17 06:37 | 61 kB | liyunzhang
HIVE-16600.9.patch | 26/May/17 08:55 | 54 kB | liyunzhang
mr.explain | 12/May/17 16:49 | 14 kB | liyunzhang
mr.explain.log.HIVE-16600 | 11/May/17 07:14 | 61 kB | liyunzhang
Node.java | 06/Jun/17 06:47 | 1 kB | liyunzhang
TestSetSparkReduceParallelism_MultiInsertCase.java | 06/Jun/17 06:47 | 14 kB | liyunzhang

Issue Links

links to

review board

People

Assignee: liyunzhang

Reporter: liyunzhang

Votes: 0

Watchers: 7

Dates

Created: 05/May/17 21:26

Updated: 22/May/18 23:59

Resolved: 14/Jun/17 08:52