HIVE-8920: IOContext problem with multiple MapWorks cloned for multi-insert [Spark Branch]
Sub-task of HIVE-7292: Hive on Spark


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: spark-branch
    • Fix Version/s: 1.1.0
    • Component/s: Spark
    • Labels: None

    Description

      The following query will not work:

      from (select * from table0 union all select * from table1) s
      insert overwrite table table3 select s.x, count(1) group by s.x
      insert overwrite table table4 select s.y, count(1) group by s.y;
      

      Currently, the plan for this query, before SplitSparkWorkResolver, looks like below:

         M1    M2
           \  / \
            U3   R5
            |
            R4
      

      SplitSparkWorkResolver#splitBaseWork assumes that the childWork is always a ReduceWork, but in this plan M2's childWork can be the UnionWork U3, so the code fails, as the sketch below illustrates.
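
      As an illustration only, here is a minimal, self-contained sketch of that failing assumption. The stand-in classes and the method body below are simplified for the example and are not the actual Hive code:

      // Stand-in types for illustration; the real classes live in
      // org.apache.hadoop.hive.ql.plan and carry far more state.
      abstract class BaseWork {
        final String name;
        BaseWork(String name) { this.name = name; }
      }
      class MapWork extends BaseWork { MapWork(String n) { super(n); } }
      class ReduceWork extends BaseWork { ReduceWork(String n) { super(n); } }
      class UnionWork extends BaseWork { UnionWork(String n) { super(n); } }

      public class SplitSketch {
        // Mirrors the problematic assumption: casting every child work to
        // ReduceWork throws ClassCastException when the child is a UnionWork.
        static void splitBaseWork(BaseWork parent, BaseWork child) {
          ReduceWork reduceChild = (ReduceWork) child;  // fails for UnionWork
          System.out.println(parent.name + " -> " + reduceChild.name);
        }

        public static void main(String[] args) {
          splitBaseWork(new MapWork("M2"), new ReduceWork("R5"));  // fine
          splitBaseWork(new MapWork("M2"), new UnionWork("U3"));   // ClassCastException
        }
      }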

      HIVE-9041 partially addressed the problem by removing the union task. However, it is still necessary to clone M1 and M2 to support multi-insert. Because M1 and M2 can run in a single JVM, the original solution of storing a single global IOContext will not work: M1 and M2 have different IOContexts, and both need to be stored.
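
      As a minimal sketch of that direction, assuming a registry keyed by the work name so that each cloned MapWork gets its own context (the class name, key scheme, and fields below are illustrative assumptions, not the API of the actual patch):

      import java.util.concurrent.ConcurrentHashMap;

      // Illustrative stand-in for Hive's IOContext: per-input read state
      // that record readers update as they advance through a split.
      class IOContext {
        long currentBlockStart;
      }

      public class IOContextSketch {
        // Instead of one global IOContext per JVM, keep one per work name
        // so cloned MapWorks (M1, M2) running in the same executor JVM
        // each see their own state.
        private static final ConcurrentHashMap<String, IOContext> CONTEXTS =
            new ConcurrentHashMap<>();

        static IOContext get(String workName) {
          return CONTEXTS.computeIfAbsent(workName, k -> new IOContext());
        }

        public static void main(String[] args) {
          IOContext m1 = get("M1");
          IOContext m2 = get("M2");
          System.out.println(m1 != m2);  // true: distinct contexts in one JVM
        }
      }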

      Attachments

        1. HIVE-8920.4-spark.patch
          0.4 kB
          Xuefu Zhang
        2. HIVE-8920.3-spark.patch
          33 kB
          Xuefu Zhang
        3. HIVE-8920.2-spark.patch
          36 kB
          Xuefu Zhang
        4. HIVE-8920.1-spark.patch
          36 kB
          Xuefu Zhang


            People

              Assignee: Xuefu Zhang (xuefuz)
              Reporter: Chao Sun (csun)
              Votes: 0
              Watchers: 2
