[HIVE-7370] Initial ground work for Hive on Spark [Spark branch] - ASF JIRA

Details

Type: Task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: spark-branch
Component/s: Spark
Labels: None

Description

Contribute PoC code for Hive on Spark as the groundwork for subsequent tasks. Although the code contains hacks and is poorly organized, it will change, and, more importantly, it allows multiple people to work on different components concurrently.

With this, simple queries such as "select col from tab where ..." and "select grp, avg(val) from tab where ... group by grp" can be executed on Spark.
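As a rough illustration of why a group-by aggregate like the one above fits a map/reduce style engine, the sketch below imitates the two phases in plain Python: a map-side step that filters rows and emits partial (sum, count) pairs per group, and a reduce-side step that merges them and finishes the average. This is a conceptual sketch only. The function names and sample rows are hypothetical, and nothing here reflects how HiveMapFunction or HiveReduceFunction are actually implemented in the patch.

```python
from collections import defaultdict


def map_phase(rows, predicate):
    """Filter rows (the WHERE clause) and emit (grp, (partial_sum, partial_count))."""
    for grp, val in rows:
        if predicate(grp, val):
            yield grp, (val, 1)


def reduce_phase(pairs):
    """Merge partial aggregates per group, then finish avg = sum / count."""
    totals = defaultdict(lambda: [0.0, 0])
    for grp, (partial_sum, partial_count) in pairs:
        totals[grp][0] += partial_sum
        totals[grp][1] += partial_count
    return {grp: s / c for grp, (s, c) in totals.items()}


def group_avg(rows, predicate=lambda grp, val: True):
    """Conceptual equivalent of: select grp, avg(val) from tab where ... group by grp."""
    return reduce_phase(map_phase(rows, predicate))


if __name__ == "__main__":
    # Hypothetical sample rows of (grp, val).
    sample = [("a", 1), ("a", 3), ("b", 10)]
    print(group_avg(sample))
```

Carrying (sum, count) pairs instead of a running average is what lets partial results from different partitions be merged in any order.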

Contents of the patch:
1. Code path for an additional execution engine.
2. Essential classes such as SparkWork, SparkTask, SparkCompiler, HiveMapFunction, HiveReduceFunction, SparkClient, etc.
3. Some code changes to existing classes.
4. Build infrastructure.
5. Utility classes.

To try running Hive on Spark, for now you need to:
1. Build Spark 1.0.0 yourself with the attached patch applied.
2. Invoke the Hive client with the environment variable MASTER set to the Spark master URL.
3. Run "set hive.execution.engine=spark".
4. Execute supported queries.
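The client-side steps above can be sketched as a small Python helper that prepares the environment and command line without running anything. This is only a sketch under stated assumptions: the `hive` executable is assumed to be on the PATH, the master URL `spark://localhost:7077` is a placeholder, and the helper name `build_hive_invocation` is hypothetical. It relies on the Hive CLI's `-e` option, which runs a quoted query string.

```python
import os


def build_hive_invocation(master_url, query, base_env=None):
    """Return (argv, env) for running one query with the Spark engine selected."""
    # Copy the environment so the caller's process is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    # The Hive client picks up the Spark master URL from MASTER.
    env["MASTER"] = master_url
    # Select the Spark engine in-session, then run the query.
    script = "set hive.execution.engine=spark; " + query
    argv = ["hive", "-e", script]
    return argv, env


if __name__ == "__main__":
    # Placeholder master URL and query; adjust for a real cluster.
    argv, env = build_hive_invocation("spark://localhost:7077", "select col from tab")
    print(argv, env["MASTER"])
    # To actually launch it (requires the patched Spark build and Hive):
    # import subprocess; subprocess.run(argv, env=env, check=True)
```

Returning the command and environment rather than executing them keeps the sketch usable on a machine without Hive installed.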

NO PRECOMMIT TESTS. This is for spark branch only.

Attachments

spark_1.0.0.patch, 09/Jul/14 14:06, 3 kB, Xuefu Zhang
HIVE-7370.patch, 09/Jul/14 03:36, 42 kB, Xuefu Zhang

Issue Links

is part of: HIVE-7292 Hive on Spark (Resolved)

People

Assignee: Xuefu Zhang

Reporter: Xuefu Zhang

Votes: 0

Watchers: 7

Dates

Created: 09/Jul/14 02:06

Updated: 09/Jul/14 14:17

Resolved: 09/Jul/14 14:17