[HIVE-7371] Identify a minimum set of JARs needed to ship to Spark cluster [Spark Branch] - ASF JIRA

Details

Type: Task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.1.0
Component/s: Spark
Labels: None

Description

Currently, the Spark client ships all Hive JARs, including the JARs that Hive itself depends on, to the Spark cluster whenever a query is executed by Spark. This is inefficient and can cause library conflicts. Ideally, only a minimum set of JARs should be shipped. This task is to identify that set.

We should learn from the current MR setup, for which I assume only the hive-exec JAR is shipped to the MR cluster.

We also need to ensure that user-supplied JARs are shipped to the Spark cluster, in the same way as MR does.
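
The selection described above could be sketched roughly as follows. This is only an illustration of the idea (keep the hive-exec JAR, drop the other Hive JARs, append the user-supplied JARs); it is not taken from the attached patches. The class name `MinimalJarSet`, the method `select`, the comma-separated format of the user JAR list, and the example paths are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, for illustration only: not part of Hive or Spark.
final class MinimalJarSet {

    // Returns the hive-exec JAR(s) found on the classpath followed by the
    // user-supplied JARs, without duplicates and in a stable order.
    // Assumption: paths use '/' separators and userJarsCsv is comma-separated.
    static List<String> select(List<String> classpathJars, String userJarsCsv) {
        Set<String> result = new LinkedHashSet<>();

        // Keep only hive-exec from the Hive classpath.
        for (String jar : classpathJars) {
            String name = jar.substring(jar.lastIndexOf('/') + 1);
            if (name.startsWith("hive-exec") && name.endsWith(".jar")) {
                result.add(jar);
            }
        }

        // Always append the JARs the user supplied explicitly.
        if (userJarsCsv != null) {
            for (String jar : userJarsCsv.split(",")) {
                String trimmed = jar.trim();
                if (!trimmed.isEmpty()) {
                    result.add(trimmed);
                }
            }
        }
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        List<String> classpath = new ArrayList<>();
        classpath.add("/opt/hive/lib/hive-exec.jar");      // example path
        classpath.add("/opt/hive/lib/hive-metastore.jar"); // example path
        for (String jar : select(classpath, "/home/user/my-udfs.jar")) {
            System.out.println(jar);
        }
    }
}
```

The resulting list would then be handed to whatever mechanism the Spark client uses to distribute JARs; that step is left out here because it depends on how the client is implemented.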

NO PRECOMMIT TESTS. This is for spark-branch only.

Attachments

- HIVE-7371-Spark.1.patch, 17/Jul/14 06:29, 7 kB, Chengxiang Li
- HIVE-7371-Spark.2.patch, 17/Jul/14 08:12, 7 kB, Chengxiang Li
- HIVE-7371-Spark.3.patch, 17/Jul/14 19:41, 7 kB, Xuefu Zhang

Issue Links

is depended upon by: HIVE-7437 Check if servlet-api and jetty module in Spark library are an issue for hive-spark integration [Spark Branch] (Resolved)

is part of: HIVE-7292 Hive on Spark (Resolved)

People

Assignee: Chengxiang Li

Reporter: Xuefu Zhang

Votes: 0

Watchers: 3

Dates

Created: 09/Jul/14 15:01

Updated: 29/May/15 02:29

Resolved: 18/Jul/14 18:28