[HIVE-9153] Perf enhancement on CombineHiveInputFormat and HiveInputFormat - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.1.0
Component/s: Spark
Labels: None

Description

The default InputFormat is CombineHiveInputFormat, so Hive on Spark (HOS) uses it; Tez, however, uses HiveInputFormat. Since tasks are relatively cheap in Spark, it might make sense for HOS to use HiveInputFormat as well. We should evaluate this on a query with many input splits, such as select count(*) from store_sales where something is not null.
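One way the evaluation described above might be run is sketched below. It assumes the input format is switched per session through the hive.input.format property and that the session runs on the Spark engine; the column name something is the placeholder from the description, not a real store_sales column, and the actual benchmark setup used for this issue is not stated here.

```sql
-- Sketch only: compare the two input formats on the same query.
-- Assumes a Hive on Spark session.
set hive.execution.engine=spark;

-- Baseline: the default, which combines input splits into fewer tasks.
set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
-- "something" is a placeholder column name taken from the description.
select count(*) from store_sales where something is not null;

-- Candidate: no split combining, matching what Tez uses.
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
select count(*) from store_sales where something is not null;
```

Comparing the task count and wall-clock time of the two runs would indicate whether the extra tasks are cheap enough in Spark to justify the switch.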

Attachments

screenshot.PNG, 18/Dec/14 12:44, 104 kB, Rui Li
HIVE-9153.1-spark.patch, 25/Dec/14 12:34, 6 kB, Rui Li
HIVE-9153.1-spark.patch, 25/Dec/14 18:02, 6 kB, Brock Noland
HIVE-9153.2.patch, 26/Dec/14 01:36, 6 kB, Rui Li
HIVE-9153.3.patch, 26/Dec/14 03:08, 4 kB, Rui Li

Issue Links

is related to

SPARK-4921 TaskSetManager mistakenly returns PROCESS_LOCAL for NO_PREF tasks (Resolved)

People

Assignee: Rui Li
Reporter: Brock Noland
Votes: 0
Watchers: 6

Dates

Created: 17/Dec/14 22:43
Updated: 20/Dec/16 05:25
Resolved: 29/Dec/14 17:45