Hive makes use of MapReduce counters for statistics and possibly for other purposes. For Hive on Spark, we should achieve the same functionality using Spark's accumulators.
Hive has also traditionally collected metrics from MapReduce jobs. Spark very likely publishes a different set of metrics which, if made available, would help users gain insight into their Spark jobs. Thus, we should obtain these metrics and make them available as we do for MapReduce.
This task therefore includes:
- identify Hive's existing functionality w.r.t. counters, statistics, and metrics;
- design and implement the same functionality in Spark.
Please refer to the design document for more information: https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark#HiveonSpark-CountersandMetrics