[MAPREDUCE-7208] Tuning TaskRuntimeEstimator - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 3.3.0, 3.1.4, 3.2.2, 2.10.1
Component/s: None
Labels:
None

Description

By default, MR uses LegacyTaskRuntimeEstimator to estimate task runtime. This estimator does not adjust dynamically to the progress rate of the tasks. On the other hand, the existing alternative, ExponentiallySmoothedTaskRuntimeEstimator, behaves unpredictably.

There are several dimensions to improve the exponential implementation:

- Exponential smoothing needs a warmup period. Otherwise, the estimate will be skewed by the initial values.
- Using a single smoothing factor (lambda) does not work well for all tasks. To smooth well across the majority of tasks, the estimator needs a range within which it can adjust the smoothing factor dynamically, based on the history of the task's progress.
- Design-wise, it is better to separate the statistical model from the MR interface. We need a way to evaluate estimators statistically without running MR. For example, an estimator can be evaluated as a black box by feeding it a stream of raw data and testing the accuracy of the generated stream of estimates.
- The exponential estimator speculates frequently, yet it fails to detect slowing tasks. As a result, a taskAttempt that makes no progress does not trigger a new speculation.
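
The points above can be illustrated with a minimal, self-contained sketch. It is not the code from the attached patches: the class name `SmoothedRateSketch`, its parameters (`lambda`, `warmupSamples`, `stallMillis`), and the error metric are all hypothetical choices made for illustration. It assumes progress is reported as a fraction in [0, 1] with millisecond timestamps, averages the first few rate samples as a warmup, applies simple exponential smoothing afterwards, flags an attempt whose progress has not advanced for a while, and can be scored as a black box against a recorded progress stream.

```java
/** Hypothetical sketch of a smoothed progress-rate estimator; not the Hadoop implementation. */
public class SmoothedRateSketch {
    private final double lambda;      // smoothing factor in (0, 1]; assumed fixed here
    private final int warmupSamples;  // number of rate samples averaged before smoothing starts
    private final long stallMillis;   // time without progress after which the attempt counts as stalled
    private int samples = 0;
    private double rate = 0.0;        // smoothed progress per millisecond
    private double lastProgress = 0.0;
    private long lastTime;
    private long lastAdvanceTime;

    public SmoothedRateSketch(double lambda, int warmupSamples, long stallMillis, long startTime) {
        this.lambda = lambda;
        this.warmupSamples = warmupSamples;
        this.stallMillis = stallMillis;
        this.lastTime = startTime;
        this.lastAdvanceTime = startTime;
    }

    /** Feeds one observation: a timestamp and the progress fraction reported at that time. */
    public void update(long now, double progress) {
        long dt = now - lastTime;
        if (dt <= 0) {
            return; // ignore out-of-order or duplicate timestamps
        }
        double raw = (progress - lastProgress) / dt;
        samples++;
        if (samples <= warmupSamples) {
            rate += (raw - rate) / samples;            // running mean during warmup
        } else {
            rate = lambda * raw + (1 - lambda) * rate; // simple exponential smoothing
        }
        if (progress > lastProgress) {
            lastAdvanceTime = now;
        }
        lastProgress = progress;
        lastTime = now;
    }

    /** Returns the estimated remaining time in ms, or -1 while no usable estimate exists. */
    public double estimateRemainingMillis() {
        if (samples < warmupSamples || rate <= 0) {
            return -1;
        }
        return (1.0 - lastProgress) / rate;
    }

    /** True when progress has not advanced for at least stallMillis. */
    public boolean isStalled(long now) {
        return now - lastAdvanceTime >= stallMillis;
    }

    /** Black-box evaluation: mean absolute error of the predicted finish time over a raw stream. */
    public static double meanAbsoluteError(SmoothedRateSketch est, long[] times,
                                           double[] progress, long actualFinish) {
        double sum = 0.0;
        int n = 0;
        for (int i = 0; i < times.length; i++) {
            est.update(times[i], progress[i]);
            double remaining = est.estimateRemainingMillis();
            if (remaining >= 0) {
                sum += Math.abs(times[i] + remaining - actualFinish);
                n++;
            }
        }
        return n == 0 ? Double.NaN : sum / n;
    }

    public static void main(String[] args) {
        long[] t = {1000, 2000, 3000, 4000};
        double[] p = {0.1, 0.2, 0.3, 0.4};
        SmoothedRateSketch est = new SmoothedRateSketch(0.5, 3, 3000, 0);
        System.out.println("MAE (ms): " + meanAbsoluteError(est, t, p, 10000));
        System.out.println("stalled at t=9000? " + est.isStalled(9000));
    }
}
```

Because `meanAbsoluteError` only consumes arrays of timestamps and progress values, the statistical model can be tested without starting an MR job. Adjusting `lambda` dynamically, as the second point proposes, is left out of this sketch.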

The attached file smoothing-exponential.md describes how the simple exponential smoothing factor works.

Attachments

- smoothing-exponential.md, 29/May/19 20:01, 5 kB, Ahmed Hussein
- MAPREDUCE-7208.001.patch, 30/May/19 13:36, 61 kB, Ahmed Hussein
- MAPREDUCE-7208.002.patch, 29/Oct/19 20:23, 83 kB, Ahmed Hussein
- MAPREDUCE-7208.003.patch, 30/Oct/19 16:09, 38 kB, Ahmed Hussein
- MAPREDUCE-7208.004.patch, 30/Oct/19 17:15, 66 kB, Ahmed Hussein
- MAPREDUCE-7208-branch-2.10.001.patch, 04/Nov/19 20:31, 66 kB, Ahmed Hussein
- MAPREDUCE-7208-branch-2.10.002.patch, 05/Nov/19 16:44, 68 kB, Ahmed Hussein

Issue Links

is a parent of

MAPREDUCE-7252 Handling 0 progress in SimpleExponential task runtime estimator

Resolved

relates to

TEZ-4106 Add Exponential Smooth RuntimeEstimator to the speculator

Resolved

People

Assignee: Ahmed Hussein

Reporter: Ahmed Hussein

Votes: 0

Watchers: 5

Dates

Created: 29/May/19 20:08

Updated: 19/Dec/19 16:55

Resolved: 05/Nov/19 21:07