[SPARK-8882] A New Receiver Scheduling Mechanism to solve unbalanced receivers - ASF JIRA

XML

Word

Printable

JSON

Details

Type: New Feature
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.5.0
Component/s: DStreams
Labels:
None

Description

There are some problems in the current mechanism:

If a task fails more than “spark.task.maxFailures” (default: 4) times, the job will fail. For a long-running Streaming applications, it’s possible that a Receiver task fails more than 4 times because of Executor lost.
When an executor is lost, the Receiver tasks on it will be rescheduled. However, because there may be many Spark jobs at the same time, it’s possible that TaskScheduler cannot schedule them to make Receivers be distributed evenly.

To solve such limitations, we need to change the receiver scheduling mechanism. Here is the design doc: https://docs.google.com/document/d/1ZsoRvHjpISPrDmSjsGzuSu8UjwgbtmoCTzmhgTurHJw/edit?usp=sharing

Attachments

Issue Links

supercedes

SPARK-3283 Receivers sometimes do not get spread out to multiple nodes

Resolved

SPARK-7942 Receiver's life cycle is inconsistent with streaming job.

Resolved

links to

[Github] Pull Request #7276 (zsxwing)

Activity

People

Assignee:: Shixiong Zhu

Reporter:: Shixiong Zhu

Votes:: 0 Vote for this issue

Watchers:: 4 Start watching this issue

Dates

Created:: 08/Jul/15 00:20

Updated:: 23/Sep/15 02:42

Resolved:: 28/Jul/15 01:00