[SPARK-7398] Add back-pressure to Spark Streaming (umbrella JIRA) - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Critical
Resolution: Incomplete
Affects Version/s: 1.3.1
Fix Version/s: None
Component/s: DStreams
Labels:
- bulk-closed
- streams

Description

Spark Streaming has trouble dealing with situations where
batch processing time > batch interval
that is, where input data arrives faster than Spark can remove it from the queue.

If this throughput is sustained for long enough, the situation becomes unstable: the backlog keeps growing until the memory of the Receiver's Executor overflows.
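
The instability can be illustrated with a small back-of-the-envelope model. This is only a sketch under simplifying assumptions: the function name `backlog_after`, the constant ingest and processing rates, and the sample numbers are hypothetical and are not taken from Spark or from the design docs.

```python
def backlog_after(batches, batch_interval_s, ingest_rate, process_rate):
    """Return the unprocessed record count at the end of each batch interval.

    ingest_rate and process_rate are in records per second (hypothetical,
    constant values chosen only for illustration).
    """
    backlog = 0.0
    history = []
    for _ in range(batches):
        # Records received during this interval are queued.
        backlog += ingest_rate * batch_interval_s
        # At most process_rate * interval records can be drained per interval.
        backlog -= min(backlog, process_rate * batch_interval_s)
        history.append(backlog)
    return history


# Ingest outpaces processing: the backlog grows by 200 records every interval.
overloaded = backlog_after(3, 1.0, ingest_rate=1000, process_rate=800)

# Processing keeps up: the backlog stays at zero.
stable = backlog_after(3, 1.0, ingest_rate=500, process_rate=800)
```

In the overloaded case the queue grows linearly and never recovers without intervention, which is the condition this issue proposes to address by signalling ingestion to slow down.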

This issue aims to transmit a back-pressure signal back to data ingestion to help cope with that high throughput, in a backwards-compatible way.

The original design doc can be found here:
https://docs.google.com/document/d/1ZhiP_yBHcbjifz8nJEyPJpHqxB1FT6s8-Zk7sAfayQw/edit?usp=sharing

The second design doc, focusing on the first sub-task (without all the background info, and more centered on the implementation) can be found here:
https://docs.google.com/document/d/1ls_g5fFmfbbSTIfQQpUxH56d0f3OksF567zwA00zK9E/edit?usp=sharing

Issue Links

relates to

SPARK-10420 Implementing Reactive Streams based Spark Streaming Receiver

Resolved

supersedes

SPARK-6691 Abstract and add a dynamic RateLimiter for Spark Streaming

Resolved

Sub-Tasks

#	Summary	Status	Assignee
1.	Implement a mechanism to send a new rate from the driver to the block generator	Resolved	Dragos Dascalita Haut
2.	Define the RateEstimator interface, and implement the ReceiverRateController	Resolved	Dragos Dascalita Haut
3.	Implement a PIDRateEstimator	Resolved	Dragos Dascalita Haut
4.	Implement the DirectKafkaRateController	Resolved	Dragos Dascalita Haut
5.	Make all BlockGenerators subscribe to rate limit updates	Resolved	Tathagata Das
6.	Handle a couple of corner cases in the PID rate estimator	Resolved	Tathagata Das
7.	BlockGenerator lock structure can cause lock starvation of the block updating thread	Resolved	Tathagata Das
8.	Rename the SparkConf property to spark.streaming.backpressure.{enable --> enabled}	Resolved	Tathagata Das
9.	Provide pluggable Congestion Strategies to deal with Streaming load	Resolved	Unassigned
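
To make the idea behind a PID-based rate estimator more concrete, here is a minimal generic sketch of a PID controller that turns batch statistics into a new ingestion rate. It is not Spark's PIDRateEstimator: the class name `PidRateSketch`, the method signature, the gain defaults, and the error terms are assumptions chosen for illustration.

```python
class PidRateSketch:
    """Hypothetical PID controller producing an ingestion rate (records/s)."""

    def __init__(self, batch_interval_s, kp=1.0, ki=0.2, kd=0.0, min_rate=100.0):
        self.batch_interval_s = batch_interval_s
        self.kp, self.ki, self.kd = kp, ki, kd
        self.min_rate = min_rate
        self.latest_rate = None   # last rate we published
        self.latest_error = 0.0   # last proportional error
        self.latest_time = None   # timestamp of the last update, in seconds

    def compute(self, time_s, num_elements, processing_delay_s, scheduling_delay_s):
        """Return a new rate, or None if no estimate can be made yet."""
        if num_elements <= 0 or processing_delay_s <= 0:
            return None
        processing_rate = num_elements / processing_delay_s
        if self.latest_rate is None:
            # First observation only seeds the controller state.
            self.latest_rate = processing_rate
            self.latest_time = time_s
            return None
        if time_s <= self.latest_time:
            return None
        dt = time_s - self.latest_time
        # Proportional term: how far the published rate is above what was processed.
        error = self.latest_rate - processing_rate
        # Integral-like term: backlog implied by the time batches waited in the queue.
        historical_error = scheduling_delay_s * processing_rate / self.batch_interval_s
        # Derivative term: how quickly the error is changing.
        d_error = (error - self.latest_error) / dt
        new_rate = max(
            self.latest_rate
            - self.kp * error
            - self.ki * historical_error
            - self.kd * d_error,
            self.min_rate,
        )
        self.latest_rate, self.latest_error, self.latest_time = new_rate, error, time_s
        return new_rate


est = PidRateSketch(batch_interval_s=1.0)
seed = est.compute(0.0, 1000, 1.0, 0.0)     # seeds state, no estimate yet
first = est.compute(1.0, 800, 1.0, 0.0)     # processing slowed to 800 records/s
second = est.compute(2.0, 800, 1.0, 0.5)    # batches also waited 0.5 s in the queue
```

The rate drops toward the observed processing rate, and drops further when scheduling delay indicates a backlog; the floor `min_rate` keeps ingestion from stalling entirely.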

People

Assignee: Tathagata Das

Reporter: François Garillot

Shepherd: Tathagata Das

Votes: 14

Watchers: 32

Dates

Created: 06/May/15 12:48

Updated: 21/May/19 04:32

Resolved: 21/May/19 04:32