SPARK-3129

Prevent data loss in Spark Streaming on driver failure using Write Ahead Logs

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.0.0, 1.0.1, 1.0.2, 1.0.3, 1.1.0
    • Fix Version/s: 1.2.0
    • Component/s: DStreams
    • Labels:
      None
    • Target Version/s:

      Description

      Spark Streaming can lose small amounts of data when the driver goes down and the sending system cannot re-send the data (or the data has already expired on the sender side). This currently affects all receivers.

      The solution we propose is to reliably store all the received data into HDFS. This will allow the data to persist through driver failures, and therefore can be processed when the driver gets restarted.

      The high level design doc for this feature is given here.
      https://docs.google.com/document/d/1vTCB5qVfyxQPlHuv8rit9-zjdttlgaSrMgfCDQlCJIM/edit?usp=sharing

      This major task has been divided into sub-tasks:

      • Implementing a write ahead log management system that can manage rolling write ahead logs - write to the log, recover on failure, and clean up old logs
      • Implementing an HDFS-backed block RDD that can read data either from Spark's BlockManager or from HDFS files
      • Implementing a ReceivedBlockHandler interface that abstracts out the functionality of saving received blocks
      • Implementing a ReceivedBlockTracker and other associated changes in the driver that allow the metadata of received blocks and block-to-batch allocations to be recovered upon driver restart
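The first sub-task above can be sketched conceptually. Below is a minimal, hypothetical rolling write ahead log in Python - not Spark's actual Scala implementation; the class name, segment naming, and use of a local `fsync` as a stand-in for HDFS hflush/hsync are all invented for illustration. It length-prefixes records so recovery knows record boundaries, rolls to a new segment periodically, replays all segments on restart, and deletes old segments.

```python
import os
import struct
import time

class RollingWriteAheadLog:
    """Hypothetical sketch of a rolling WAL: append records to the current
    segment, roll to a new segment periodically, recover all records on
    restart, and clean up segments older than a threshold."""

    def __init__(self, log_dir, roll_interval_secs=60):
        self.log_dir = log_dir
        self.roll_interval = roll_interval_secs
        os.makedirs(log_dir, exist_ok=True)
        self._open_segment()

    def _open_segment(self):
        self.segment_start = time.time()
        path = os.path.join(self.log_dir, "log-%d" % int(self.segment_start * 1000))
        self.segment = open(path, "ab")

    def write(self, record: bytes):
        if time.time() - self.segment_start > self.roll_interval:
            self.segment.close()
            self._open_segment()
        # Length-prefixed record so recovery knows where each one ends.
        self.segment.write(struct.pack(">I", len(record)) + record)
        self.segment.flush()
        os.fsync(self.segment.fileno())  # local stand-in for HDFS hflush/hsync

    def recover(self):
        """Replay every record from every segment, oldest first."""
        records = []
        for name in sorted(os.listdir(self.log_dir)):
            with open(os.path.join(self.log_dir, name), "rb") as f:
                while True:
                    header = f.read(4)
                    if len(header) < 4:
                        break
                    (length,) = struct.unpack(">I", header)
                    records.append(f.read(length))
        return records

    def clean_old(self, before_millis):
        """Delete segments whose start timestamp is older than the cutoff."""
        for name in os.listdir(self.log_dir):
            if int(name.split("-")[1]) < before_millis:
                os.remove(os.path.join(self.log_dir, name))
```

The per-record sync is what makes writes durable (and slow); a real implementation would batch records before syncing.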
      Attachments

      • SecurityFix.diff (1 kB, Hari Shreedharan)

        Issue Links

          Activity

          hshreedharan Hari Shreedharan added a comment -

          This doc is an early list of fixes. I may have missed some, and/or there may be better ways to do this. Please post any feedback you have! Thanks!

          tgraves Thomas Graves added a comment -

          A couple of random thoughts on this for YARN. YARN added this ability in 2.4.0, and you have to tell it you want it in the application submission context. So you will have to handle other versions of YARN properly where it's not supported.
          I believe YARN will tell you what nodes you have containers already running on, but you'll have to figure out details about ports, etc. I haven't looked at all the specifics.

          You'll have to figure out how to do authentication properly. This gets forgotten many times.

          I think we should flesh out more of the high-level design concerns between YARN/standalone/Mesos, and on YARN the client/cluster modes.

          hshreedharan Hari Shreedharan added a comment -

          The way the driver "finds" the executors would be common for all the scheduling systems (it should really be independent of the scheduling/deployment). I agree about the auth part too.

          Tathagata Das mentioned there is something similar already in standalone. I'd like to concentrate on YARN - if someone else is interested in Mesos please feel free to take it up!

          I posted an initial patch for Client mode to simply keep the executors around (though it is not exposed via SparkSubmit which we can do once we can get the whole series of patches in).

          For YARN mode, does that mean the method calls have to be via reflection? I'd assume so.

          The reason I mentioned doing it via HDFS and then pinging the executors is to make it independent of YARN/Mesos/Standalone - we can just do it via StreamingContext and make it completely independent of the backend on which Spark is running (I am not even sure this should be a valid option for non-streaming cases, as it does not really add any value elsewhere).

          tgraves Thomas Graves added a comment -

          Yes that probably means using reflection.

          I think having a file based one makes sense so we don't have other dependencies if you don't need them. You can always make it more complex and use zookeeper for those who want to install it. For yarn you could save it in the .sparkStaging directories along with the application jars that way it knows where to find it.

          You still have the question of how authentication works. This would require either the secret key being stored somewhere in hdfs also (and protected) or some other way for executors to allow connections and figure out this is a restart.
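Thomas's point about protecting a stored secret can be illustrated with a small sketch. This uses the local filesystem as a stand-in for HDFS (on real HDFS you would set restrictive permissions through the FileSystem API instead); the function names are invented for illustration.

```python
import os

def write_protected_secret(path, secret: bytes):
    """Write the shared secret so only the owning user can read it,
    analogous to protecting a file in the job's staging directory."""
    # 0o600: read/write for owner only; group and others get nothing.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, secret)
    finally:
        os.close(fd)

def read_secret(path) -> bytes:
    """Read the secret back, e.g. when a restarted driver needs it."""
    with open(path, "rb") as f:
        return f.read()
```

This only covers filesystem-level protection; as the comment notes, the harder question is how executors decide to accept a connection from a restarted driver at all.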

          jerryshao Saisai Shao added a comment -

          Hi Hari, I have some high level questions about this:

          1. In the design doc, you mentioned "Once the RDD is generated, the RDD is checkpointed to HDFS - at which point it is fully recoverable". I'm not sure whether you checkpoint only the metadata of the RDD or also the data? I think RDD checkpointing is a little expensive for each batch if the batch duration is quite short.
          2. If we keep executors alive when the driver dies, do we still need to keep receivers receiving data from the external source? If so, I think there may potentially be some problems: firstly, memory usage will accumulate since no data is consumed; secondly, when the driver comes back, how do we balance the data processing priority? Old data needs to be processed first, which will delay processing of newly arriving data and lead to unwanted issues if latency is larger than the batch duration.
          3. In some scenarios we need to operate on a DStream together with an RDD (like joining real-time data with a history log). Normally the RDD is cached in the BlockManager's memory; I think we also need to recover this RDD's metadata, not only the streaming data, if we need to recover the processing.

          Maybe there are many other details we need to think about, because driver HA is quite complex. Please correct me if I have misunderstood something. Thanks a lot.

          hshreedharan Hari Shreedharan added a comment -

          Thomas Graves - Thanks for the pointers. Yes, using HDFS also allows us to use the same file, with some protection, to store the keys. This is something that might need some design and discussion first.

          I will also update the PR with the reflection code.

          Saisai Shao:
          1. Today RDDs already get checkpointed at the end of every job when the runJob method gets called. Nothing is changing here. The entire graph does get checkpointed today already.
          2. No, this is something that will need to be taken care of. When the driver dies, blocks can no longer be batched into RDDs - which means generating blocks without the driver makes no sense. Also, when the driver comes back online, new receivers get created, which would start receiving the data now. The only reason the executors are being kept around is to get the data in their memory - any processing/receiving should be killed.
          3. Since it is an RDD, there is nothing that stops it from being recovered, right? It is recovered by the usual method of regenerating it. Only DStream data that has not been converted into an RDD is really lost - so getting the RDD back should not be a concern at all (of course, the cache is gone, but it can get pulled back into cache once the driver comes back up).

          jerryshao Saisai Shao added a comment -

          Hi Hari Shreedharan, one more question:

          Is your design goal trying to fix data loss caused by receiver node failure? It seems data could potentially be lost when it is only stored in the BlockGenerator, not yet in the BlockManager, when the node fails. Your design doc mainly focuses on driver failure, so what are your thoughts?

          hshreedharan Hari Shreedharan added a comment -

          Yes - my initial goal is to be able to recover all the blocks that have not been made into an RDD yet (at which point the data would be safe). There is data which may not have become a block yet (data added using the += operator). For now, I am going to call it fair game to say that we will add storeReliably(ArrayBuffer/Iterable) methods, which are the only ones that store data such that it is guaranteed to be recoverable.

          At a later stage, we could use something like a WAL on HDFS to recover even the += data, though that would affect performance.

          hshreedharan Hari Shreedharan added a comment -

          Thomas Graves - Am I correct in assuming that using Akka automatically gives the shared secret authentication if spark.authenticate is set to true - if the AM is restarted by YARN itself (since it is the same application, it theoretically has access to the same shared secret and thus should be able to communicate via Akka)?

          tgraves Thomas Graves added a comment -

          On YARN, it generates the secret automatically. In cluster mode, it does it in the ApplicationMaster; since it is generated there, it goes away when the ApplicationMaster dies. If the secret was generated on the client side and populated into the credentials in the UGI, similar to how we do tokens, then a restart of the AM in cluster mode should be able to pick it back up.

          This won't work for client mode though since the client/spark driver wouldn't have a way to get ahold of the UGI again.

          hshreedharan Hari Shreedharan added a comment -

          I am less worried about client mode, since most streaming applications would run in cluster mode. We can make this available only in the cluster mode.

          hshreedharan Hari Shreedharan added a comment -

          Looks like simply moving the code that generates the secret and sets in the UGI to the Client class should take care of that.

          jerryshao Saisai Shao added a comment -

          Hi Hari Shreedharan, thanks for your reply - is this PR (https://github.com/apache/spark/pull/1195) the one you mentioned about storeReliably()?

          As I understand it, this API aims to store a bunch of messages into the BlockManager directly to make them reliable. But for some receivers, like Kafka, socket, and others, data is injected message by message; we can't call storeReliably() for each message because of efficiency and throughput concerns, so we need to buffer the data locally up to some amount and then flush it to the BlockManager using storeReliably(). So I think data can potentially be lost while it is buffered locally. These days I have been thinking about the WAL approach; IMHO a WAL would be a better solution compared to a batched store API.
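The buffering trade-off Saisai describes - accumulate messages locally, then flush each batch with one reliable call - might look like this minimal sketch. The class name and the `reliable_sink` callable are invented; the sink stands in for a hypothetical storeReliably-style API.

```python
class BufferedReliableStore:
    """Buffer incoming messages locally and flush them to reliable storage in
    batches. Only messages still sitting in the local buffer are at risk if
    the node fails - exactly the window this discussion is about."""

    def __init__(self, flush_threshold, reliable_sink):
        self.flush_threshold = flush_threshold
        self.sink = reliable_sink  # one reliable call per batch
        self.buffer = []

    def store(self, message):
        """Cheap local append; triggers a reliable flush at the threshold."""
        self.buffer.append(message)
        if len(self.buffer) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Hand the whole batch to the reliable sink, then clear the buffer."""
        if self.buffer:
            self.sink(list(self.buffer))
            self.buffer.clear()
```

Raising the threshold improves throughput but widens the loss window, which is the tension between the batched-store API and a per-record WAL.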

          hshreedharan Hari Shreedharan added a comment -

          Hi Saisai,

          You are correct that there would be a latency increase, but that is a cost to be paid for reliability. I want to get at least the first part (storeReliably or equivalent) right before going into the WAL implementation.

          hshreedharan Hari Shreedharan added a comment -

          Thomas Graves - It looks like the SecurityManager class already persists the key to the UGI when the AM starts up the first time. A restarted AM would be able to get the key from the UGI anyway (that is true even today - where the AM has access to the key on restart anyway) - correct? So I don't know if there is a need to change anything in the security model (unless I am missing something less obvious)

          hshreedharan Hari Shreedharan added a comment -

          (I am not too familiar with how UGI gets passed around if it does at all)

          hshreedharan Hari Shreedharan added a comment -

          Correct me if I am wrong here, it looks like what I'd need to do is:

          • Create the key, add it to credentials in the client
          • Then these credentials get written out in the setUpSecurityToken method.

          When the AM restarts it has access to these credentials once again (and they get shipped to the executors when they are started by the AM).

          How I wish the Hadoop security model were simpler

          hshreedharan Hari Shreedharan added a comment -

          Looks like this should be enough, correct?

          Since the SecurityManager constructor seems to generate and set the secret to the credentials which then gets written out to the ContainerLaunchContext. This should make sure that a restarting AM has the same credentials.

          tgraves Thomas Graves added a comment -

          yes that should be enough.

          hshreedharan Hari Shreedharan added a comment -

          FYI here is the branch where I am doing development on this: https://github.com/harishreedharan/spark/tree/streaming-ha

          Off topic, in Intellij, is there a way to get the yarn/stable stuff to recognize their base classes in common so we can get autocomplete and syntax highlighting (even type awareness) to work properly?

          srowen Sean Owen added a comment -

          Hari Shreedharan Just manually add the src dir in the parent to the module in IntelliJ. It'd be cooler if it was automatic, but not hard. There have been fixes proposed but I assume this is likely to go away as a problem only when yarn-alpha goes away.

          hshreedharan Hari Shreedharan added a comment -

          Sean Owen Thanks! That fixed the issue! That saved me a whole lot of time!

          hshreedharan Hari Shreedharan added a comment -

          It looks like Akka makes it difficult to connect back to a client (in this case a BlockManagerSlaveActor) from a new server (in this case, the BlockManagerMasterActor). Since ActorRefs are serializable, I am going to serialize the ActorRefs of the BlockManagerSlaveActors to the HDFS location rather than their locations - so we can simply start up from that to connect to the slaves.

          pwendell Patrick Wendell added a comment -

          I think for this it's worth considering a design that solves H/A using simpler mechanisms (for instance, adding a write-ahead-log for received data). Also, with this proposal, what happens if both a driver and an executor fail at the same time?

          hshreedharan Hari Shreedharan added a comment -

          At least one executor containing each block needs to be available when the driver comes back up - that is pretty much it. Since each block is replicated, unless all 3 executors holding a block fail, it will not lose data.

          A WAL would be necessary to recover data which has not been pushed as blocks yet (look at the store(Any) method). Adding a persistent WAL is going to hit performance, especially if the WAL has to be durable (you'd need to do an hflush if the WAL is on HDFS, or its equivalent on any other system). So you'd be paying for persisting the data when each block is created, whereas in this case you pay only at startup and driver restarts. Even the amount of data transferred is very small, since it is just metadata. If the WAL is not durable, then there is no guarantee it would be recoverable. If the WAL is local to each executor somehow, you'd still have to send all the block info to the driver when it comes back up.

          TD and I had discussed the WAL approach and felt it is actually more complex and might affect performance more than this one. In this case, all the building blocks are already there (since we already know how to get block infos from executors which hold on to the blocks). We just need to add Akka messages to ask the executors to re-send block metadata.
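The metadata-only recovery Hari describes can be sketched as follows. This is an illustration, not Spark code: on driver restart, executors re-send only small block-metadata messages, and a block is lost only if no surviving executor holds a replica.

```python
def rebuild_block_map(expected_blocks, executor_reports):
    """Rebuild the driver-side block-location map from metadata re-sent by
    surviving executors. `expected_blocks` is the set of block IDs the
    restarted driver recovered from its checkpoint; `executor_reports` maps
    executor ID -> list of block IDs that executor still holds in memory."""
    block_locations = {b: [] for b in expected_blocks}
    for executor_id, blocks in executor_reports.items():
        for block_id in blocks:
            if block_id in block_locations:
                block_locations[block_id].append(executor_id)
    # A block is recoverable as long as at least one replica survived.
    lost = [b for b, execs in block_locations.items() if not execs]
    return block_locations, lost
```

Only IDs cross the wire here, which is why the restart-time cost stays small; the trade-off (as Matei's reply points out) is depending on replication and reconnection working correctly.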

          matei Matei Zaharia added a comment -

          Hari, have you actually benchmarked a WAL based on HDFS? Recently we've discovered a number of bugs with block replication in Spark, and this plus the complexity of making executors reconnect make the WAL a much more attractive design short-term. I don't know if you have a more detailed design doc, but the work for reconnecting executors is quite a bit more involved than what the doc here suggests. For example, you need to make sure that the new driver uses a distinct set of RDD IDs, shuffle IDs, block IDs, etc from the old one, and you need to make sure that executors find the newest driver at all times (e.g. if one restarts and then immediately fails). I actually implemented a prototype of it when we were working on Spark Streaming, but I never pushed it into mainline Spark because of these issues.

          Longer-term, I hope that a lot of this issue will be handled by better treatment of reliable input sources, in particular Kafka. If we were able to replay lost data from Kafka nicely (which is hard with its current low-level API, but will hopefully become easy later), people would have a reliable real-time source to get data from, in addition to the higher-latency source currently available in HDFS. Then we would only need this WAL for other data sources, such as Twitter, where the source is not reliable, and the pressure on it for throughput would be much lower.

          hshreedharan Hari Shreedharan added a comment -

          Thanks Matei for the background. I had considered some of these factors (like executors always talking to the latest driver) - but I was not aware of the need for distinct RDD IDs etc.

          TD and I discussed this offline and we agreed that the WAL would probably be the best way to go. I am planning to do some benchmarking of appending data to a 5-node HDFS cluster on EC2 today. Considering that HBase does use a WAL on HDFS, my expectation is that the perf should be reasonable.

          I will post the benchmark application on GitHub and share a link here, then run it and post the results here as well.

          matei Matei Zaharia added a comment -

          Great, it will be nice to see how fast this is. I also think the rate per node doesn't need to be enormous for this to be useful, since we can also parallelize receiving over multiple nodes.

          jerryshao Saisai Shao added a comment -

          Strongly agree with Matei's comment. I think we can refer to Storm's design and route lost messages back to the receivers, then replay those lost messages if the external source supports replay, like Kafka. The WAL would be another option for unreliable sources; users can balance throughput against reliability by toggling the WAL.

          hshreedharan Hari Shreedharan added a comment -

          So I did some benchmarking on EC2, writing to files one after another with a 200 ms gap between hflushes. The hflush times, in milliseconds, were:

          Writes for stream 1: 30,61,75,89,68,4,65,59,92,261,3,66,86,81,96,4,75,64,79,82,69,2,91,69,65,80
          Writes for stream 2: 58,65,68,75,4,79,89,110,73,76,3,66,74,70,111,3,80,132,97,72,120,2,182,91,70,62
          Writes for stream 3: 68,74,79,67,4,67,82,97,109,3,104,56,65,81,3,57,61,57,2,76,61,59,62
          Writes for stream 4: 94,88,93,82,4,116,89,74,66,3,61,79,73,70,3,68,83,106,70,3,73,70,71,76
          Writes for stream 5: 66,67,83,63,3,70,77,110,80,69,3,83,75,67,65,4,73,70,97,2,56,63,79,105
          Writes for stream 6: 62,68,62,69,3,64,61,72,3,72,62,76,72,4,58,138,77,66,1,62,93,71,107
          Writes for stream 7: 82,63,94,80,4,121,117,69,74,3,80,70,66,63,3,59,69,68,70,1,59,130,75,96
          Writes for stream 8: 93,80,269,66,4,73,106,95,143,3,90,72,65,89,3,62,75,65,82,2,76,57,68,108
          Writes for stream 9: 132,75,59,78,4,70,66,66,71,3,60,75,89,4,78,84,76,74,1,73,63,67,88
          Writes for stream 10: 80,70,95,76,3,145,146,85,101,4,157,83,70,82,4,72,73,159,121,3,92,82,69,74


          Here is the code I used to benchmark: https://github.com/harishreedharan/hdfs-benchmark

          You can run it using a command line that looks like:

          mvn test -Dpath=hdfs://xy.example.com/data/op -DbufferSize=1024 -Dtotal=100000 
          

          Time in between flushes defaults to 200ms, which can be set using -DflushInterval=500 (in millis).

          In most cases, an hflush takes between 50 and 100 ms, which seems acceptable (there are outliers). These numbers are a bit noisy, though, since I was running on EC2 - not on physical boxes.
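
          To sanity-check that 50-100 ms claim, here is a small standalone sketch that summarizes one of the series; the stream 1 latencies are copied verbatim from the list above, and the percentile helper is a simple nearest-rank approximation:

          ```scala
          object HflushStats {
            // hflush latencies in milliseconds for stream 1, copied from the list above
            val stream1: Seq[Int] = Seq(30, 61, 75, 89, 68, 4, 65, 59, 92, 261, 3, 66, 86,
              81, 96, 4, 75, 64, 79, 82, 69, 2, 91, 69, 65, 80)

            def mean(xs: Seq[Int]): Double = xs.sum.toDouble / xs.size

            // nearest-rank style percentile on the sorted data
            def percentile(xs: Seq[Int], p: Double): Int = {
              val sorted = xs.sorted
              sorted(math.min(sorted.size - 1, (p * sorted.size).toInt))
            }

            def main(args: Array[String]): Unit =
              println(f"mean=${mean(stream1)}%.1f ms median=${percentile(stream1, 0.5)} ms p95=${percentile(stream1, 0.95)} ms")
          }
          ```

          The mean and median both land around 70 ms, so the 50-100 ms characterization holds up for this series apart from the two 200+ ms outliers.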

          I'd prefer the WAL option, so as not to tightly couple Spark's reliability with Kafka. We should make it pluggable - so we can replace the WAL with something that uses information from Kafka if Kafka is being used.
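
          The pluggable contract could look roughly like the sketch below. All names here are illustrative, not the API that was eventually merged into Spark; the in-memory implementation exists only to show the contract, while an HDFS-backed one would append to rolling log files and hflush, and a Kafka-backed one could store offsets rather than the data itself:

          ```scala
          import scala.collection.mutable.ArrayBuffer

          // Hypothetical pluggable write ahead log interface (names are illustrative)
          trait WriteAheadLog {
            def write(record: Array[Byte]): Long   // returns a handle for later replay
            def read(handle: Long): Array[Byte]    // replay a single record on recovery
            def clean(olderThan: Long): Unit       // drop records no longer needed
          }

          // Trivial in-memory implementation, useful only to illustrate the contract
          class InMemoryWriteAheadLog extends WriteAheadLog {
            private val records = ArrayBuffer[Array[Byte]]()
            def write(record: Array[Byte]): Long = { records += record; records.size - 1L }
            def read(handle: Long): Array[Byte] = records(handle.toInt)
            def clean(olderThan: Long): Unit = ()  // no-op for the in-memory sketch
          }
          ```

          With this shape, the receiver path only depends on the trait, so swapping in a source-specific implementation is a configuration choice rather than a code change.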

          hshreedharan Hari Shreedharan added a comment -

          Reducing the buffer size decreases the number of hflushes per file (total time taken per file is less), but each hflush takes longer as more data is buffered locally (I guess).

          hshreedharan Hari Shreedharan added a comment -

          Do these numbers look OK to you, Tathagata Das, Matei Zaharia, Patrick Wendell? If you want to experiment, you can use the app and play around with it. I don't think this number is too bad - though I don't know how the current block replication code performs, I'd estimate it to be in the tens of milliseconds.

          matei Matei Zaharia added a comment -

          So Hari, what is the maximum sustainable rate in MB/second? That's the number we should be looking for. I think a latency of 50-100 ms to flush is fine, but we can't be writing just 5 Kbytes/second.

          hshreedharan Hari Shreedharan added a comment -

          I did multiple rounds of testing, and on average the total rate for writing and flushing is around 100 MB/s. There are a couple of outliers, but those are likely due to flaky networking on EC2. Barring the one outlier, the lowest I got was 79 MB/s and the highest 142 MB/s, but most runs were near 100.
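
          As a back-of-envelope check (the buffer size and cycle time below are assumed round numbers, not measurements from the benchmark), the sustained rate is just the data moved per write+hflush cycle divided by the cycle time:

          ```scala
          // Illustrative arithmetic only; inputs here are assumptions, not benchmark data
          object SustainedRate {
            // MB moved per write+hflush cycle divided by cycle time gives sustained MB/s
            def mbPerSec(bufferMb: Double, cycleMs: Double): Double =
              bufferMb / (cycleMs / 1000.0)
          }
          ```

          For example, flushing an 8 MB buffer every ~80 ms would sustain about 100 MB/s, which lines up with the range reported above.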

          matei Matei Zaharia added a comment (edited)

          Is that 100 MB/s per node or in total? That should be pretty good for per-node if it scales well to a cluster.

          hshreedharan Hari Shreedharan added a comment -

          It is per node, single threaded.

          matei Matei Zaharia added a comment -

          Alright, in that case, this sounds pretty good to me. I would go ahead with this version. Please coordinate with Tathagata Das as well since he's been looking into this.

          hshreedharan Hari Shreedharan added a comment -

          Sure. Thanks Matei!

          tdas Tathagata Das added a comment -

          I am marking this as fixed, as all non-test related issues have been merged. The one sub-task left is unit tests that use the WAL to do end-to-end tests and verify no data loss.


            People

            • Assignee: tdas Tathagata Das
            • Reporter: hshreedharan Hari Shreedharan
            • Votes: 1
            • Watchers: 13
