  1. Spark
  2. SPARK-3129

Prevent data loss in Spark Streaming on driver failure using Write Ahead Logs


Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.0.0, 1.0.1, 1.0.2, 1.0.3, 1.1.0
    • Fix Version/s: 1.2.0
    • Component/s: DStreams
    • Labels: None

    Description

      Spark Streaming can lose small amounts of data when the driver goes down and the sending system cannot re-send the data (or the data has already expired on the sender side). This currently affects all receivers.

      The solution we propose is to reliably store all received data in HDFS. This allows the data to persist through driver failures, so that it can be processed when the driver is restarted.
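
      The idea of persisting received data before relying on it can be illustrated with a minimal sketch. This is not Spark's implementation: the class `DurableBlockStore`, its methods, and the block ID are hypothetical, and a local temporary directory stands in for HDFS.

```python
import os
import tempfile


class DurableBlockStore:
    """Hypothetical store: persist each block to disk before caching it in memory."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)
        self.memory = {}  # stand-in for in-memory block storage; lost on failure

    def _path(self, block_id):
        return os.path.join(self.directory, block_id + ".block")

    def put(self, block_id, data):
        # Write and fsync first, so the block survives a crash after put() returns.
        with open(self._path(block_id), "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        self.memory[block_id] = data

    def get(self, block_id):
        # Prefer the in-memory copy; fall back to the durable copy.
        if block_id in self.memory:
            return self.memory[block_id]
        with open(self._path(block_id), "rb") as f:
            return f.read()


log_dir = tempfile.mkdtemp()  # assumption: local directory standing in for HDFS
store = DurableBlockStore(log_dir)
store.put("block-0", b"event-a,event-b")

# Simulate a failure: a fresh instance starts with an empty memory cache.
restarted = DurableBlockStore(log_dir)
recovered = restarted.get("block-0")
```

      Because the write is flushed to stable storage before the block is cached, the restarted instance can still serve the block even though the in-memory copy is gone.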

      The high-level design doc for this feature is available here:
      https://docs.google.com/document/d/1vTCB5qVfyxQPlHuv8rit9-zjdttlgaSrMgfCDQlCJIM/edit?usp=sharing

      This major task has been divided into the following sub-tasks:

      • Implementing a write ahead log management system that can manage rolling write ahead logs: write to the log, recover on failure, and clean up old logs
      • Implementing an HDFS-backed block RDD that can read data either from Spark's BlockManager or from HDFS files
      • Implementing a ReceivedBlockHandler interface that abstracts out the functionality of saving received blocks
      • Implementing a ReceivedBlockTracker and other associated changes in the driver that allow metadata of received blocks and block-to-batch allocations to be recovered upon driver restart
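
      The three operations named in the first sub-task (write, recover, clean up) can be sketched as follows. This is only an illustration under stated assumptions, not the design from the linked doc: the class `RollingWriteAheadLog`, the length-prefixed record format, the segment naming, and the time-based rolling policy are all hypothetical, and a local temporary directory stands in for HDFS.

```python
import os
import struct
import tempfile


class RollingWriteAheadLog:
    """Hypothetical rolling log: length-prefixed records in time-bucketed segments."""

    def __init__(self, directory, roll_interval_ms):
        self.directory = directory
        self.roll_interval_ms = roll_interval_ms
        os.makedirs(directory, exist_ok=True)

    def _segment_path(self, time_ms):
        # Records in the same interval share a segment named after the interval start.
        start = time_ms - (time_ms % self.roll_interval_ms)
        return os.path.join(self.directory, "log-%013d" % start)

    def write(self, record, time_ms):
        # Append a 4-byte big-endian length followed by the payload, then fsync.
        with open(self._segment_path(time_ms), "ab") as f:
            f.write(struct.pack(">I", len(record)))
            f.write(record)
            f.flush()
            os.fsync(f.fileno())

    def _segments(self):
        # Zero-padded names sort in time order.
        return sorted(n for n in os.listdir(self.directory) if n.startswith("log-"))

    def recover(self):
        # Replay every complete record from every segment, oldest first.
        records = []
        for name in self._segments():
            with open(os.path.join(self.directory, name), "rb") as f:
                while True:
                    header = f.read(4)
                    if len(header) < 4:
                        break  # end of file, or a header truncated by a crash
                    (length,) = struct.unpack(">I", header)
                    payload = f.read(length)
                    if len(payload) < length:
                        break  # record truncated by a crash; skip it
                    records.append(payload)
        return records

    def clean(self, threshold_ms):
        # Delete segments whose whole interval ends at or before the threshold.
        for name in self._segments():
            start = int(name[len("log-"):])
            if start + self.roll_interval_ms <= threshold_ms:
                os.remove(os.path.join(self.directory, name))


wal = RollingWriteAheadLog(tempfile.mkdtemp(), roll_interval_ms=1000)
wal.write(b"block-0 -> batch-1", time_ms=100)
wal.write(b"block-1 -> batch-1", time_ms=900)
wal.write(b"block-2 -> batch-2", time_ms=1500)

# A restarted process would open a new log over the same directory and replay it.
replayed = RollingWriteAheadLog(wal.directory, roll_interval_ms=1000).recover()

wal.clean(threshold_ms=1000)  # the first segment covers [0, 1000) and is removed
remaining = wal.recover()
```

      Rolling into separate segment files keeps cleanup cheap: old data is discarded by deleting whole files rather than rewriting a single growing log.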

          People

            Assignee: tdas Tathagata Das
            Reporter: hshreedharan Hari Shreedharan