SPARK-5959

Create a ReliableKinesisReceiver similar to the ReliableKafkaReceiver


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.1.0
    • Fix Version/s: None
    • Component/s: DStreams
    • Labels: None

    Description

      After each block is stored reliably in the write-ahead log (WAL), that is, once the store() call returns, ACK back to Kinesis.

      The ReliableKinesisReceiver can still die after storing a block but before the ACK reaches Kinesis. No data is lost in that case, because the unacknowledged records are read again after a restart, but duplicate data is still possible.
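
      The store-then-ACK ordering can be illustrated with a toy model. This is only a sketch: it does not use Spark or the Kinesis client library, and FakeShard, ReliableReceiver, run_once and replay_demo are hypothetical names invented for this example. A plain list stands in for the WAL and an integer index stands in for the Kinesis checkpoint.

```python
class FakeShard:
    """In-memory stand-in for a Kinesis shard plus its checkpoint (hypothetical)."""

    def __init__(self, records):
        self.records = list(records)
        self.checkpoint = 0  # index of the first record not yet acknowledged

    def read_from_checkpoint(self, batch_size):
        start = self.checkpoint
        return start, self.records[start:start + batch_size]


class ReliableReceiver:
    """Stores each block durably first, and only then acknowledges it."""

    def __init__(self, shard, wal):
        self.shard = shard
        self.wal = wal  # a list standing in for the write-ahead log

    def store(self, block):
        # Stand-in for a blocking store() that returns once the block is durable.
        self.wal.extend(block)

    def run_once(self, batch_size, crash_before_ack=False):
        start, block = self.shard.read_from_checkpoint(batch_size)
        if not block:
            return False
        self.store(block)
        if crash_before_ack:
            # Simulate the receiver dying after store() but before the ACK.
            raise RuntimeError("receiver died before ACK")
        self.shard.checkpoint = start + len(block)  # the ACK
        return True


def replay_demo():
    shard = FakeShard(["r0", "r1", "r2", "r3"])
    wal = []
    receiver = ReliableReceiver(shard, wal)
    receiver.run_once(2)  # r0, r1 stored and acknowledged
    try:
        receiver.run_once(2, crash_before_ack=True)  # r2, r3 stored, ACK lost
    except RuntimeError:
        pass
    restarted = ReliableReceiver(shard, wal)  # resumes from the last checkpoint
    while restarted.run_once(2):
        pass
    return wal
```

      In this model every record ends up in the WAL at least once, and the block whose ACK was lost (r2, r3) is stored twice after the restart, which matches the "no loss, possible duplicates" behavior described above.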

      Notes:

      • Make sure we're not overloading the checkpoint control plane, which uses DynamoDB.
      • May need to disable auto-checkpointing and remove the checkpoint interval.
      • Maintain compatibility with existing KinesisReceiver-based code.
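
      One possible way to reconcile "ACK after every stored block" with the concern about DynamoDB load is to remember the latest stored sequence number on every block but write the checkpoint at most once per minimum interval. The sketch below assumes that a checkpoint at a given sequence number covers all earlier records. ThrottledCheckpointer, block_stored, flush and throttle_demo are hypothetical names, and write_checkpoint is a placeholder callback rather than a real client call.

```python
class ThrottledCheckpointer:
    """Coalesces per-block ACKs into at most one checkpoint write per interval."""

    def __init__(self, write_checkpoint, min_interval_s, clock):
        self.write_checkpoint = write_checkpoint  # placeholder for the real write
        self.min_interval_s = min_interval_s
        self.clock = clock  # injectable time source, e.g. time.monotonic
        self.pending = None  # latest stored sequence number not yet written
        self.last_write = None

    def block_stored(self, last_sequence_number):
        # Call this only after store() has returned for the block.
        self.pending = last_sequence_number
        now = self.clock()
        if self.last_write is None or now - self.last_write >= self.min_interval_s:
            self.flush(now)

    def flush(self, now=None):
        # Write the pending checkpoint, if any (also useful on shutdown).
        if self.pending is None:
            return
        self.write_checkpoint(self.pending)
        self.pending = None
        self.last_write = self.clock() if now is None else now


def throttle_demo():
    writes = []
    fake_time = [0.0]
    checkpointer = ThrottledCheckpointer(
        writes.append, min_interval_s=5.0, clock=lambda: fake_time[0]
    )
    for t, seq in [(0.0, 10), (1.0, 20), (2.0, 30), (6.0, 40)]:
        fake_time[0] = t
        checkpointer.block_stored(seq)
    return writes
```

      The trade-off in this sketch is that a longer interval means fewer checkpoint writes but more records replayed as duplicates after a receiver failure.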


          People

            Assignee: Chris Fregly (cfregly)
            Reporter: Chris Fregly (cfregly)
            Votes: 0
            Watchers: 1

            Dates

              Created:
              Updated:
              Resolved:
