Description
Usability improvements:
API improvements, AWS SDK upgrades, etc.
Reliability improvements:
Currently, the KinesisReceiver can lose some data in the case of certain failures (receiver and driver failures). Using write ahead logs (WALs) can mitigate part of the problem, but this is not ideal because WALs don't work with S3 (eventual consistency, etc.), which is the most likely file system to be used in the EC2 environment. Hence, we have to take a different approach to improving reliability for Kinesis. See https://issues.apache.org/jira/browse/SPARK-9215 for more details.
Issue Links
Relates to:
- SPARK-3638 Commons HTTP client dependency conflict in extras/kinesis-asl module (Resolved)
- SPARK-7679 Update AWS SDK and KCL versions to 1.2.1 (Resolved)
- SPARK-5960 Allow AWS credentials to be passed to KinesisUtils.createStream() (Resolved)
- SPARK-6514 For Kinesis Streaming, use the same region for DynamoDB (KCL checkpoints) as the Kinesis stream itself (Resolved)
- SPARK-6656 Allow the application name to be passed in versus pulling from SparkContext.getAppName() (Resolved)
- SPARK-9215 Implement WAL-free Kinesis receiver that give at-least once guarantee (Resolved)
- SPARK-6654 Update Kinesis Streaming impls (both KCL-based and Direct) to use latest aws-java-sdk and kinesis-client-library (Closed)
- SPARK-9030 Add Kinesis.createStream unit tests that actual sends data (Resolved)
- SPARK-4184 Improve Spark Streaming documentation to address commonly-asked questions (Closed)