[SPARK-25361] Support for Kinesis Client Library 2.0 - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Minor
Resolution: Incomplete
Affects Version/s: 2.3.1
Fix Version/s: None
Component/s: DStreams
Labels:
- aws
- bulk-closed
- http2
- kinesis
- spark
- streaming
- structured

Description

Amazon recently released version 2.0 of the Kinesis Client Library (KCL), which provides an HTTP/2 data retrieval API for Kinesis. This API, along with the new enhanced fan-out feature, promises higher data throughput and faster delivery of records to consumers, particularly in multi-consumer environments.

https://aws.amazon.com/about-aws/whats-new/2018/08/stream_data_65_faster_with_5x_higher_fan_out_using_new_kinesis_data_streams_features/

My organization is very interested in getting support for these features into Spark; is anyone already working on this? I'm happy to give it a go myself - in fact, I'm currently attempting to create my own Spark package for this functionality. Assuming that goes well, it's my intention to port it back to core Spark.

If no one is already working on this, does anyone have an opinion on whether this should be an in-place upgrade of the existing implementation or a completely separate streaming source (kinesis2, for lack of a better name)?

Thanks!

People

Assignee: Unassigned
Reporter: Cory Locklear
Votes: 0
Watchers: 1

Dates

Created: 06/Sep/18 17:15
Updated: 12/Dec/22 18:10
Resolved: 08/Oct/19 05:44