[BEAM-549] SparkRunner should support Beam's KafkaIO instead of providing its own. - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: P2
Resolution: Duplicate
Affects Version/s: None
Fix Version/s: 0.3.0-incubating
Component/s: runner-spark
Labels: None

Description

For portability, and in the spirit of Apache Beam, the SparkRunner should use the Beam implementation of KafkaIO instead of its own.

That said, the runner will translate the KafkaIO defined in the pipeline into its own internal implementation, but it should still map the properties the user set in the pipeline (brokers, topic, etc.) so that the IO behaves the same.

Eventually, the SparkRunner will implement reading from Kafka using Spark's KafkaUtils.createDirectStream(), as described here: http://spark.apache.org/docs/1.6.2/streaming-kafka-integration.html#approach-2-direct-approach-no-receivers
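
The property mapping described above could look roughly like the following. This is only a minimal sketch under stated assumptions, not the runner's actual translation code: `KafkaReadSpec` and `to_direct_stream_args` are hypothetical names invented for illustration, and the sketch assumes the user's settings reduce to a broker list, a topic list, and extra consumer properties. The `metadata.broker.list` key is the broker setting that the linked Spark 1.6 direct-approach guide accepts in its Kafka parameters.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class KafkaReadSpec:
    """Hypothetical stand-in for the settings a user puts on a KafkaIO read."""
    bootstrap_servers: List[str]
    topics: List[str]
    consumer_properties: Dict[str, str] = field(default_factory=dict)


def to_direct_stream_args(spec: KafkaReadSpec) -> Tuple[List[str], Dict[str, str]]:
    """Map user-defined read settings to (topics, kafka_params).

    The returned pair has the shape of the topics and Kafka parameters
    arguments taken by Spark's direct stream API.
    """
    if not spec.bootstrap_servers:
        raise ValueError("at least one broker is required")
    if not spec.topics:
        raise ValueError("at least one topic is required")

    # Copy so the user's own properties are never mutated.
    kafka_params = dict(spec.consumer_properties)
    # The direct approach expects brokers as one comma-separated string.
    kafka_params["metadata.broker.list"] = ",".join(spec.bootstrap_servers)
    return list(spec.topics), kafka_params


if __name__ == "__main__":
    spec = KafkaReadSpec(
        bootstrap_servers=["broker1:9092", "broker2:9092"],
        topics=["events"],
        consumer_properties={"auto.offset.reset": "smallest"},
    )
    topics, kafka_params = to_direct_stream_args(spec)
    print(topics, kafka_params)
```

In a PySpark 1.6 job, the two returned values would be passed as the `topics` and `kafkaParams` arguments of `KafkaUtils.createDirectStream(ssc, topics, kafkaParams)`. Keeping the mapping in one pure function makes it easy to check that a pipeline's settings carry over unchanged.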

Issue Links

duplicates

BEAM-658 Support Read.Unbounded primitive

Resolved

links to

GitHub Pull Request #822

People

Assignee: Amit Sela

Reporter: Amit Sela

Votes: 0

Watchers: 3

Dates

Created: 12/Aug/16 21:29

Updated: 24/Jul/20 20:53

Resolved: 22/Sep/16 15:38