Beam / BEAM-549

SparkRunner should support Beam's KafkaIO instead of providing its own.

Details

    • Type: Bug
    • Status: Resolved
    • Priority: P2
    • Resolution: Duplicate
    • None
    • 0.3.0-incubating
    • Component: runner-spark
    • None

    Description

      For portability, and in the spirit of Apache Beam, the SparkRunner should use the Beam implementation of KafkaIO instead of its own.

      That said, the runner will still translate the KafkaIO defined in the pipeline into its own internal implementation, but it should map the properties the user set in the pipeline (brokers, topic, etc.) so that the IO behaves the same.

      Eventually, the SparkRunner will implement reading from Kafka using Spark's KafkaUtils.createDirectStream(), as described here: http://spark.apache.org/docs/1.6.2/streaming-kafka-integration.html#approach-2-direct-approach-no-receivers
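
      A minimal sketch of the property mapping described above, not the runner's actual code. Per the linked Spark 1.6 documentation, the direct approach takes a map of Kafka parameters (which must specify "metadata.broker.list" or "bootstrap.servers") and a set of topics to consume. The class and method names below (KafkaReadSettings, kafkaParams, topicSet) are hypothetical, and the sketch assumes the brokers and topics have already been extracted from the user's KafkaIO definition as plain lists of strings.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper for illustration; not part of Beam or the SparkRunner. */
final class KafkaReadSettings {
    private KafkaReadSettings() {}

    /**
     * Builds the Kafka parameter map from the brokers the user configured.
     * Assumes each broker is already a "host:port" string.
     */
    static Map<String, String> kafkaParams(List<String> brokers) {
        if (brokers.isEmpty()) {
            throw new IllegalArgumentException("at least one broker is required");
        }
        Map<String, String> params = new LinkedHashMap<>();
        // The direct stream accepts the broker list as a comma-separated string.
        params.put("metadata.broker.list", String.join(",", brokers));
        return Collections.unmodifiableMap(params);
    }

    /** Converts the user's topics into a set, dropping duplicates and keeping order. */
    static Set<String> topicSet(List<String> topics) {
        if (topics.isEmpty()) {
            throw new IllegalArgumentException("at least one topic is required");
        }
        return Collections.unmodifiableSet(new LinkedHashSet<>(topics));
    }
}
```

      The two return values line up with the Kafka parameters and topics arguments of createDirectStream(), so a runner-side translation could pass them through without changing what the user configured.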

            People

              Assignee: amitsela Amit Sela
              Reporter: amitsela Amit Sela
              Votes: 0
              Watchers: 3

              Dates

                Created:
                Updated:
                Resolved: