Beam / BEAM-8934

Store and read offsets with KafkaIO

Details

    • Important

    Description

      When creating a pipeline with KafkaIO, I want to be able to specify the offset at which consumption starts. Later, while processing the messages, I want to be able to read the offset of the current message so that I can store it in a relational database or a NoSQL store.

      This feature is needed to implement exactly-once semantics, in the same way it is done for Spark Streaming consumption.

      The "Your own data store" section of the following page explains how to achieve exactly-once semantics with Spark Streaming by saving the offsets in the same transaction as the results:
      http://spark.apache.org/docs/latest/streaming-kafka-0-10-integration.html
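
      The requested pattern can be illustrated with a minimal, dependency-free sketch. It does not use the KafkaIO, Kafka, or Spark APIs: `OffsetStoreSketch`, `Message`, `TransactionalStore`, `consume`, and `sampleLog` are all hypothetical names, and an in-memory class stands in for the relational database / NoSQL store. The assumption is only that each message carries a partition and an offset, that the store can write a result and the next offset in one atomic step, and that a restarted consumer begins at the stored offset.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: simulates "store results and offsets together,
// resume from the stored offset" without any Kafka or Beam dependency.
class OffsetStoreSketch {

    /** A message tagged with the partition and offset it was read from. */
    static final class Message {
        final int partition;
        final long offset;
        final String value;

        Message(int partition, long offset, String value) {
            this.partition = partition;
            this.offset = offset;
            this.value = value;
        }
    }

    /** Stand-in for a transactional database holding results and offsets. */
    static final class TransactionalStore {
        private final List<String> results = new ArrayList<>();
        private final Map<Integer, Long> nextOffsets = new HashMap<>();

        /** Offset to resume from for a partition; 0 if nothing is stored yet. */
        synchronized long startingOffset(int partition) {
            return nextOffsets.getOrDefault(partition, 0L);
        }

        /**
         * Writes the result and advances the offset in one atomic step.
         * Returns false for a replayed message, so duplicates are ignored.
         */
        synchronized boolean commit(Message m) {
            if (m.offset < startingOffset(m.partition)) {
                return false; // already processed before a restart
            }
            results.add(m.value);
            nextOffsets.put(m.partition, m.offset + 1);
            return true;
        }

        synchronized List<String> results() {
            return new ArrayList<>(results);
        }
    }

    /**
     * Reads one partition of the log starting at the stored offset and
     * stops after maxMessages, which simulates a crash mid-stream.
     */
    static void consume(List<Message> log, int partition,
                        TransactionalStore store, int maxMessages) {
        long start = store.startingOffset(partition);
        int processed = 0;
        for (Message m : log) {
            if (m.partition != partition || m.offset < start) {
                continue; // skip other partitions and already stored offsets
            }
            if (processed == maxMessages) {
                break;
            }
            store.commit(m);
            processed++;
        }
    }

    /** Four messages on partition 0 with offsets 0 to 3. */
    static List<Message> sampleLog() {
        return Arrays.asList(
            new Message(0, 0L, "a"), new Message(0, 1L, "b"),
            new Message(0, 2L, "c"), new Message(0, 3L, "d"));
    }

    public static void main(String[] args) {
        TransactionalStore store = new TransactionalStore();
        List<Message> log = sampleLog();
        consume(log, 0, store, 2);   // first run stops after two messages
        consume(log, 0, store, 10);  // restart resumes at the stored offset
        System.out.println(store.results());
    }
}
```

      Because the result and the offset are written together, a restart neither skips nor repeats a message in this sketch. With a real database, the two writes would have to share one transaction for the same guarantee to hold.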

          People

            Assignee: Unassigned
            Reporter: jiefeng zheng (zjfplayer)