Spark / SPARK-25315

setting "auto.offset.reset" to "earliest" has no effect in Structured Streaming with Spark 2.3.1 and Kafka 1.0

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Bug
    • Affects Version/s: 2.3.1
    • Fix Version/s: None
    • Component/s: Structured Streaming
    • Labels: None
    • Environment: Standalone; running in IDEA

    Description

      The following code does not read from the beginning of the topic, even though "auto.offset.reset" is set to "earliest":

      ```
      val kafkaOptions = Map[String, String](
        "kafka.bootstrap.servers" -> KAFKA_BOOTSTRAP_SERVERS,
        "subscribe" -> TOPIC,
        "group.id" -> GROUP_ID,
        "auto.offset.reset" -> "earliest"
      )

      val myStream = sparkSession
        .readStream
        .format("kafka")
        .options(kafkaOptions)
        .load()
        .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")

      myStream
        .writeStream
        .format("console")
        .start()
        .awaitTermination()
      ```
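
      A likely explanation for the "Not A Bug" resolution, going by the Structured Streaming + Kafka integration guide rather than by anything stated in this issue: the Kafka source only forwards options prefixed with "kafka." to the consumer, so the un-prefixed "group.id" and "auto.offset.reset" above are silently ignored, and the guide says to use the source option "startingOffsets" instead of "auto.offset.reset". The sketch below shows that variant. The object name, the helper `kafkaOptions`, the broker address, and the topic name are hypothetical placeholders, and it assumes the spark-sql-kafka-0-10 package is on the classpath.

      ```scala
      import org.apache.spark.sql.SparkSession

      object ReadFromEarliest {
        // Hypothetical placeholder values; replace with your own broker and topic.
        val KafkaBootstrapServers = "localhost:9092"
        val Topic = "my-topic"

        // Source options for the Kafka reader. "startingOffsets" is a Spark source
        // option (no "kafka." prefix) and takes the place of "auto.offset.reset".
        // No "group.id" is set: Spark 2.3.x generates a unique group id per query.
        def kafkaOptions(bootstrapServers: String, topic: String): Map[String, String] =
          Map(
            "kafka.bootstrap.servers" -> bootstrapServers,
            "subscribe" -> topic,
            "startingOffsets" -> "earliest"
          )

        def main(args: Array[String]): Unit = {
          val spark = SparkSession.builder()
            .appName("ReadFromEarliest")
            .master("local[*]") // assumption: local run, as in the IDE setup reported
            .getOrCreate()

          val myStream = spark.readStream
            .format("kafka")
            .options(kafkaOptions(KafkaBootstrapServers, Topic))
            .load()
            .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")

          myStream.writeStream
            .format("console")
            .start()
            .awaitTermination()
        }
      }
      ```

      Note that "startingOffsets" only applies when a query starts for the first time; a query restarted from an existing checkpoint resumes from the offsets recorded there.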

          People

            Assignee: Unassigned
            Reporter: Zhenhao Li (zhenhao.li)
            Votes: 0
            Watchers: 2
