Apache Hudi / HUDI-7153

Increasing Kafka minPartitions in Streamer causes corrupted offsets

Details

    • Type: Bug
    • Status: Open
    • Priority: Blocker
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 1.1.0
    • Component/s: None

    Description

      After increasing `hoodie.deltastreamer.source.kafka.minPartitions` for Hudi Streamer with a Kafka source, the streamer job eventually fails with a corrupted-offset error (see below), even though the topic itself is not damaged. Removing the config works around the issue.

      Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 106 in stage 6011.0 failed 4 times, most recent failure: Lost task 106.3 in stage 6011.0 (TID 429501) (10.151.141.115 executor 28): java.lang.IllegalArgumentException: requirement failed: Beginning offset -9223372036497672008 is after the ending offset -9223372036497705858 for topic *** partition 0. You either provided an invalid fromOffset, or the Kafka topic has been damaged
          at scala.Predef$.require(Predef.scala:281)
          at org.apache.spark.streaming.kafka010.KafkaRDD.compute(KafkaRDD.scala:186)
          at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) 
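
      As a triage aid, the two offsets in the message above can be compared against `Long.MIN_VALUE`. The sketch below only does arithmetic on the values copied from the exception; the class and method names are hypothetical and are not part of Hudi or Spark. Both offsets turn out to sit a few hundred million above `Long.MIN_VALUE`, which would be consistent with a signed 64-bit overflow somewhere in the offset-range computation, though this is an observation and not a confirmed root cause.

      ```java
      public class OffsetSanityCheck {
          // Offsets copied verbatim from the exception message in this report.
          static final long BEGIN_OFFSET = -9223372036497672008L;
          static final long END_OFFSET = -9223372036497705858L;

          /**
           * Distance of a negative value above Long.MIN_VALUE.
           * subtractExact throws ArithmeticException if the result would overflow,
           * which happens for non-negative inputs.
           */
          static long distanceAboveMin(long offset) {
              return Math.subtractExact(offset, Long.MIN_VALUE);
          }

          /** Offsets assigned by a Kafka broker are non-negative. */
          static boolean isPlausibleBrokerOffset(long offset) {
              return offset >= 0;
          }

          public static void main(String[] args) {
              System.out.println("begin plausible: " + isPlausibleBrokerOffset(BEGIN_OFFSET));
              System.out.println("end plausible:   " + isPlausibleBrokerOffset(END_OFFSET));
              System.out.println("begin - MIN_VALUE = " + distanceAboveMin(BEGIN_OFFSET));
              System.out.println("end   - MIN_VALUE = " + distanceAboveMin(END_OFFSET));
              System.out.println("begin - end       = " + (BEGIN_OFFSET - END_OFFSET));
          }
      }
      ```

      Neither value can be a broker-assigned offset, since both are negative, so the range handed to `KafkaRDD` was most likely produced on the client side rather than read from the topic.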

            People

              Assignee: sivabalan narayanan (shivnarayan)
              Reporter: Ethan Guo (guoyihua)