Spark / SPARK-12693

OffsetOutOfRangeException caused by retention

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Not A Problem
    • Affects Version/s: 1.6.0
    • Fix Version/s: None
    • Component/s: DStreams
    • Environment: Ubuntu 64bit, Intel i7

    Description

      I am running a Kafka server locally with an extremely low retention of 3 seconds and 1-second segmentation. I create a direct Kafka stream with auto.offset.reset = smallest.

      With bad luck (which actually happens quite often in my case), the smallest offset retrieved during stream initialization no longer exists by the time streaming actually starts.

      Complete source code of the Spark Streaming application is here:
      https://github.com/pygmalios/spark-checkpoint-experience/blob/cb27ab83b7a29e619386b56e68a755d7bd73fc46/src/main/scala/com/pygmalios/sparkCheckpointExperience/spark/SparkApp.scala

      The application ends up in an endless loop trying to fetch that non-existent offset and has to be killed. See the attached logs from Spark and from the Kafka server.
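
      The race described above can be illustrated with a small toy model. This is only a sketch and not Spark or Kafka code: ToyLog, OffsetOutOfRange and run_demo are hypothetical names, time is counted in whole seconds, and the retry loop is a simplified stand-in for the behaviour reported here. The smallest offset is resolved once at initialization, retention then deletes it, and every later fetch of that same offset fails.

```python
class OffsetOutOfRange(Exception):
    """Raised when a requested offset is not present in the toy log."""


class ToyLog:
    """Hypothetical in-memory stand-in for a single Kafka partition."""

    def __init__(self, retention_s):
        self.retention_s = retention_s
        self.records = []      # list of (offset, timestamp) pairs
        self.next_offset = 0

    def append(self, now):
        self.records.append((self.next_offset, now))
        self.next_offset += 1

    def enforce_retention(self, now):
        # Drop every record older than the retention window.
        self.records = [(o, t) for (o, t) in self.records
                        if now - t <= self.retention_s]

    def smallest_offset(self):
        return self.records[0][0] if self.records else self.next_offset

    def fetch(self, offset):
        if offset < self.smallest_offset() or offset >= self.next_offset:
            raise OffsetOutOfRange(offset)
        return offset


def run_demo(retries=3):
    log = ToyLog(retention_s=3)
    for t in range(3):                 # offsets 0..2 written at t = 0, 1, 2
        log.append(now=t)

    # Stream initialization at t = 2: "smallest" is resolved exactly once.
    log.enforce_retention(now=2)
    start = log.smallest_offset()

    # The first batch runs later; new data arrives and retention trims the log.
    for t in range(3, 8):              # offsets 3..7 written at t = 3..7
        log.append(now=t)
    log.enforce_retention(now=7)

    # Retrying the stale offset never succeeds (bounded here for the demo).
    failures = 0
    for _ in range(retries):
        try:
            log.fetch(start)
        except OffsetOutOfRange:
            failures += 1
    return start, log.smallest_offset(), failures
```

      In this model run_demo() resolves the starting offset 0, while the log's smallest surviving offset has moved on by the time of the first fetch, so every retry raises OffsetOutOfRange. The shorter the retention relative to the delay between initialization and the first batch, the more likely this outcome becomes.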

      Attachments

        1. kafka-log.txt (4 kB, Rado Buransky)
        2. log.txt (9 kB, Rado Buransky)

          People

            Assignee: Unassigned
            Reporter: Rado Buransky (radoburansky)
            Votes: 0
            Watchers: 2
