SPARK-17938: Backpressure rate not adjusting


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 2.0.0, 2.0.1
    • Fix Version/s: None
    • Component/s: DStreams
    • Labels: None

    Description

      I am using spark-streaming 2.0.1 and spark-streaming-kafka-0-10 2.0.1; the behavior is the same with 2.0.0.

      spark.streaming.kafka.consumer.poll.ms is set to 30000
      spark.streaming.kafka.maxRatePerPartition is set to 100000
      spark.streaming.backpressure.enabled is set to true

      `batchDuration` of the streaming context is set to 1 second.
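
      For reference, here is a minimal sketch of this setup (the application name is a placeholder; the settings mirror the ones listed above):

      ```scala
      import org.apache.spark.SparkConf
      import org.apache.spark.streaming.{Seconds, StreamingContext}

      val conf = new SparkConf()
        .setAppName("backpressure-repro") // placeholder name
        .set("spark.streaming.kafka.consumer.poll.ms", "30000")
        .set("spark.streaming.kafka.maxRatePerPartition", "100000")
        .set("spark.streaming.backpressure.enabled", "true")

      // batchDuration of 1 second, as described above
      val ssc = new StreamingContext(conf, Seconds(1))
      ```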

      I consume a Kafka topic using KafkaUtils.createDirectStream().
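
      Continuing the sketch above, the stream is created roughly like this (the broker address, group id, and topic name are placeholders):

      ```scala
      import org.apache.kafka.common.serialization.StringDeserializer
      import org.apache.spark.streaming.kafka010.KafkaUtils
      import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
      import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

      val kafkaParams = Map[String, Object](
        "bootstrap.servers" -> "localhost:9092", // placeholder
        "key.deserializer" -> classOf[StringDeserializer],
        "value.deserializer" -> classOf[StringDeserializer],
        "group.id" -> "backpressure-repro"       // placeholder
      )

      val stream = KafkaUtils.createDirectStream[String, String](
        ssc,
        PreferConsistent,
        Subscribe[String, String](Seq("my-topic"), kafkaParams) // placeholder topic
      )

      // Trivial action so that each batch is actually processed
      stream.foreachRDD(rdd => println(s"records in batch: ${rdd.count()}"))
      ```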

      My system can handle batches of 100k records, but it takes more than 1 second to process them all. I would therefore expect backpressure to reduce the number of records fetched in the next batch, so as to keep the processing delay below 1 second.

      However, this does not happen: the backpressure rate stays stuck at `100.0`, no matter how the other variables (processing time, error, etc.) change.

      Here's a log showing how all these variables change while the chosen rate stays the same: https://gist.github.com/Dinduks/d9fa67fc8a036d3cad8e859c508acdba (I would have attached a file, but I don't see how).

      Is this the expected behavior and am I missing something, or is this a bug?

      I'll gladly help by providing more information or writing code if necessary.

      Thank you.

Attachments

Activity

People

    Assignee: Unassigned
    Reporter: Samy Dindane
    Votes: 0
    Watchers: 4
