Spark / SPARK-26396

Kafka consumer cache overflow since 2.4.x

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 2.4.0
    • Fix Version/s: None
    • Component/s: Structured Streaming
    • Labels: None
    • Environment: Spark 2.4 standalone client mode

    Description

      The Kafka consumer cache overflows constantly as soon as the application starts. This issue appeared after upgrading to Spark 2.4.

      We get constant warnings like this:

      18/12/18 07:03:29 WARN KafkaDataConsumer: KafkaConsumer cache hitting max capacity of 180, removing consumer for CacheKey(spark-kafka-source-6f66e0d2-beaf-4ff2-ade8-8996611de6ae--1081651087-executor,kafka-topic-76)
      18/12/18 07:03:32 WARN KafkaDataConsumer: KafkaConsumer cache hitting max capacity of 180, removing consumer for CacheKey(spark-kafka-source-6f66e0d2-beaf-4ff2-ade8-8996611de6ae--1081651087-executor,kafka-topic-30)
      18/12/18 07:03:32 WARN KafkaDataConsumer: KafkaConsumer cache hitting max capacity of 180, removing consumer for CacheKey(spark-kafka-source-f41d1f9e-1700-4994-9d26-2b9c0ee57881--215746753-executor,kafka-topic-57)
      18/12/18 07:03:32 WARN KafkaDataConsumer: KafkaConsumer cache hitting max capacity of 180, removing consumer for CacheKey(spark-kafka-source-f41d1f9e-1700-4994-9d26-2b9c0ee57881--215746753-executor,kafka-topic-43)
      

      The application runs 4 Spark Structured Streaming queries against the same Kafka topic, which has 90 partitions. On Spark 2.3 we ran with the default settings, so the cache size defaulted to 64. On 2.4 we have tried raising it to 180 and to 360. With 360 there is far less noise about the overflow, but resource usage increases substantially.
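
      For reference, a rough back-of-the-envelope sketch of why 360 is the size that quiets the warnings. The CacheKey entries in the log pair a per-query consumer group id with a topic partition, so each query would need its own consumer per partition. The function names below are hypothetical, and the per-executor figure assumes every query reads every partition and that partitions are spread evenly across executors, which may not match how tasks are actually scheduled.

```python
import math


def distinct_cache_keys(num_queries: int, num_partitions: int) -> int:
    """Number of distinct (group id, topic partition) pairs across all queries.

    Assumes each query uses its own consumer group and reads every partition.
    """
    return num_queries * num_partitions


def keys_per_executor(num_queries: int, num_partitions: int, num_executors: int) -> int:
    """Cache entries one executor would hold under an even, stable spread.

    This is an idealised lower bound: if a partition's tasks move between
    executors, an executor can accumulate more entries than this.
    """
    return num_queries * math.ceil(num_partitions / num_executors)


# Numbers from this report: 4 queries, one topic with 90 partitions.
total = distinct_cache_keys(4, 90)
single_executor = keys_per_executor(4, 90, 1)
print(total, single_executor)
```

      With these numbers the total comes out at 360, well above the old default of 64 and above 180 as well, unless the partitions stay pinned across at least two executors. As far as I know the capacity is read from spark.sql.kafkaConsumerCache.capacity in Spark 2.4, but treat that key name as something to verify against the docs for your version.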

          People

            Assignee: Unassigned
            Reporter: Kaspar Tint
            Votes: 0
            Watchers: 2
