[SPARK-32566] Kafka consumer cache capacity is unclear


Details

    • Type: Documentation
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 3.0.0
    • Fix Version/s: None
    • Component/s: Structured Streaming
    • Labels: None

    Description

      The Spark Streaming + Kafka integration guide mentions:

      "The cache for consumers has a default maximum size of 64. If you expect to be handling more than (64 * number of executors) Kafka partitions, you can change this setting via spark.streaming.kafka.consumer.cache.maxCapacity."
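
      For context, the quoted setting is a plain SparkConf entry used by the DStream-based spark-streaming-kafka-0-10 integration. A minimal sketch, assuming that integration (app name and batch interval are arbitrary placeholders):

        import org.apache.spark.SparkConf
        import org.apache.spark.streaming.{Seconds, StreamingContext}

        // Raise the per-executor Kafka consumer cache from the default of 64 to 128
        // for the DStream-based integration (spark-streaming-kafka-0-10).
        val conf = new SparkConf()
          .setAppName("kafka-dstream-cache-example") // hypothetical app name
          .set("spark.streaming.kafka.consumer.cache.maxCapacity", "128")

        val ssc = new StreamingContext(conf, Seconds(10)) // arbitrary 10s batch interval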

      However, for Structured Streaming, the code seems to expect spark.kafka.consumer.cache.capacity / spark.sql.kafkaConsumerCache.capacity instead.

      It would be nice to clear up this ambiguity in the documentation, or even to merge these configurations in the code.
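
      For comparison, a minimal sketch of how the Structured Streaming setting would be applied on Spark 3.0 (broker address and topic are placeholders, not taken from this issue):

        import org.apache.spark.sql.SparkSession

        // On Spark 3.0 the Structured Streaming Kafka source reads its consumer
        // cache size from spark.kafka.consumer.cache.capacity (default 64).
        val spark = SparkSession.builder()
          .appName("kafka-structured-streaming-cache-example") // hypothetical app name
          .config("spark.kafka.consumer.cache.capacity", "128")
          .getOrCreate()

        val df = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092") // placeholder broker
          .option("subscribe", "topic1")                     // placeholder topic
          .load()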


          People

            Assignee: Unassigned
            Reporter: _paddy_ srpn
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved: