Details
- Type: Documentation
- Status: Closed
- Priority: Major
- Resolution: Not A Problem
- Affects Version/s: 3.0.0
- Fix Version/s: None
- Component/s: None
Description
The docs mention:

"The cache for consumers has a default maximum size of 64. If you expect to be handling more than (64 * number of executors) Kafka partitions, you can change this setting via spark.streaming.kafka.consumer.cache.maxCapacity"

However, for Structured Streaming the code seems to expect a different key: spark.kafka.consumer.cache.capacity (or spark.sql.kafkaConsumerCache.capacity).

It would be nice to clear up this ambiguity in the documentation, or even to merge these configurations in the code.
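For reference, the keys in question could be set side by side in spark-defaults.conf. This is only a sketch of the configuration names mentioned above; the value 128 is an arbitrary example, not a recommendation:

```
# DStream (spark-streaming-kafka) integration -- the key the docs describe:
spark.streaming.kafka.consumer.cache.maxCapacity  128

# Structured Streaming -- the keys the code seems to expect instead:
spark.kafka.consumer.cache.capacity               128
spark.sql.kafkaConsumerCache.capacity             128
```

Having three similarly named keys for the same concept is exactly the ambiguity this issue is about.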