Spark / SPARK-26121

[Structured Streaming] Allow users to define prefix of Kafka's consumer group (group.id)

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 3.0.0
    • Component/s: Structured Streaming
    • Labels: None

    Description

      I ran into the following situation with Spark Structured Streaming (SS) using Kafka.
       
      In a project I work on, there is already a secured Kafka setup in which ops issue one SSL certificate per "group.id". The group.id (or at least its prefix) must therefore be known in advance.
       
      Spark SS, on the other hand, hard-codes the group.id as
       
      val uniqueGroupId = s"spark-kafka-source-${UUID.randomUUID}-${metadataPath.hashCode}"
       
      See, for example:
       
      https://github.com/apache/spark/blob/v2.4.0/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala#L124
       
      https://github.com/apache/spark/blob/v2.4.0/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala#L81
       
      I guess the Spark developers had a good reason to hard-code it, but would it be possible to make the prefix of the above uniqueGroupId ("spark-kafka-source") configurable?
       
      The rationale is that Spark users would then no longer be forced to use the same certificate for every group.id of the form spark-kafka-source-*.
       
      Definition of Done (DoD):

      • Allow Spark SS users to define the group.id prefix as an input parameter.
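
      To illustrate the request, here is a minimal sketch of how the quoted uniqueGroupId expression could read its prefix from the user-supplied source options. It is not the actual Spark implementation: the object name GroupIdSketch, the option key "groupIdPrefix", and the plain Map[String, String] options type are assumptions made for this example only. The default prefix and the UUID/hashCode suffix are taken from the snippet quoted above.

```scala
import java.util.UUID

// Illustrative sketch only; names below are assumptions, not Spark's API.
object GroupIdSketch {
  // Default prefix, as in the expression quoted from KafkaSourceProvider.
  val DefaultPrefix: String = "spark-kafka-source"

  // Hypothetical option key a user would set on the Kafka source.
  val PrefixOptionKey: String = "groupIdPrefix"

  // Builds a unique group.id; only the prefix is user-controlled, so each
  // query still gets its own consumer group.
  def uniqueGroupId(options: Map[String, String], metadataPath: String): String = {
    val prefix = options.getOrElse(PrefixOptionKey, DefaultPrefix)
    s"$prefix-${UUID.randomUUID}-${metadataPath.hashCode}"
  }
}
```

      With this shape, ops could issue a certificate for group IDs starting with, say, "team-a-", and a query would opt in by passing Map("groupIdPrefix" -> "team-a"). Queries that do not set the option would keep today's spark-kafka-source-* naming.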

          People

            Assignee: Anastasios Zouzias
            Reporter: Anastasios Zouzias
            Cody Koeninger
            Votes: 1
            Watchers: 5
