Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-26360

Avoid extra validateQuery call in createStreamingWriteSupport

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Trivial
    • Resolution: Fixed
    • 2.3.2, 2.4.0
    • 3.0.0
    • Structured Streaming
    • None

    Description

      When I'm reading structured streaming source code, I find there is a redundant KafkaWriter.validateQuery() function call in createStreamingWriteSupport func in class `KafkaSourceProvider`.

      // KafkaSourceProvider.scala
        override def createStreamingWriteSupport(
            queryId: String,
            schema: StructType,
            mode: OutputMode,
            options: DataSourceOptions): StreamingWriteSupport = {
         .....
          // validate once here
          KafkaWriter.validateQuery(schema.toAttributes, producerParams, topic)
      
          // validate twice here
          new KafkaStreamingWriteSupport(topic, producerParams, schema)
        }
      
      // KafkaStreamingWriteSupport.scala
      class KafkaStreamingWriteSupport(
          topic: Option[String],
          producerParams: ju.Map[String, Object],
          schema: StructType)
        extends StreamingWriteSupport {
      
        validateQuery(schema.toAttributes, producerParams, topic)
        ....
      }
      

       

      I think we just need to remove one of these two.

      Attachments

        Issue Links

          Activity

            People

              jasonwayne Wu Wenjie
              jasonwayne Wu Wenjie
              Sean R. Owen Sean R. Owen
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: