  1. Spark
  2. SPARK-19968

Use a cached instance of KafkaProducer for writing to Kafka via KafkaSink.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.2.0
    • Component/s: Structured Streaming
    • Labels:

      Description

      KafkaProducer is thread-safe, so a single instance can be reused to write out every batch. The Kafka documentation encourages this usage, and it also improves performance.

      On average, an addBatch operation takes 25 ms with this patch and more than 250 ms without it.

      Benchmark results are posted on the GitHub PR.
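
      The reuse idea described above can be sketched as a small cache that hands out one shared producer per distinct configuration. This is only an illustration, not the code from the patch: the class name ProducerCache, the method getOrCreate, and the factory parameter are hypothetical, and the producer type is left generic so the sketch runs without the Kafka client on the classpath.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper (not Spark's actual class): keeps one producer
// instance per distinct configuration and returns it on every request.
final class ProducerCache<P> {
    private final ConcurrentHashMap<Map<String, Object>, P> cache = new ConcurrentHashMap<>();
    private final Function<Map<String, Object>, P> factory;

    // factory builds a new producer from a configuration map.
    ProducerCache(Function<Map<String, Object>, P> factory) {
        this.factory = factory;
    }

    P getOrCreate(Map<String, Object> config) {
        // Defensive copy: later changes to the caller's map must not alter the cache key.
        Map<String, Object> key = Collections.unmodifiableMap(new TreeMap<>(config));
        // computeIfAbsent invokes the factory at most once per key, even under concurrent calls.
        return cache.computeIfAbsent(key, factory);
    }

    // Number of distinct producers created so far.
    int size() {
        return cache.size();
    }
}
```

      With the Kafka client available, the factory could plausibly be something like cfg -> new KafkaProducer<byte[], byte[]>(cfg), so that every batch written with the same settings shares one thread-safe producer. The sketch deliberately omits eviction and closing of idle producers, which a real sink would also need to handle.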

            People

            • Assignee:
              prashant_ Prashant Sharma
              Reporter:
              prashant_ Prashant Sharma
            • Votes:
              0
              Watchers:
              2
