  1. Spark
  2. SPARK-50160

KafkaWriteTask: support timestamp customisation


Details

    Description

      Currently, there is no way to customise the timestamp of a ProducerRecord produced by the KafkaWriteTask. Here at Wikimedia we often use event-time semantics, so it would be helpful if Kafka records produced via Spark could carry the event time as their timestamp.

      The Kafka producer already allows this, as stated in the Producer.send API docs:

      If CreateTime is used by the topic, the timestamp will be the user provided timestamp or the record send time if the user did not specify a timestamp for the record.

      The proposed feature would therefore let users of the Spark Kafka output specify the create-time of each record.
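
The CreateTime rule quoted from the Producer.send docs can be illustrated with a small, dependency-free sketch. It does not use the Kafka client or Spark: `OutgoingRecord` and `effectiveCreateTime` are hypothetical names introduced only for illustration. In the real Kafka client, `ProducerRecord` has a constructor that accepts a nullable `Long timestamp`. How KafkaWriteTask would expose that value is not described in this ticket, so nothing below should be read as the proposed implementation.

```java
public class CreateTimeSketch {

    /** Hypothetical stand-in for an outgoing record; not the Kafka ProducerRecord class. */
    public static final class OutgoingRecord {
        public final String topic;
        public final Long timestampMs; // null means the user did not provide a timestamp
        public final String value;

        public OutgoingRecord(String topic, Long timestampMs, String value) {
            this.topic = topic;
            this.timestampMs = timestampMs;
            this.value = value;
        }
    }

    /**
     * Models the quoted rule for a CreateTime topic: use the user-provided
     * timestamp if there is one, otherwise fall back to the record send time.
     */
    public static long effectiveCreateTime(OutgoingRecord record, long sendTimeMs) {
        return record.timestampMs != null ? record.timestampMs : sendTimeMs;
    }

    public static void main(String[] args) {
        long sendTimeMs = System.currentTimeMillis();
        long eventTimeMs = 1_700_000_000_000L; // example event time in epoch milliseconds

        OutgoingRecord withEventTime = new OutgoingRecord("events", eventTimeMs, "payload");
        OutgoingRecord withoutTimestamp = new OutgoingRecord("events", null, "payload");

        // Event-time semantics: the stored timestamp is the event time.
        System.out.println(effectiveCreateTime(withEventTime, sendTimeMs));
        // No user timestamp: the stored timestamp is the send time.
        System.out.println(effectiveCreateTime(withoutTimestamp, sendTimeMs));
    }
}
```

The second case corresponds to today's behaviour as described in this ticket, where the record timestamp cannot be set by the user; the first is what the requested feature would make possible.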

      I have already checked out the code and was able to adapt it to fulfil our needs. Since I couldn't find any (closed) ticket concerned with this topic, I assume no such feature has been rejected so far.


            People

              Assignee: Unassigned
              Reporter: Peter Schulz (schulzp)
              Votes: 1
              Watchers: 2
