KAFKA-2481 (sub-task of KAFKA-2365 Copycat checklist, project Kafka)

Allow copycat sinks to request periodic invocation of put even if no new data is available



    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s:
    • Component/s: KafkaConnect
    • Labels:


      Some connectors need to perform actions periodically (or, more generally, schedule actions in the future). For example, an HDFS connector that rolls files every n minutes needs to get control every n minutes, regardless of available data. However, if no data is flowing into the consumer, we might never invoke put(records). Another variant is a connector whose API, like the new consumer's, requires `poll()` to be invoked regularly.

      In terms of design, I think there are at least two options:
      1. This could be handled via the context, so it is purely opt-in: a task asks to be scheduled for a put() and can specify the exact timeout.
      2. Alternatively, the timeout could be returned by put(), whose return type is currently void. We aren't using a return value right now, but this means every connector has to return something. It is also unclear whether this will always be the only information a task wants to return.

      I think 1 is cleaner and doesn't require connector developers who don't care about the feature to even know about it.
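
      As a rough illustration of option 1, here is a minimal, self-contained sketch. Everything in it is hypothetical and exists only for this example: `SketchSinkContext`, `requestTimeout`, `RollingSinkTask`, and the simulated worker loop in `PeriodicPutSketch` are not the actual Copycat interfaces. It assumes the worker honors the most recent timeout request by calling put() with an empty batch when no data arrives, and it uses a fake clock so the behavior is deterministic.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the sink task context (not the real Copycat API).
interface SketchSinkContext {
    // Ask the framework to invoke put() again within timeoutMs,
    // even if no new records are available.
    void requestTimeout(long timeoutMs);
}

// Hypothetical sink task that "rolls a file" every rollIntervalMs.
class RollingSinkTask {
    private final SketchSinkContext context;
    private final long rollIntervalMs;
    private long lastRollMs;
    final List<Long> rollTimes = new ArrayList<>();

    RollingSinkTask(SketchSinkContext context, long rollIntervalMs, long startMs) {
        if (rollIntervalMs <= 0) {
            throw new IllegalArgumentException("rollIntervalMs must be positive");
        }
        this.context = context;
        this.rollIntervalMs = rollIntervalMs;
        this.lastRollMs = startMs;
        // Opt in: tasks that never call this are unaffected by the feature.
        context.requestTimeout(rollIntervalMs);
    }

    // nowMs is passed in so the sketch can run against a fake clock.
    void put(Collection<String> records, long nowMs) {
        // (Writing the records is omitted; only the scheduling matters here.)
        if (nowMs - lastRollMs >= rollIntervalMs) {
            rollTimes.add(nowMs);
            lastRollMs = nowMs;
        }
        // Ask to be called back when the next roll is due.
        context.requestTimeout(lastRollMs + rollIntervalMs - nowMs);
    }
}

class PeriodicPutSketch {
    // Simulated worker loop in which no data ever arrives: the only
    // reason put() is invoked is the timeout the task requested.
    static List<Long> simulate(long rollIntervalMs, long runForMs) {
        final long[] requested = {-1}; // -1 means "no timeout requested"
        SketchSinkContext ctx = timeoutMs -> requested[0] = timeoutMs;
        long now = 0;
        RollingSinkTask task = new RollingSinkTask(ctx, rollIntervalMs, now);
        while (requested[0] >= 0 && now + requested[0] <= runForMs) {
            now += requested[0];   // advance the fake clock to the deadline
            requested[0] = -1;     // a request is consumed once it fires
            task.put(Collections.emptyList(), now);
        }
        return task.rollTimes;
    }

    public static void main(String[] args) {
        System.out.println(simulate(1000, 3500)); // prints [1000, 2000, 3000]
    }
}
```

      With option 2 the same task would instead return the remaining time from put(), which would force every connector, including those that never need scheduling, to return a value.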




            • Assignee:
              liquanpei Liquan Pei
              ewencp Ewen Cheslack-Postava
              Gwen Shapira
            • Votes:
              0
            • Watchers:
              2


              • Created: