Kafka / KAFKA-7526

Allow for not throwing away prefetched data of paused partitions


Details

    • Type: Improvement
    • Status: Open
    • Priority: Trivial
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: consumer
    • Labels: None

    Description

      The Kafka consumer pipelines the fetching of data in order to maximise performance: whenever poll(Duration)/poll(long) is called, another fetch is issued before any results are returned. Although this benefits performance, in some circumstances, when combined with the pause/resume API, the optimisation can result in a considerable amount of duplicate data being transferred over the wire. This happens because, whenever poll is called, any prefetched data for a paused topic-partition is thrown away. To illustrate the effect with a simple example, imagine that a single KafkaConsumer instance is assigned two topic partitions, TP1 and TP2. Since the client interested in TP1 cannot handle records as fast as the one interested in TP2, we resort to pausing TP1 whenever we are not interested in receiving records for it. This results in the following behavior:

      1. TP1 is resumed and poll is called, returning some data for TP1.
      2. The consumer issues a fetch request in order to prefetch the next batch of records for TP1.
      3. TP2 is resumed and TP1 is paused (as the consumer of TP1 is not ready for more records).
      4. On the next poll, all prefetched records for TP1 are thrown away and have to be fetched again later.
      5. This cycle repeats indefinitely.

      This KIP proposes an improvement that lets us control whether prefetched data for a paused partition is thrown away or, instead, simply returned along with the rest of the records coming from partitions that are not paused.
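
      To make the cost of the discard cycle concrete, here is a minimal toy model of it. It is a sketch under stated assumptions, not the real client: ToyConsumer, the returnPrefetchedForPaused switch, the fixed batch size of 10 and the strict alternation between TP1 and TP2 are all hypothetical and exist only for illustration. The model merely counts how many records cross the simulated wire, how many reach the application and how many are thrown away.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class PrefetchSimulation {

    /** Counters collected by one simulated run. */
    static final class Stats {
        long transferred; // records sent over the simulated wire
        long delivered;   // records handed to the application
        long discarded;   // prefetched records thrown away (to be fetched again later)
    }

    /** Toy stand-in for a consumer. It is NOT the real KafkaConsumer. */
    static final class ToyConsumer {
        // Hypothetical switch modelling the behavior proposed in this issue.
        private final boolean returnPrefetchedForPaused;
        private final int batchSize;
        private final Map<String, Integer> buffered = new HashMap<>(); // prefetched record count per partition
        private final Set<String> paused = new HashSet<>();
        final Stats stats = new Stats();

        ToyConsumer(boolean returnPrefetchedForPaused, int batchSize) {
            this.returnPrefetchedForPaused = returnPrefetchedForPaused;
            this.batchSize = batchSize;
        }

        void pause(String tp) { paused.add(tp); }

        void resume(String tp) { paused.remove(tp); }

        /** Simulates one poll over all assigned partitions. */
        void poll(List<String> assignment) {
            for (String tp : assignment) {
                int prefetched = buffered.getOrDefault(tp, 0);
                buffered.put(tp, 0);
                if (paused.contains(tp)) {
                    if (returnPrefetchedForPaused) {
                        stats.delivered += prefetched; // proposed: hand over what is already here
                    } else {
                        stats.discarded += prefetched; // current: throw it away
                    }
                    continue; // no fetch is issued for a paused partition
                }
                if (prefetched == 0) {
                    prefetched = fetch(); // nothing buffered, so fetch before returning
                }
                stats.delivered += prefetched;
                buffered.put(tp, fetch()); // pipelining: prefetch the next batch
            }
        }

        private int fetch() {
            stats.transferred += batchSize;
            return batchSize;
        }
    }

    /** Alternates between TP1 and TP2, pausing whichever partition is not active. */
    static Stats run(boolean returnPrefetchedForPaused, int rounds) {
        ToyConsumer consumer = new ToyConsumer(returnPrefetchedForPaused, 10);
        List<String> assignment = Arrays.asList("TP1", "TP2");
        for (int round = 0; round < rounds; round++) {
            consumer.resume(assignment.get(round % 2));
            consumer.pause(assignment.get((round + 1) % 2));
            consumer.poll(assignment);
        }
        return consumer.stats;
    }

    public static void main(String[] args) {
        for (boolean keep : new boolean[] {false, true}) {
            Stats s = run(keep, 4);
            System.out.println("returnPrefetchedForPaused=" + keep
                    + " transferred=" + s.transferred
                    + " delivered=" + s.delivered
                    + " discarded=" + s.discarded);
        }
    }
}
```

      In this toy model, four rounds transfer 80 records in both modes. With the switch off, 30 of them are discarded and only 40 reach the application; with the switch on, nothing is discarded and 70 are delivered. The real client's fetch sizes and timing will differ, so the numbers only illustrate the shape of the problem.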

          People

            Assignee: Unassigned
            Reporter: Zahari Dichev (zaharidichev)
            Votes: 0
            Watchers: 3