KAFKA-170 (sub-task of KAFKA-364, Consumer re-design)

Support for non-blocking polling on multiple streams

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: 0.8.0
    • Fix Version/s: None
    • Component/s: core

    Description

      Currently we provide a blocking iterator in the consumer. This is a good mechanism for consuming data from a single topic, but is limited as a mechanism for polling multiple streams.

      For example, if one wants to implement a non-blocking union across multiple streams, this is hard to do because a call on any one stream may block indefinitely. A similar situation arises when trying to implement a streaming join between two streams.

      I would propose two changes:
      1. Implement a next(timeout) interface on KafkaMessageStream. This handles certain limited cases nicely and is easy to implement with minimal change, but doesn't actually cover the two cases above.
      2. Add an interface to poll streams.
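
      As a rough sketch of the first change: assuming the stream is backed by an in-memory queue that a fetcher thread fills (an assumption for illustration, not something this issue specifies), next(timeout) could delegate to a timed queue poll. TimedMessageStream and enqueue are hypothetical names, not the real KafkaMessageStream API.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for KafkaMessageStream; not the real Kafka class.
class TimedMessageStream<T> {
    // Assumed: a background fetcher thread fills this queue.
    private final BlockingQueue<T> queue = new LinkedBlockingQueue<>();

    // Called by the (assumed) fetcher thread when a message arrives.
    void enqueue(T message) {
        queue.add(message);
    }

    // Proposed next(timeout): wait at most `timeout` for a message.
    // Returns null if nothing arrived in time or the thread was interrupted.
    T next(long timeout, TimeUnit unit) {
        try {
            return queue.poll(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return null;
        }
    }
}
```

      A caller could loop over several such streams with a short timeout, but each idle stream still costs up to one timeout per pass, which fits the remark that this change alone does not really solve the union or join cases.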

      I don't know the best approach for the latter API, but it is important to get it right. One option would be to add a ConsumerConnector.drainTopics("topic1", "topic2", ...) method, which blocks until there is at least one message on any of the named topics and then returns a list of triples (topic, partition, message).
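
      One possible shape for drainTopics, sketched under the assumption that fetched messages are buffered per topic and guarded by a single monitor. DrainingConnector, TopicMessage, and deliver are hypothetical names used only for illustration; none of them is part of the actual ConsumerConnector.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical triple (topic, partition, message) as described in the issue.
final class TopicMessage {
    final String topic;
    final int partition;
    final String message;

    TopicMessage(String topic, int partition, String message) {
        this.topic = topic;
        this.partition = partition;
        this.message = message;
    }
}

// Hypothetical stand-in for ConsumerConnector; not the real Kafka class.
class DrainingConnector {
    // Assumed: one in-memory buffer per topic, filled by fetcher threads.
    private final Map<String, Queue<TopicMessage>> buffers = new HashMap<>();

    // Called by the (assumed) fetcher threads; wakes any blocked drainTopics call.
    synchronized void deliver(String topic, int partition, String message) {
        buffers.computeIfAbsent(topic, t -> new ArrayDeque<>())
               .add(new TopicMessage(topic, partition, message));
        notifyAll();
    }

    // Blocks until at least one of the named topics has a message, then
    // returns everything buffered for those topics, in argument order.
    // Returns an empty list only if the waiting thread is interrupted.
    synchronized List<TopicMessage> drainTopics(String... topics) {
        List<TopicMessage> drained = new ArrayList<>();
        while (true) {
            for (String topic : topics) {
                Queue<TopicMessage> buffer = buffers.get(topic);
                if (buffer != null) {
                    drained.addAll(buffer);
                    buffer.clear();
                }
            }
            if (!drained.isEmpty()) {
                return drained;
            }
            try {
                wait(); // releases the monitor until deliver() signals
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return drained;
            }
        }
    }
}
```

      Because the caller blocks on one monitor rather than on any single stream, a union or join loop built on this sketch wakes as soon as any requested topic has data.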

          People

            Assignee: Unassigned
            Reporter: Jay Kreps (jkreps)
            Votes: 3
            Watchers: 4
