  1. Kafka
  2. KAFKA-364 Consumer re-design
  3. KAFKA-170

Support for non-blocking polling on multiple streams

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: 0.8.0
    • Fix Version/s: None
    • Component/s: core
    • Labels:

      Description

      Currently we provide a blocking iterator in the consumer. This is a good mechanism for consuming data from a single topic, but is limited as a mechanism for polling multiple streams.

      For example, implementing a non-blocking union across multiple streams is hard because a call on any one stream may block indefinitely. A similar situation arises when implementing a streaming join between two streams.

      I would propose two changes:
      1. Implement a next(timeout) interface on KafkaMessageStream. This is easy to implement and handles some simple cases with minimal change, but it doesn't actually cover the two cases above.
      2. Add an interface to poll streams.
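
As a rough illustration of the first change, here is a minimal sketch of a timed next call. It assumes the stream is backed by an in-memory queue that a fetcher thread fills; the class TimedStream and its offer method are hypothetical stand-ins, not the real KafkaMessageStream, and the timeout is taken in milliseconds purely for simplicity.

```java
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a message stream; not the real KafkaMessageStream.
class TimedStream<T> {
    private final BlockingQueue<T> queue = new LinkedBlockingQueue<>();

    // Called by an (assumed) fetcher thread to hand a message to the stream.
    void offer(T message) {
        queue.add(message);
    }

    // Waits at most timeoutMs for a message. An empty result means nothing
    // arrived in time (or the waiting thread was interrupted).
    Optional<T> next(long timeoutMs) {
        try {
            return Optional.ofNullable(queue.poll(timeoutMs, TimeUnit.MILLISECONDS));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return Optional.empty();
        }
    }
}
```

A union built on this would still have to visit each stream in turn with a short timeout, trading latency against busy polling, which is consistent with the point that this change alone does not cover the union and join cases.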

      I don't know the best approach for the latter API, but it is important to get it right. One option would be to add a ConsumerConnector.drainTopics("topic1", "topic2", ...) method that blocks until there is at least one message and then returns a list of triples (topic, partition, message).
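
To make the drainTopics option concrete, the following is a sketch under stated assumptions, not a proposed implementation. DrainSketch, Triple, and deliver are hypothetical names; messages are plain strings, and per-topic buffers guarded by a single monitor stand in for whatever the consumer would really use.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch of the drainTopics idea; not part of ConsumerConnector.
class DrainSketch {
    // The (topic, partition, message) triple described in the proposal.
    static final class Triple {
        final String topic;
        final int partition;
        final String message;

        Triple(String topic, int partition, String message) {
            this.topic = topic;
            this.partition = partition;
            this.message = message;
        }
    }

    private final Map<String, Queue<Triple>> buffers = new HashMap<>();

    // Called by (assumed) fetcher threads when a message arrives.
    synchronized void deliver(String topic, int partition, String message) {
        buffers.computeIfAbsent(topic, t -> new ArrayDeque<>())
               .add(new Triple(topic, partition, message));
        notifyAll(); // wake any caller blocked in drainTopics
    }

    // Blocks until at least one message is buffered for the requested topics,
    // then returns everything currently buffered for them.
    // Returns an empty list only if the waiting thread is interrupted.
    synchronized List<Triple> drainTopics(String... topics) {
        List<Triple> out = new ArrayList<>();
        while (true) {
            for (String topic : topics) {
                Queue<Triple> q = buffers.get(topic);
                while (q != null && !q.isEmpty()) {
                    out.add(q.remove());
                }
            }
            if (!out.isEmpty()) {
                return out;
            }
            try {
                wait(); // releases the monitor until deliver() signals
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return out;
            }
        }
    }
}
```

Because a single call waits on all requested topics at once, a union or join loop never blocks on one stream while another has data, which is the gap a per-stream timeout leaves open.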

            People

            • Assignee:
              Unassigned
              Reporter:
              jkreps Jay Kreps
            • Votes:
              3
              Watchers:
              4