Currently we provide a blocking iterator in the consumer. This is a good mechanism for consuming data from a single topic, but is limited as a mechanism for polling multiple streams.
For example, if one wants to implement a non-blocking union across multiple streams, this is hard to do because calls may block indefinitely. A similar situation arises when trying to implement a streaming join between two streams.
I would propose two changes:
1. Implement a next(timeout) interface on KafkaMessageStream. This handles some simple cases nicely and is easy to implement with minimal change, but doesn't actually cover the two cases above.
2. Add an interface to poll streams.
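To make the first change concrete, here is a rough sketch of what a next(timeout) call could look like. This is not the real KafkaMessageStream: TimedStream, enqueue, and unionOnce are hypothetical names, and I am assuming the stream is backed by an in-memory blocking queue. The round-robin helper shows the kind of union one could build on top of it. Every idle stream can cost up to one full timeout per pass, which is one reason this only approximates a true non-blocking union.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for KafkaMessageStream (assumption, not the real class).
class TimedStream<T> {
    private final BlockingQueue<T> queue = new LinkedBlockingQueue<>();

    // Called by the (hypothetical) fetcher side to hand a message to the stream.
    void enqueue(T message) {
        queue.add(message);
    }

    // Proposed next(timeout): returns the next message, or null if none
    // arrives within the timeout or the calling thread is interrupted.
    T next(long timeout, TimeUnit unit) {
        try {
            return queue.poll(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return null;
        }
    }
}

class UnionExample {
    // One round-robin pass: ask each stream for at most one message,
    // waiting up to timeoutMs on each stream that is currently empty.
    static <T> List<T> unionOnce(List<TimedStream<T>> streams, long timeoutMs) {
        List<T> out = new ArrayList<>();
        for (TimedStream<T> stream : streams) {
            T message = stream.next(timeoutMs, TimeUnit.MILLISECONDS);
            if (message != null) {
                out.add(message);
            }
        }
        return out;
    }
}
```

A caller would loop over unionOnce and trade latency against CPU by tuning the timeout, which is the compromise a real polling interface would avoid.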
I don't know the best approach for the second API, but it is important to get it right. One option would be to add a ConsumerConnector.drainTopics("topic1", "topic2", ...) which blocks until there is at least one message and then returns a list of triples (topic, partition, message).
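A minimal sketch of the drainTopics semantics described above, to make the blocking behavior concrete. Everything here is an assumption for illustration: DrainingConnector, TopicMessage, and deliver are hypothetical names, messages are plain strings, and per-topic in-memory buffers stand in for whatever the real connector would use. It is not meant as the actual ConsumerConnector design.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical (topic, partition, message) triple.
final class TopicMessage {
    final String topic;
    final int partition;
    final String message;

    TopicMessage(String topic, int partition, String message) {
        this.topic = topic;
        this.partition = partition;
        this.message = message;
    }
}

// Hypothetical connector; only the drain logic is sketched.
class DrainingConnector {
    private final Map<String, Queue<TopicMessage>> buffers = new HashMap<>();

    // Called by the (hypothetical) fetcher side when a message arrives.
    synchronized void deliver(String topic, int partition, String message) {
        buffers.computeIfAbsent(topic, t -> new ArrayDeque<>())
               .add(new TopicMessage(topic, partition, message));
        notifyAll(); // wake any caller blocked in drainTopics
    }

    // Blocks until at least one message is buffered for any requested topic,
    // then returns everything buffered for those topics. Returns an empty
    // list only if the calling thread is interrupted while waiting.
    synchronized List<TopicMessage> drainTopics(String... topics) {
        List<TopicMessage> out = new ArrayList<>();
        while (true) {
            for (String topic : topics) {
                Queue<TopicMessage> buffer = buffers.get(topic);
                if (buffer != null) {
                    out.addAll(buffer);
                    buffer.clear();
                }
            }
            if (!out.isEmpty()) {
                return out;
            }
            try {
                wait(); // releases the lock until deliver() calls notifyAll()
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return out;
            }
        }
    }
}
```

With something like this, a union is a loop over drainTopics, and a join can buffer the returned triples by topic. Open questions such as a maximum batch size or a timeout variant are left out of the sketch.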