Type: New Feature
Affects Version/s: 0.10.1.0
Fix Version/s: None
currently Kafka allows consumers to either chose the low level or the high level API, depending on the specific requirements of the consumer implementation. However, I was told that the current low level API (SimpleConsumer) will be deprecated once the new Kafka 0.9 consumer APIs are available.
In this case it would be good, if we can ensure that the new API does offer some ways to get similar performance for use cases which perfectly fit the old SimpleConsumer API approach.
Example Use Case:
A high throughput HTTP API wrapper for consumer requests which gets HTTP REST calls to retrieve data for a specific set of topic partitions and offsets.
Here the SimpleConsumer is perfect because it allows connection pooling in the HTTP API web application with one pool per existing kafka broker and the web application can handle the required metadata managment to know which pool to fetch a connection for, for each used topic partition. This means connections to Kafka brokers can remain open/pooled and connection/reconnection and metadata overhead is minimized.
To achieve something similar with the new Kafka 0.9 consumer APIs, it would be good if it could:
- provide a lowlevel call to connect to a specific broker and to read data from a topic+partition+offset
- ensure that subscribe/unsubscribe calls are very cheap and can run without requiring any network traffic. If I subscribe to a topic partition for which the same broker is the leader as the last topic partition which was in use for this consumer API connection, then the consumer API implementation should recognize this and should not do any disconnects/reconnects and just reuse the existing connection to that kafka broker.
Or put differently, it should be possible to do external metadata handling in the consumer API client and the client should be able to pool consumer API connections effectively by having one pool per Kafka broker.