Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
2.0.0, 1.2.0
Description
I think we can ease migration for people looking to move from storm-kafka to storm-kafka-client. We should be able to support migrating offsets from the old spout by setting some extra configuration in KafkaSpoutConfig, and by adding a new FirstPollOffsetStrategy (e.g. something like FirstPollOffsetStrategy.UNCOMMITTED_MIGRATE_FROM_STORM_KAFKA).
The old spout stores offsets in Storm's Zookeeper at one of two paths. The storm-kafka SpoutConfig has two parameters we'll also need, namely zkRoot and id. In addition we need to know if the storm-kafka subscription was a wildcard subscription or not.
The zookeeper path for commit info is
zkRoot + "/" + id + "/" + topicName + "partition_" + partition
if the subscription was a wildcard. Otherwise it is
zkRoot + "/" + id + "/" + "partition_" + partition
We can get topicName and partition numbers from the KafkaConsumer.assignment. When we run initialize, we should be able to read the old offset structure from Zookeeper when the strategy is UNCOMMITTED_MIGRATE_FROM_STORM_KAFKA, and seek the consumer to those offsets. We can just crash if the offsets are not present.
I'd be okay with this feature not being permanent, but I think this feature would make it a lot easier for people to move off the old spout.