Apache Storm / STORM-2896

Support automatic migration of offsets from storm-kafka to storm-kafka-client KafkaSpout


    Details

      Description

      I think we can ease migration for people looking to move from storm-kafka to storm-kafka-client. We should be able to support migrating offsets from the old spout by setting some extra configuration in KafkaSpoutConfig, and by adding a new FirstPollOffsetStrategy (e.g. something like FirstPollOffsetStrategy.UNCOMMITTED_MIGRATE_FROM_STORM_KAFKA).

      The old spout stores offsets in Storm's ZooKeeper at one of two paths. To locate them we need two parameters from the storm-kafka SpoutConfig, namely zkRoot and id. We also need to know whether the storm-kafka subscription was a wildcard subscription.

      The ZooKeeper path for commit info is

      zkRoot + "/" + id + "/" + topicName + "partition_" + partition
      

      if the subscription was a wildcard. Otherwise it is

      zkRoot + "/" + id + "/" + "partition_" + partition
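
      The two path forms above can be sketched as a small helper. This is only an illustration: the class and method names are hypothetical and belong to neither storm-kafka nor storm-kafka-client. The wildcard form is copied literally from this description, which shows no separator between topicName and "partition_", so it should be checked against the storm-kafka source before being relied on.

```java
// Hypothetical helper for illustration; not part of storm-kafka or storm-kafka-client.
final class OldSpoutCommitPath {
    private OldSpoutCommitPath() {}

    /**
     * Builds the ZooKeeper path where the old spout is described as storing commit info.
     * zkRoot and id come from the old storm-kafka SpoutConfig.
     * wildcard is true if the old subscription was a wildcard subscription.
     */
    static String commitPath(String zkRoot, String id, String topicName,
                             int partition, boolean wildcard) {
        String base = zkRoot + "/" + id + "/";
        if (wildcard) {
            // Literal transcription of the wildcard form given in the issue text.
            // Assumption to verify: whether a "/" belongs between topicName and "partition_".
            return base + topicName + "partition_" + partition;
        }
        return base + "partition_" + partition;
    }
}
```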
      

      We can get topicName and the partition numbers from KafkaConsumer.assignment(). When we run initialize and the strategy is UNCOMMITTED_MIGRATE_FROM_STORM_KAFKA, we should be able to read the old offset structure from ZooKeeper and seek the consumer to those offsets. If the offsets are not present, we can just crash.
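
      The control flow described above might look roughly like the following sketch. Everything in it is an assumption made for illustration: the class is hypothetical, TopicPartition is a local stand-in for Kafka's class so the example has no external dependencies, and the ZooKeeper read is abstracted behind a function because this issue does not specify how the stored offset structure is parsed.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch; not the storm-kafka-client implementation.
final class OffsetMigrationSketch {
    private OffsetMigrationSketch() {}

    /** Local stand-in for Kafka's TopicPartition, to keep the sketch self-contained. */
    static final class TopicPartition {
        final String topic;
        final int partition;

        TopicPartition(String topic, int partition) {
            this.topic = topic;
            this.partition = partition;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof TopicPartition)) {
                return false;
            }
            TopicPartition other = (TopicPartition) o;
            return topic.equals(other.topic) && partition == other.partition;
        }

        @Override
        public int hashCode() {
            return Objects.hash(topic, partition);
        }

        @Override
        public String toString() {
            return topic + "-" + partition;
        }
    }

    /**
     * Resolves the old offset for every assigned partition.
     * pathFor maps a partition to its old commit path in ZooKeeper.
     * readOffset returns the offset stored at a path, or empty if nothing is there.
     * Throws if any partition has no stored offset, matching the "just crash" behavior.
     */
    static Map<TopicPartition, Long> resolveOldOffsets(
            Set<TopicPartition> assignment,
            Function<TopicPartition, String> pathFor,
            Function<String, Optional<Long>> readOffset) {
        Map<TopicPartition, Long> seekTargets = new LinkedHashMap<>();
        for (TopicPartition tp : assignment) {
            String path = pathFor.apply(tp);
            long offset = readOffset.apply(path).orElseThrow(() ->
                    new IllegalStateException(
                            "No storm-kafka offset found at " + path + " for " + tp));
            seekTargets.put(tp, offset);
        }
        return seekTargets;
    }
}
```

      In the real spout the caller would then seek the consumer to each resolved offset. Resolving every partition before seeking means a missing offset fails the migration before any consumer position has been changed.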

      I'd be okay with this feature not being permanent, but I think it would make it a lot easier for people to move off the old spout.

            People

            • Assignee: Stig Rohde Døssing (srdo)
            • Reporter: Stig Rohde Døssing (srdo)

              Time Tracking

              Original Estimate: Not Specified
              Remaining Estimate: 0h
              Time Spent: 2h 10m
