Apache Storm / STORM-2914

Remove enable.auto.commit support from storm-kafka-client


Details

    Description

      The enable.auto.commit option causes the KafkaConsumer to periodically commit the latest offsets it has returned from poll(). It is convenient for use cases where messages are polled from Kafka and processed synchronously, in a loop.

      Because of https://issues.apache.org/jira/browse/STORM-2913, we'd really like to store some metadata in Kafka when the spout commits. This is not possible with enable.auto.commit. I took a look at what that setting actually does: it just causes the KafkaConsumer to call commitAsync during poll (and during a few other operations, e.g. close and assign) at some interval.
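
      To make the timing concrete, here is a toy model of that behavior. It is not Kafka code: the class AutoCommitModel and its fields are hypothetical, the interval stands in for the auto.commit.interval.ms setting, and the exact point inside poll where the commit fires is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only, not Kafka code. It mimics a consumer that commits its
// latest returned position from inside poll() once an interval has elapsed.
class AutoCommitModel {
    private final long autoCommitIntervalMs; // stands in for auto.commit.interval.ms
    private long position = 0;               // next offset poll() would hand out
    private long lastCommitTimeMs = 0;
    final List<Long> committedOffsets = new ArrayList<>();

    AutoCommitModel(long autoCommitIntervalMs) {
        this.autoCommitIntervalMs = autoCommitIntervalMs;
    }

    // Returns the offsets of the "records" handed to the caller.
    List<Long> poll(long nowMs, int batchSize) {
        // Assumption: the commit runs before new records are returned, so it
        // covers everything earlier polls returned, processed or not.
        if (nowMs - lastCommitTimeMs >= autoCommitIntervalMs) {
            committedOffsets.add(position); // offset only, no place for metadata
            lastCommitTimeMs = nowMs;
        }
        List<Long> batch = new ArrayList<>();
        for (int i = 0; i < batchSize; i++) {
            batch.add(position++);
        }
        return batch;
    }
}
```

      The caller never sees the commit happen and has no hook to attach anything to it, which is the limitation described above.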

      Ideally I'd like to get rid of ProcessingGuarantee.NONE, since I think ProcessingGuarantee.AT_MOST_ONCE covers the same use cases, and is likely almost as fast. The primary difference between them is that AT_MOST_ONCE commits synchronously.

      If we really want to keep ProcessingGuarantee.NONE, I think we should make our ProcessingGuarantee.NONE setting cause the spout to call commitAsync after poll, and never use the enable.auto.commit option. This allows us to include metadata in the commit.
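
      A rough sketch of what commit-after-poll with metadata could look like, written as a dependency-free model rather than actual spout code. Everything named here (CommitAfterPollModel, OffsetWithMetadata, offsetsToCommit, the String partition keys) is hypothetical. In the real client, the payload would presumably be a Map<TopicPartition, OffsetAndMetadata> built with OffsetAndMetadata(long offset, String metadata) and passed to KafkaConsumer.commitAsync(offsets, callback).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only, not Kafka or Storm code. All names are hypothetical.
class CommitAfterPollModel {
    // Stand-in for Kafka's OffsetAndMetadata(long offset, String metadata).
    static final class OffsetWithMetadata {
        final long offset;
        final String metadata;

        OffsetWithMetadata(long offset, String metadata) {
            this.offset = offset;
            this.metadata = metadata;
        }
    }

    // Builds the commit payload for one poll result: for each partition that
    // returned records, the offset after the last polled record, together
    // with metadata supplied by the spout.
    static Map<String, OffsetWithMetadata> offsetsToCommit(
            Map<String, List<Long>> polledOffsetsByPartition, String metadata) {
        Map<String, OffsetWithMetadata> commit = new HashMap<>();
        for (Map.Entry<String, List<Long>> entry : polledOffsetsByPartition.entrySet()) {
            List<Long> offsets = entry.getValue();
            if (offsets.isEmpty()) {
                continue; // nothing was polled from this partition
            }
            long lastPolled = offsets.get(offsets.size() - 1);
            // Kafka convention: the committed offset is the next one to read.
            commit.put(entry.getKey(), new OffsetWithMetadata(lastPolled + 1, metadata));
        }
        return commit;
    }
}
```

      Because the spout builds the payload itself, the commit stays asynchronous like auto commit, but the metadata string travels with every committed offset.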


          People

            srdo Stig Rohde Døssing
            Votes: 2
            Watchers: 6


              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 4h 40m
