[SPARK-6249] Get Kafka offsets from consumer group in ZK when using direct stream - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Won't Fix
Affects Version/s: None
Fix Version/s: None
Component/s: DStreams
Labels:
None

Target Version/s:

1.4.0

Description

This is the proposal.

The simpler direct API (the one that does not take explicit offsets) can be modified to also pick up the initial offset from ZK if group.id is specified. This is exactly similar to how we find the latest or earliest offset in that API, just that instead of latest/earliest offset of the topic we want to find the offset from the consumer group. The group offsets is ZK is not used at all for any further processing and restarting, so the exactly-once semantics is not broken.

The use case where this is useful is simplified code upgrade. If the user wants to upgrade the code, he/she can the context stop gracefully which will ensure the ZK consumer group offset will be updated with the last offsets processed. Then the new code is started (not restarted from checkpoint) can pickup the consumer group offset from ZK and continue where the previous code had left off.

Without the functionality of picking up consumer group offsets to start (that is, currently) the only way to do this is for the users to save the offsets somewhere (file, database, etc.) and manage the offsets themselves. I just want to simplify this process.

Attachments

Issue Links

is cloned by

SPARK-9947 Separate Metadata and State Checkpoint Data

Closed

is duplicated by

SPARK-9434 Need how-to for resuming direct Kafka streaming consumers where they had left off before getting terminated, OR actual support for that mode in the Streaming API

Resolved

SPARK-8833 Kafka Direct API support offset in zookeeper

Resolved

Activity

People

Assignee:: Unassigned

Reporter:: Tathagata Das

Votes:: 0 Vote for this issue

Watchers:: 3 Start watching this issue

Dates

Created:: 10/Mar/15 18:48

Updated:: 14/Aug/15 15:13

Resolved:: 30/Apr/15 08:08