Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 0.7.0
- Fix Version/s: None
- Component/s: None
- Labels: None
Description
There are a few configuration parameters which control the offset at which a consumer starts:
- systems.*.samza.reset.offset (whether to ignore checkpoints on container startup)
- systems.*.samza.offset.default (what to do if there is no checkpoint)
- systems.*.consumer.auto.offset.reset (what to do if the requested offset is out of range of the broker's stream history)
- CheckpointTool isn't a config per se, but is also related to consumer offsets
Although they are all valid, they are not great. The parameter names are obscure (I still can't remember them, even after staring at them for some time), they interact in subtle ways, and they seem designed from the point of view of the framework's internals rather than from the "what is the job trying to accomplish" point of view. Put another way, you need to understand how Samza works internally in order to make sense of them.
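To make the interaction concrete, here is a rough sketch of how I read the three settings combining. It is not Samza's actual code: the function name `starting_offset`, its arguments, and the value names ("oldest"/"upcoming" for samza.offset.default, "smallest"/"largest" for auto.offset.reset) are assumptions for illustration, and the precedence is inferred only from the parenthetical descriptions in the list above.

```python
from typing import Optional


def starting_offset(
    checkpoint: Optional[int],   # last checkpointed offset, or None if there is none
    reset_offset: bool,          # assumed meaning of samza.reset.offset
    offset_default: str,         # assumed values: "oldest" or "upcoming"
    oldest: int,                 # oldest offset the broker still retains
    upcoming: int,               # offset of the next message to be written
    auto_offset_reset: str,      # assumed values: "smallest" or "largest"
) -> int:
    """Hypothetical model of where a consumer ends up reading from."""
    # Step 1: use the checkpoint unless it is missing or explicitly ignored.
    if checkpoint is not None and not reset_offset:
        requested = checkpoint
    else:
        # Step 2: no usable checkpoint, so fall back to the default.
        requested = oldest if offset_default == "oldest" else upcoming

    # Step 3: if the requested offset is outside the broker's retained
    # history, the consumer-level reset policy decides instead.
    if oldest <= requested <= upcoming:
        return requested
    return oldest if auto_offset_reset == "smallest" else upcoming
```

Even in this simplified form, a job author has to reason about three layers to predict one outcome. For example, a stale checkpoint combined with a "largest" reset policy would silently skip to the newest messages, regardless of what samza.offset.default says.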
I don't have an answer for what a better design would look like. This ticket is just a place to discuss how we could make offset-related configuration easier for job authors to understand and use.