[SAMZA-252] Document stream reprocessing - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 0.6.0
Fix Version/s: 0.7.0
Component/s: docs
Labels: None

Description

A common need in stream processing is to re-process earlier messages at some later date. For example, a stream processing job might classify messages using a machine learning algorithm. At some point, the algorithm will be updated with a more accurate vector of weights. When this happens, you usually want to re-process past messages to get more accurate results. This is usually solved by running a parallel pipeline in Hadoop.

We have thought extensively about this use case, and should document how to use Samza for re-processing.

Attachments

SAMZA-252.1.patch (01/May/14 22:14, 9 kB, Martin Kleppmann)
SAMZA-252.2.patch (13/Jun/14 17:45, 15 kB, Martin Kleppmann)

People

Assignee: Martin Kleppmann

Reporter: Chris Riccomini

Votes: 0

Watchers: 4

Dates

Created: 28/Apr/14 16:33

Updated: 13/Jun/14 22:22

Resolved: 13/Jun/14 22:22