Details
Type: New Feature
Status: Resolved
Priority: Major
Resolution: Implemented
Description
Problem:
At LinkedIn, we noticed that jobs with large state take a long time (on the order of hours) to restore from a Kafka-based changelog.
Solution:
We propose a blob-store-based backup and restore mechanism for stateful jobs. The advantage of such a system is that state can be backed up and restored in parallel, rather than one message at a time as with a Kafka-based changelog. We implement a pluggable system in which any blob store that supports PUT/GET/DELETE APIs can easily be plugged in as the backend for Samza state backup and restore.
Note:
At this time, a general interface for blob stores is provided so that users and the community can implement the details specific to each blob store.
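
To illustrate the idea, here is a minimal Java sketch of what a pluggable PUT/GET/DELETE interface and a parallel backup and restore on top of it could look like. All names here (`BlobStoreSketch`, `BlobStore`, `InMemoryBlobStore`, `backup`, `restore`) are hypothetical and chosen for illustration; this is not the interface that ships with Samza, and the in-memory map only stands in for a real blob store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

public class BlobStoreSketch {

    /** Hypothetical pluggable interface: any store offering PUT/GET/DELETE could implement it. */
    public interface BlobStore {
        CompletableFuture<Void> put(String key, byte[] data);
        CompletableFuture<byte[]> get(String key);
        CompletableFuture<Void> delete(String key);
    }

    /** In-memory stand-in for a real blob store (assumption, for illustration only). */
    public static class InMemoryBlobStore implements BlobStore {
        private final Map<String, byte[]> blobs = new ConcurrentHashMap<>();

        @Override
        public CompletableFuture<Void> put(String key, byte[] data) {
            return CompletableFuture.runAsync(() -> blobs.put(key, data.clone()));
        }

        @Override
        public CompletableFuture<byte[]> get(String key) {
            return CompletableFuture.supplyAsync(() -> {
                byte[] data = blobs.get(key);
                if (data == null) {
                    throw new IllegalStateException("No blob for key: " + key);
                }
                return data.clone();
            });
        }

        @Override
        public CompletableFuture<Void> delete(String key) {
            return CompletableFuture.runAsync(() -> blobs.remove(key));
        }
    }

    /** Starts one upload per state file, then waits for all of them to finish. */
    public static void backup(BlobStore store, Map<String, byte[]> stateFiles) {
        List<CompletableFuture<Void>> uploads = new ArrayList<>();
        stateFiles.forEach((name, bytes) -> uploads.add(store.put(name, bytes)));
        CompletableFuture.allOf(uploads.toArray(new CompletableFuture[0])).join();
    }

    /** Starts one download per file name, then returns the contents keyed by name. */
    public static Map<String, byte[]> restore(BlobStore store, List<String> names) {
        Map<String, byte[]> restored = new ConcurrentHashMap<>();
        List<CompletableFuture<Void>> downloads = new ArrayList<>();
        for (String name : names) {
            downloads.add(store.get(name).thenAccept(bytes -> restored.put(name, bytes)));
        }
        CompletableFuture.allOf(downloads.toArray(new CompletableFuture[0])).join();
        return restored;
    }
}
```

In this sketch every file transfer is an independent asynchronous call, so uploads and downloads can overlap instead of being replayed one message at a time. A backend for a specific blob store would only need to supply its own implementation of the three interface methods.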