  1. Samza
  2. SAMZA-2657

Blob store backed state backup and restore


Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Implemented
    • Affects Version/s: None
    • Fix Version/s: 1.7
    • Component/s: None
    • Labels: None

    Description

      Problem:

      At LinkedIn, we noticed that jobs with large state take a long time (on the order of hours) to restore from a Kafka-based changelog.

      Solution:

      We propose blob store based backup and restore for stateful jobs. The advantage of such a system is the ability to back up and restore state in parallel, rather than one message at a time as with a Kafka-based changelog. We implement a pluggable system in which any blob store that supports PUT/GET/DELETE APIs can be plugged in as the backend for Samza state backup and restore.

      Note:

      At this time, only a general blob store interface is provided; users and the community are expected to implement the details specific to each blob store.
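
      The pluggable design described under Solution might look roughly like the sketch below. It is a minimal illustration, not the interface shipped with Samza: the names `BlobStore`, `InMemoryBlobStore`, and `ParallelStateTransfer` are hypothetical, and an in-memory map stands in for a real blob store backend. The point it shows is that each state file is uploaded or downloaded as an independent asynchronous PUT or GET, so transfers can overlap instead of replaying one changelog message at a time.

```java
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

// Hypothetical interface for illustration only; not Samza's actual API.
// Any backend exposing PUT/GET/DELETE could implement it.
interface BlobStore {
    CompletableFuture<String> put(byte[] data);     // completes with the new blob id
    CompletableFuture<byte[]> get(String blobId);   // completes with the blob contents
    CompletableFuture<Void> delete(String blobId);  // completes when the blob is gone
}

// In-memory stand-in for a real blob store (assumption, for demonstration).
class InMemoryBlobStore implements BlobStore {
    private final Map<String, byte[]> blobs = new ConcurrentHashMap<>();

    @Override
    public CompletableFuture<String> put(byte[] data) {
        return CompletableFuture.supplyAsync(() -> {
            String id = UUID.randomUUID().toString();
            blobs.put(id, data.clone());
            return id;
        });
    }

    @Override
    public CompletableFuture<byte[]> get(String blobId) {
        return CompletableFuture.supplyAsync(() -> {
            byte[] data = blobs.get(blobId);
            if (data == null) {
                throw new IllegalArgumentException("Unknown blob: " + blobId);
            }
            return data.clone();
        });
    }

    @Override
    public CompletableFuture<Void> delete(String blobId) {
        return CompletableFuture.runAsync(() -> blobs.remove(blobId));
    }
}

// Hypothetical helper: starts every transfer first, then waits for all of them.
class ParallelStateTransfer {
    // Uploads all state files concurrently; returns blob ids in input order.
    static List<String> backup(BlobStore store, List<byte[]> stateFiles) {
        List<CompletableFuture<String>> uploads =
            stateFiles.stream().map(store::put).collect(Collectors.toList());
        return uploads.stream().map(CompletableFuture::join).collect(Collectors.toList());
    }

    // Downloads all blobs concurrently; returns contents in id order.
    static List<byte[]> restore(BlobStore store, List<String> blobIds) {
        List<CompletableFuture<byte[]>> downloads =
            blobIds.stream().map(store::get).collect(Collectors.toList());
        return downloads.stream().map(CompletableFuture::join).collect(Collectors.toList());
    }
}
```

      In this sketch the futures are collected into a list before any `join()` call, so all transfers are in flight before the caller blocks. A real backend would replace `InMemoryBlobStore` with calls to its own PUT/GET/DELETE endpoints.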

          People

            Assignee: shekhars-li Shekhar Sharma
            Reporter: shekhars-li Shekhar Sharma
            Votes: 0
            Watchers: 1

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 3.5h