Flink / FLINK-12047

State Processor API (previously named Savepoint Connector) to read / write / process savepoints

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: API / State Processor
    • Labels:
      None

      Description

      This JIRA tracks the ongoing efforts and discussions about a means to read / write / process state in savepoints.

      Two existing works related to this have already been mentioned on the mailing lists:
      1. Bravo [1]
      2. https://github.com/sjwiesman/flink/tree/savepoint-connector

      Essentially, both tools provide a connector to read or write a Flink savepoint, and allow users to query and process the state in the savepoint with Flink's processing APIs.

      We should try to converge these efforts and have a savepoint connector like this in Flink.
      With this connector, users should be able to achieve the following high-level benefits:
      1. Create savepoints using existing data from other systems (e.g. bootstrapping a Flink job's state with data in an external database).
      2. Derive new state using existing state.
      3. Query state in savepoints, for example for debugging purposes.
      4. Migrate the schema of state in savepoints offline, as opposed to the current, more limited approach of online migration on state access.
      5. Change the max parallelism of jobs, or any other kind of otherwise fixed configuration, such as operator uids.
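
To make the listed benefits concrete, here is a minimal toy sketch in Python. It is not Flink's savepoint format or any Flink API: it assumes a savepoint can be modelled as an in-memory mapping from operator uid to keyed state, and every name in it (`bootstrap`, `derive`, `query`, `rename_uid`, the sample rows) is hypothetical and exists only for illustration.

```python
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical toy model: a "savepoint" maps an operator uid to that
# operator's keyed state (key -> value). This is NOT Flink's format or API.
Savepoint = Dict[str, Dict[str, int]]


def bootstrap(uid: str, rows: Iterable[Tuple[str, int]]) -> Savepoint:
    """Benefit 1: create a savepoint from rows taken from an external system."""
    return {uid: dict(rows)}


def derive(sp: Savepoint, src_uid: str, new_uid: str,
           fn: Callable[[int], int]) -> Savepoint:
    """Benefit 2: add new state computed from existing state."""
    out = {uid: dict(state) for uid, state in sp.items()}
    out[new_uid] = {key: fn(value) for key, value in sp[src_uid].items()}
    return out


def query(sp: Savepoint, uid: str,
          predicate: Callable[[int], bool]) -> Dict[str, int]:
    """Benefit 3: inspect state offline, for example while debugging."""
    return {key: value for key, value in sp[uid].items() if predicate(value)}


def rename_uid(sp: Savepoint, old_uid: str, new_uid: str) -> Savepoint:
    """Benefit 5: change an otherwise fixed setting, here an operator uid."""
    out = {uid: dict(state) for uid, state in sp.items() if uid != old_uid}
    out[new_uid] = dict(sp[old_uid])
    return out


# Made-up rows standing in for data read from an external database.
rows_from_database = [("alice", 3), ("bob", 12)]

sp = bootstrap("counter", rows_from_database)
sp = derive(sp, "counter", "doubled", lambda value: value * 2)
large = query(sp, "doubled", lambda value: value > 10)
sp = rename_uid(sp, "counter", "event-counter")
```

Each helper returns a new mapping instead of mutating its input, which mirrors the idea of processing a savepoint offline and writing a new one. A real connector would also have to deal with serializers, state backends, and key groups, none of which this sketch attempts.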

      [1] https://github.com/king/bravo

            People

            • Assignee: Seth Wiesman (sjwiesman)
            • Reporter: Tzu-Li (Gordon) Tai (tzulitai)
            • Votes: 5
            • Watchers: 30

                Time Tracking

                • Original Estimate: Not Specified
                • Remaining Estimate: 0h
                • Time Spent: 1.5h