Details

    Description

      Currently, the changelog state backend does not support local recovery, so recovery times may be sub-optimal.

       

      Materialized state issues:

      Periodic materialization currently calls the state backend's snapshot method with a materialization ID. However, local state management relies on the checkpoint ID for storing, confirming, and discarding local state. This mismatch between the two IDs breaks local recovery.
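
      One possible way to bridge the two IDs is to record which materialization each checkpoint references, so that confirmations arriving by checkpoint ID can be translated into cleanup by materialization ID. The sketch below is illustrative only: `LocalMaterializationTracker` and all of its methods are hypothetical names, not Flink APIs, and it assumes that materialization IDs increase monotonically.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not Flink code: maps checkpoint IDs to materialization IDs.
class LocalMaterializationTracker {
    // checkpoint ID -> ID of the materialization that the checkpoint references
    private final Map<Long, Long> checkpointToMaterialization = new HashMap<>();
    // materialization IDs whose local copies are currently kept
    private final Set<Long> storedMaterializations = new HashSet<>();

    // Called when a periodic materialization writes its local copy.
    void storeMaterialization(long materializationId) {
        storedMaterializations.add(materializationId);
    }

    // Called when a checkpoint is taken on top of a given materialization.
    void registerCheckpoint(long checkpointId, long materializationId) {
        checkpointToMaterialization.put(checkpointId, materializationId);
    }

    // Called when a checkpoint is confirmed: older checkpoints are forgotten and
    // materializations older than the confirmed one are dropped
    // (assumes monotonically increasing materialization IDs).
    void confirmCheckpoint(long confirmedCheckpointId) {
        Long confirmedMaterialization = checkpointToMaterialization.get(confirmedCheckpointId);
        if (confirmedMaterialization == null) {
            return; // unknown checkpoint, nothing to clean up
        }
        checkpointToMaterialization.keySet().removeIf(id -> id < confirmedCheckpointId);
        storedMaterializations.removeIf(id -> id < confirmedMaterialization);
    }

    boolean isStored(long materializationId) {
        return storedMaterializations.contains(materializationId);
    }
}
```

      With this mapping, a materialization stays available locally for as long as the latest confirmed checkpoint still builds on it.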

       

      Non-materialized state issues:

      • non-materialized state (i.e. the changelog) is shared across checkpoints and therefore needs some form of tracking (in the TM, or via hard links in the FS)
      • the writer does not enforce a boundary between checkpoints when writing to DFS; if the local stream simply duplicates the DFS stream, cleanup of one checkpoint can remove data that another checkpoint still needs
      • files can be shared across tasks, which breaks cleanup in the same way
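
      The TM-side tracking mentioned in the list above could take the form of reference counting per local file, so that a file is deleted only once no task and no checkpoint refers to it. This is a minimal sketch under that assumption; `LocalChangelogFileRegistry`, `register`, and `release` are hypothetical names and do not correspond to existing Flink classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not Flink code: tracks which (task, checkpoint) pairs use a local file.
class LocalChangelogFileRegistry {
    // file path -> users ("task@checkpointId") that still reference the file
    private final Map<String, Set<String>> users = new HashMap<>();

    // Records that the given task's checkpoint references the file.
    void register(String file, String task, long checkpointId) {
        users.computeIfAbsent(file, f -> new HashSet<>()).add(task + "@" + checkpointId);
    }

    // Drops all references held by the given task's checkpoint and returns the
    // files that are no longer referenced by anyone, which the caller may delete.
    List<String> release(String task, long checkpointId) {
        String user = task + "@" + checkpointId;
        List<String> deletable = new ArrayList<>();
        users.entrySet().removeIf(entry -> {
            entry.getValue().remove(user);
            if (entry.getValue().isEmpty()) {
                deletable.add(entry.getKey());
                return true; // forget the file once it has no users left
            }
            return false;
        });
        return deletable;
    }
}
```

      Hard-linking in the file system would achieve a similar effect by letting the FS keep the count, at the cost of depending on hard-link support in the local storage.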

            People

              Yanfei Lei
              yunta Yun Tang