  1. Flink
  2. FLINK-6537 Umbrella issue for fixes to incremental snapshots
  3. FLINK-6533

Duplicated registration of new shared state when checkpoint confirmations are still pending

    Details

      Description

      Each incremental RocksDB checkpoint n registers its new and existing shared state with the SharedStateRegistry when it completes. Only then is the backend notified, after which all following checkpoints (n+x) can reference the new state from checkpoint n.

      However, when checkpoint n+1 starts before n has been confirmed to the backend, n+1 can treat some files as new even though they were already contained in n. It will upload those files to DFS again, creating new state handles.

      Then, once n+1 completes, it can try to register some state as new that was already registered by n, without n+1 knowing about this. Currently this violates a precondition check that the reference count for state assumed to be new is 1.

      While we cannot prevent duplicate uploads, we must resolve this situation in the SharedStateRegistry.
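
One possible way to resolve the situation inside the registry is sketched below. This is a minimal illustration, not Flink's actual SharedStateRegistry: the class `SketchSharedStateRegistry`, its `Handle` type, the `register` and `referenceCount` methods, and the file paths in the usage note are all hypothetical names chosen for this example. The idea is that registration no longer insists on a reference count of 1 for state assumed to be new. If the key is already known, the registry keeps the first handle, discards the duplicate upload, and returns the handle the caller should reference from then on.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified registry; not Flink's actual SharedStateRegistry API. */
class SketchSharedStateRegistry {

    /** Hypothetical stand-in for a state handle that points at an uploaded file. */
    static final class Handle {
        final String path;
        boolean discarded;

        Handle(String path) {
            this.path = path;
        }

        /** In a real system this would delete the file from DFS. */
        void discard() {
            discarded = true;
        }
    }

    /** A registered handle together with its reference count. */
    private static final class Entry {
        final Handle handle;
        int refCount = 1;

        Entry(Handle handle) {
            this.handle = handle;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    /**
     * Registers one reference to the shared state identified by key and
     * returns the handle that the caller should use from now on.
     */
    synchronized Handle register(String key, Handle candidate) {
        Entry entry = entries.get(key);
        if (entry == null) {
            // First registration of this state: the candidate becomes the canonical handle.
            entries.put(key, new Entry(candidate));
            return candidate;
        }
        if (entry.handle != candidate) {
            // Duplicate upload by a checkpoint that did not yet know about the
            // earlier registration: drop the copy and keep the original.
            candidate.discard();
        }
        entry.refCount++;
        return entry.handle;
    }

    /** Returns the current reference count for key, or 0 if it is unknown. */
    synchronized int referenceCount(String key) {
        Entry entry = entries.get(key);
        return entry == null ? 0 : entry.refCount;
    }
}
```

With this sketch, if checkpoint n registers `000012.sst` with a handle for `dfs/chk-n/000012.sst` and checkpoint n+1 later registers the same key with a handle for its own re-uploaded copy, the second call returns the handle from n, marks the copy as discarded, and leaves the reference count at 2 instead of failing a precondition.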

        Activity

        srichter Stefan Richter added a comment -

        fixed in 4745d0c082


          People

          • Assignee:
            srichter Stefan Richter
            Reporter:
            srichter Stefan Richter
