Details

      Description

      Currently, the job id is part of the registration key in the SharedStateRegistry. I suggest removing this part of the key, because the job id changes after a restart. If we do not update the job ids in the registry, referencing shared state will fail for future checkpoints. If we do update them, we simply replace the information. So I see no value in adding the job id to the composite key.
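The lookup failure described above can be illustrated with a minimal sketch. It does not use Flink's actual classes: `RegistryKeyDemo`, its methods, the `/` key separator, and the sample ids are all hypothetical, and a plain `HashMap` stands in for the registry.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not Flink's SharedStateRegistry.
final class RegistryKeyDemo {

    // Composite key that embeds the job id (the behaviour the issue criticises).
    static String keyWithJobId(String jobId, String stateId) {
        return jobId + "/" + stateId;
    }

    // Key built from the state identifier alone (the behaviour the issue proposes).
    static String keyWithoutJobId(String stateId) {
        return stateId;
    }

    // Registers state under the pre-restart job id, then looks it up with the
    // post-restart job id. Returns true if the lookup still finds the state.
    static boolean foundAfterRestart(boolean includeJobId) {
        Map<String, String> registry = new HashMap<>();
        String oldJobId = "job-before-restart"; // assumed sample value
        String newJobId = "job-after-restart";  // assumed sample value
        String stateId = "sst-000042";          // assumed sample value

        String registrationKey =
                includeJobId ? keyWithJobId(oldJobId, stateId) : keyWithoutJobId(stateId);
        registry.put(registrationKey, "state-handle");

        String lookupKey =
                includeJobId ? keyWithJobId(newJobId, stateId) : keyWithoutJobId(stateId);
        return registry.containsKey(lookupKey);
    }
}
```

With the job id in the key, the lookup after the restart misses the entry registered before it; with the state identifier alone, the same entry is found.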

          Activity

          githubbot ASF GitHub Bot added a comment -

          GitHub user StefanRRichter opened a pull request:

          https://github.com/apache/flink/pull/3870

          [Flink 6537] Fixes and improvements for incremental checkpoints in RocksDB

          This PR bundles several fixes and improvements for incremental checkpoints in RocksDB.

          In particular, this addresses:

          • FLINK-6535 : JobID should not be part of the registration key to the SharedStateRegistry
          • FLINK-6533 : Duplicated registration of new shared state when checkpoint confirmations are still pending
          • FLINK-6527 : OperatorSubtaskState has empty implementations of (un)/registerSharedStates
          • FLINK-6504 : Lack of synchronization on materializedSstFiles in RocksDBKeyedStateBackend

          It also lays the foundation for FLINK-6534; extended test coverage will be provided as part of FLINK-6540.

          Some of the main changes are in the way the `SharedStateRegistry` works. It is now able to detect and resolve duplicate state registrations and to serve previously registered state by key. This way, we can avoid resending already registered state handles in the RPC, and can just send their registration keys instead.
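The registry behaviour described above (detecting duplicate registrations and serving previously registered state by key) could look roughly like the following sketch. It is not the code from this pull request: the class `SketchSharedStateRegistry`, its method names, the use of `String` as a stand-in for state handles, and the reference counting are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not the SharedStateRegistry from the pull request.
final class SketchSharedStateRegistry {

    // One registered piece of shared state plus the number of registrations of it.
    private static final class Entry {
        final String handle;
        int referenceCount = 1;

        Entry(String handle) {
            this.handle = handle;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    // Registers a handle under the key. A duplicate registration keeps the
    // first handle, increments its count, and returns the retained handle.
    synchronized String register(String key, String handle) {
        Entry existing = entries.get(key);
        if (existing == null) {
            entries.put(key, new Entry(handle));
            return handle;
        }
        existing.referenceCount++;
        return existing.handle;
    }

    // Serves previously registered state by key, or null if the key is unknown.
    synchronized String lookup(String key) {
        Entry existing = entries.get(key);
        return existing == null ? null : existing.handle;
    }

    // Drops one reference. Returns true only when the last reference is gone
    // and the entry has been removed from the registry.
    synchronized boolean unregister(String key) {
        Entry existing = entries.get(key);
        if (existing == null) {
            return false;
        }
        existing.referenceCount--;
        if (existing.referenceCount == 0) {
            entries.remove(key);
            return true;
        }
        return false;
    }
}
```

In this sketch a sender that knows a key is already registered only needs to transmit the key; the receiver resolves it with `lookup` instead of receiving the full handle again.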

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/StefanRRichter/flink FLINK-6537-part-1

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/flink/pull/3870.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #3870


          commit 3585e224ab0e021573fe5eea582dbb7cfb1fef91
          Author: Stefan Richter <s.richter@data-artisans.com>
          Date: 2017-05-10T12:57:55Z

          FLINK-6527 [checkpoint] OperatorSubtaskState has empty implementations of (un)/registerSharedStates

          commit 6c22eca0809d9d5d6bb14950cd46b50ae2f9cf86
          Author: Stefan Richter <s.richter@data-artisans.com>
          Date: 2017-05-10T15:59:39Z

          FLINK-6537 [checkpoint] First set of fixes for (de)registration of shared state in incremental checkpoints

          commit d462e17e2b41cfb6b4a7a4fa8477c631f84106f6
          Author: Stefan Richter <s.richter@data-artisans.com>
          Date: 2017-05-11T09:59:47Z

          FLINK-6504 [checkpoint] Fix synchronization on materializedSstFiles in RocksDBKeyedStateBackend


          githubbot ASF GitHub Bot added a comment -

          GitHub user StefanRRichter closed the pull request at:

          https://github.com/apache/flink/pull/3870

          srichter Stefan Richter added a comment -

          fixed in 4745d0c082


            People

            • Assignee:
              srichter Stefan Richter
              Reporter:
              srichter Stefan Richter