FLINK-6519: Integrate BlobStore in HighAvailabilityServices lifecycle management

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3.0, 1.4.0
    • Fix Version/s: 1.3.0, 1.4.0
    • Labels:
      None

      Description

      To trigger cleanup of the BlobStore content properly, the BlobStore should be created and managed by the HighAvailabilityServices. That way, the content cleanup is triggered when HighAvailabilityServices#closeAndCleanupAllData is called.

          Activity

          till.rohrmann Till Rohrmann added a comment -

          1.4.0: 88b0f2ac3fd788932d7f434ca57ba3718c3fa621
          1.3.0: e3ea89a9fab39e7595c466bcd90c30c338c86f4e

          githubbot ASF GitHub Bot added a comment -

          GitHub user asfgit closed the pull request at:

          https://github.com/apache/flink/pull/3864

          githubbot ASF GitHub Bot added a comment -

          GitHub user tillrohrmann opened a pull request:

          https://github.com/apache/flink/pull/3864

          FLINK-6519 Integrate BlobStore in lifecycle management of HighAvailabilityServices

          This PR is based on #3512.

          The `HighAvailabilityServices` create a single `BlobStoreService` instance, which is
          shared by all `BlobServer` and `BlobCache` instances. The lifecycle of the
          `BlobStoreService` is managed exclusively by the `HighAvailabilityServices`. This means
          that the content of the `BlobStore` is cleaned up only when the HA data of the
          `HighAvailabilityServices` is cleaned up. Having this single point of control makes it
          easier to decide when to discard HA data (e.g. after a successful job execution) and
          when to retain it (e.g. for recovery).
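
The ownership model described above can be illustrated with a minimal sketch. All type names here (`SketchBlobStore`, `SketchBlobStoreService`, `InMemoryBlobStoreService`, `SketchHaServices`) are hypothetical, simplified stand-ins rather than Flink's actual interfaces, and the map passed to the constructor is an assumption that merely plays the role of durable HA storage.

```java
import java.util.Map;

// Hypothetical stand-in: the narrow view handed to BlobServer/BlobCache-like components.
interface SketchBlobStore {
    void put(String key, byte[] data);
    byte[] get(String key);
}

// Hypothetical stand-in: adds the lifecycle methods that only the owner should call.
interface SketchBlobStoreService extends SketchBlobStore {
    void close();                   // release the store, keep the stored data
    void closeAndCleanupAllData();  // release the store and delete the stored data
}

// The backing map simulates durable storage that outlives a single store instance.
final class InMemoryBlobStoreService implements SketchBlobStoreService {
    private final Map<String, byte[]> durableStorage;
    private boolean closed;

    InMemoryBlobStoreService(Map<String, byte[]> durableStorage) {
        this.durableStorage = durableStorage;
    }

    @Override
    public synchronized void put(String key, byte[] data) {
        ensureOpen();
        durableStorage.put(key, data.clone());
    }

    @Override
    public synchronized byte[] get(String key) {
        ensureOpen();
        byte[] data = durableStorage.get(key);
        return data == null ? null : data.clone();
    }

    @Override
    public synchronized void close() {
        closed = true; // data is retained, e.g. for recovery
    }

    @Override
    public synchronized void closeAndCleanupAllData() {
        durableStorage.clear(); // data is discarded, e.g. after a successful job
        closed = true;
    }

    private void ensureOpen() {
        if (closed) {
            throw new IllegalStateException("blob store is closed");
        }
    }
}

// Single point of control: creates the one shared instance and owns its lifecycle.
final class SketchHaServices {
    private final SketchBlobStoreService blobStoreService;

    SketchHaServices(Map<String, byte[]> durableStorage) {
        this.blobStoreService = new InMemoryBlobStoreService(durableStorage);
    }

    // Callers receive the narrow type, so they cannot close or clean up the store.
    SketchBlobStore createBlobStore() {
        return blobStoreService;
    }

    void close() {
        blobStoreService.close();
    }

    void closeAndCleanupAllData() {
        blobStoreService.closeAndCleanupAllData();
    }
}
```

In this sketch, calling `close()` on the owner leaves the stored blobs available to a later instance, while `closeAndCleanupAllData()` removes them. Returning the narrow `SketchBlobStore` type is one way to keep the lifecycle methods out of reach of the components that share the store.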

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/tillrohrmann/flink blobStoreLifecycle

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/flink/pull/3864.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #3864


          commit 8f4f75fec4af1f941cdd10ad585c49f198a19cea
          Author: Nico Kruber <nico@data-artisans.com>
          Date: 2017-01-06T17:42:58Z

          FLINK-6008 Improve BlobService implementation

          FLINK-6008 [docs] minor improvements in the BlobService docs

          FLINK-6008 use Preconditions.checkArgument in BlobClient

          FLINK-6008 refactor BlobCache#getURL() for cleaner code

          FLINK-6008 promote BlobStore#deleteAll(JobID) to the BlobService

          FLINK-6008 extend the BlobService to the NAME_ADDRESSABLE blobs

          These blobs are referenced by the job ID and a selected name instead of the
          hash sum of the blob's contents. Some code was already prepared but lacked
          the proper additions in further APIs. This commit adds some.
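
The difference between the two addressing schemes mentioned above can be sketched as follows. The class `BlobKeySketch`, the `jobId/name` key layout, and the choice of SHA-1 as the digest are assumptions made for illustration; they are not taken from the commit.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper contrasting the two ways of addressing a blob.
final class BlobKeySketch {
    private BlobKeySketch() {}

    // Content-addressable: the key is derived from the blob's bytes,
    // so identical contents always map to the same key.
    static String contentAddressableKey(byte[] contents) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            return toHex(digest.digest(contents));
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    // Name-addressable: the key is chosen by the caller and scoped to a job,
    // independent of what the blob contains.
    static String nameAddressableKey(String jobId, String name) {
        return jobId + "/" + name;
    }

    private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

Because a name-addressable key is tied to a job ID, such blobs can be removed together when the job terminates, which is what the cleanup-related commits below are concerned with.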

          FLINK-6008 properly remove NAME_ADDRESSABLE blobs after job/task termination

          FLINK-6008 more unit tests for NAME_ADDRESSABLE and BlobService access

          Neither the NAME_ADDRESSABLE blobs nor the access methods that the BlobService
          implementations provide were thoroughly tested before. This adds tests
          covering both.

          FLINK-6008 do not fail the BlobServer when delete fails

          This also enables us to reuse some more code between BlobServerConnection and
          BlobServer.

          FLINK-6008 refactor BlobCache#deleteGlobal() for cleaner code

          FLINK-6008 fix concurrent job directory creation

          Also add corresponding unit tests.
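
One common way to make directory creation safe under concurrency is sketched below. This is an assumption about the general technique, not necessarily the fix applied in this commit; `JobDirectorySketch` and the `job_` prefix are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper: creates a per-job directory that several threads may request at once.
final class JobDirectorySketch {
    private JobDirectorySketch() {}

    static File ensureJobDirectory(File storageRoot, String jobId) {
        File jobDir = new File(storageRoot, "job_" + jobId);
        // mkdirs() returns false both on a real failure and when the directory
        // already exists, for instance because another thread created it first.
        // Only treat the call as failed if the directory is still missing afterwards.
        if (!jobDir.mkdirs() && !jobDir.isDirectory()) {
            throw new UncheckedIOException(
                    new IOException("Could not create job directory " + jobDir));
        }
        return jobDir;
    }
}
```

A check of the form "if the directory does not exist, create it and fail when creation returns false" would be racy, because another thread can create the directory between the existence check and the creation call.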

          FLINK-6008 address some of the PR comments by @StephanEwen

          FLINK-6008 some comments about BlobLibraryCacheManager cleanup

          [hotfix] minor typos

          FLINK-6008 add retrieval and proper cleanup of name-addressable blobs at the BlobLibraryCacheManager

          FLINK-6008 further cleanup tests for BlobLibraryCacheManager

          FLINK-6008 remove the exposure of the underlying blob service in LibraryCacheManager

          This may actually change in the future.

          commit ae2bea34ec6e00cb43bd6868dcfd577b96006565
          Author: Till Rohrmann <trohrmann@apache.org>
          Date: 2017-05-09T08:26:37Z

          FLINK-6519 Integrate BlobStore in lifecycle management of HighAvailabilityServices

          The HighAvailabilityServices create a single BlobStoreService instance, which is
          shared by all BlobServer and BlobCache instances. The BlobStoreService's lifecycle
          is managed exclusively by the HighAvailabilityServices. This means that the
          BlobStore's content is cleaned up only when the HighAvailabilityServices' HA data
          is cleaned up. Having this single point of control makes it easier to decide when
          to discard HA data (e.g. after a successful job execution) and when to retain
          the data (e.g. for recovery).

          Close and cleanup all data of BlobStore in HighAvailabilityServices

          Use HighAvailabilityServices to create BlobStore

          Introduce BlobStoreService interface to hide close and closeAndCleanupAllData methods



            People

            • Assignee: Till Rohrmann (till.rohrmann)
            • Reporter: Till Rohrmann (till.rohrmann)
            • Votes: 0
            • Watchers: 3
