FLINK-10212

REST API for listing all available savepoints


Details

    Description

      Background

      I'm one of the authors of the open-source Flink job deployer (https://github.com/ing-bank/flink-deployer). Recently, I rewrote our implementation to use the Flink REST API instead of the native CLI. 

      In our use case, we store the job savepoints in a Kubernetes persistent volume. We mount the persistent volume into our deployer container so that the deployer can find and use the savepoints.

      While rewriting against the REST API, I saw that the endpoint for monitoring savepoint creation returns the complete path to the created savepoint. We can use this path in the job deployer to start the new job from the latest savepoint.
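
      For illustration, polling the savepoint status endpoint until it reports the savepoint location could look roughly like the sketch below. It assumes the status response carries "status.id" and "operation.location" (or "operation.failure-cause" on failure), which is my reading of the Flink REST API documentation. The helper names wait_for_savepoint and http_get_json, and the injectable fetch parameter, are hypothetical and not taken from the deployer.

```python
import json
import time
import urllib.request
from typing import Callable


def http_get_json(url: str) -> dict:
    """Fetch a URL and decode the JSON response body."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def wait_for_savepoint(
    base_url: str,
    job_id: str,
    trigger_id: str,
    fetch: Callable[[str], dict] = http_get_json,
    poll_seconds: float = 1.0,
    max_attempts: int = 60,
) -> str:
    """Poll the savepoint status until it completes and return its location."""
    url = f"{base_url}/jobs/{job_id}/savepoints/{trigger_id}"
    for _ in range(max_attempts):
        body = fetch(url)
        # Assumed response shape: {"status": {"id": ...}, "operation": {...}}
        if body["status"]["id"] == "COMPLETED":
            operation = body.get("operation", {})
            if "location" in operation:
                return operation["location"]
            cause = operation.get("failure-cause")
            raise RuntimeError(f"savepoint failed: {cause}")
        time.sleep(poll_seconds)
    raise TimeoutError(f"savepoint {trigger_id} did not complete in time")
```

      The fetch parameter only exists so the polling loop can be exercised without a running cluster.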

      However, we also allow users to deploy a job with recovered state by specifying only the directory in which savepoints are stored. In this scenario, we look up the latest savepoint created for the job ourselves inside the given directory. To find this path, we still rely on the mounted volume and list the directory contents to discover savepoints.
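
      As a rough sketch of this directory-based lookup (not the deployer's actual code), the following picks the most recently modified savepoint directory. It assumes savepoint directories are named "savepoint-<first six characters of the job ID>-<random suffix>" and that modification time is a good enough proxy for "latest". The function name latest_savepoint is hypothetical.

```python
import os
from typing import Optional


def latest_savepoint(directory: str, job_id: Optional[str] = None) -> Optional[str]:
    """Return the most recently modified savepoint directory, or None if there is none."""
    prefix = "savepoint-"
    if job_id:
        # Assumption: the directory name embeds the first six characters of the job ID.
        prefix += job_id[:6] + "-"
    candidates = [
        entry.path
        for entry in os.scandir(directory)
        if entry.is_dir() and entry.name.startswith(prefix)
    ]
    # "Latest" is approximated by modification time, which is an assumption.
    return max(candidates, key=os.path.getmtime, default=None)
```

      This only works because the deployer container has the volume mounted, which is exactly the coupling a REST endpoint for listing savepoints would remove.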

      Feature

      I think it would be a good addition if the native Flink REST API offered the ability to list savepoints. Since the API doesn't inherently know where savepoints are stored, it could take a path as one of its arguments. It could also accept a job ID as an argument, so that it can search the specified directory for savepoints belonging to that job.

      As the API would require the path as an argument, and providing a path containing forward slashes in the URL isn't ideal, I'm eager to discuss what a proper solution would look like.

      A POST request to /jobs/:jobid/savepoints with the path as a body parameter would make sense for listing all savepoints in a specific path, but this request is already used for creating new savepoints.

      An alternative could be a POST to /savepoints with the path and job ID in the request body.

      A POST request to retrieve data is obviously not the most straightforward approach, but in my opinion it is still preferable to a GET to, for example, /jobs/:jobid/savepoints/:targetDirectory.
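
      To make the trade-off concrete, here is a minimal client-side sketch of both options. Neither endpoint exists for listing savepoints in Flink today; the body field names "path" and "job-id", the function names, and the example values are all hypothetical. The requests are only built, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_list_request(base_url: str, path: str, job_id: str) -> urllib.request.Request:
    """Build the proposed (hypothetical) POST /savepoints listing request."""
    # Hypothetical body fields: "path" and "job-id".
    payload = json.dumps({"path": path, "job-id": job_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/savepoints",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_get_url(base_url: str, path: str, job_id: str) -> str:
    """Build the GET alternative, which must percent-encode every slash in the path."""
    encoded = urllib.parse.quote(path, safe="")
    return f"{base_url}/jobs/{job_id}/savepoints/{encoded}"
```

      With a target directory of /data/savepoints, the GET variant ends in %2Fdata%2Fsavepoints, while the POST variant keeps the path readable in the request body.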

      I'm willing to help out on this one by submitting a pull request.

      Looking forward to your thoughts! 

          People

            Assignee: Unassigned
            Reporter: Marc Rooding (mrooding)