Kylin / KYLIN-5636

Automatically clean up dependency files after the build task


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.0-alpha
    • Fix Version/s: 5.0-beta
    • Component/s: Tools, Build and Test
    • Labels: None

    Description

      Problem:
      Files uploaded under the path configured by spark.kubernetes.file.upload.path are never deleted automatically.
      1: When Spark creates a driver pod, it uploads the job's dependencies to the configured path. The build task runs in cluster mode and therefore needs a driver pod, so running the build task repeatedly accumulates a large number of files under that path.
      2: The upload.path we configured (s3a://kylin/spark-on-k8s) is a fixed path. Spark creates a spark-upload-<uuid> subdirectory under it and stores the dependencies there.
      Design:
      Core idea: append a dynamic subdirectory to the original upload.path and delete the entire subdirectory when the task ends.
      Build task: upload.path + jobId (e.g. s3a://kylin/spark-on-k8s/<jobId>)
      Delete the dependency directory when the build task finishes (see the sketch below).
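      A minimal sketch of the idea, assuming the build job sets its Spark conf programmatically; basePath, the jobId generation, and the explicit cleanup call are illustrative, not the actual Kylin code:

        import org.apache.hadoop.conf.Configuration
        import org.apache.hadoop.fs.{FileSystem, Path}
        import org.apache.spark.SparkConf

        val basePath = "s3a://kylin/spark-on-k8s"            // the fixed upload.path from the description
        val jobId    = java.util.UUID.randomUUID().toString  // hypothetical per-build id

        // Give this build task its own upload subdirectory.
        val uploadPath = s"$basePath/$jobId"
        val conf = new SparkConf().set("spark.kubernetes.file.upload.path", uploadPath)

        // ... submit the build task in cluster mode and wait for it to finish ...

        // When the task is over, drop the whole subdirectory in one recursive delete.
        val fs = FileSystem.get(new java.net.URI(uploadPath), new Configuration())
        fs.delete(new Path(uploadPath), true)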
       
      The automatic deletion depends on the cleanup function actually being invoked; if the process is killed (e.g. kill -9), the function never runs. A fallback garbage-collection policy is therefore needed, e.g. subdirectories older than three months are deleted automatically.
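      A sketch of such a fallback sweep using the Hadoop FileSystem API; the 90-day window and the assumption that subdirectory modification times are meaningful on the object store are both illustrative:

        import org.apache.hadoop.conf.Configuration
        import org.apache.hadoop.fs.{FileSystem, Path}

        val basePath    = new Path("s3a://kylin/spark-on-k8s")
        val retentionMs = 90L * 24 * 60 * 60 * 1000          // ~3 months
        val cutoff      = System.currentTimeMillis() - retentionMs

        // Delete stale upload subdirectories left behind by killed processes.
        val fs = FileSystem.get(basePath.toUri, new Configuration())
        fs.listStatus(basePath)
          .filter(s => s.isDirectory && s.getModificationTime < cutoff)
          .foreach(s => fs.delete(s.getPath, true))          // recursive delete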


          People

            Assignee: Zhiting Guo (guozhiting)
            Reporter: Zhiting Guo (guozhiting)
            Votes: 0
            Watchers: 1

            Dates

              Created:
              Updated:
              Resolved: