Flink / FLINK-24122

Add support to do clean in history server

Details

    Description

      Currently, the history server can clean up archived jobs in two ways:

      1. If users have configured historyserver.archive.clean-expired-jobs: true, it compares the archive files in HDFS between two consecutive clean intervals, finds the ones that were deleted, and removes the corresponding local cache files.

      2. If users have configured historyserver.archive.retained-jobs with a positive number, it deletes the oldest archive files, both in HDFS and in the local cache.
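
      The two existing mechanisms can be sketched as follows. This is only an illustration of the selection logic described above, not the history server's actual code: the function names are hypothetical, archives are modeled as plain (job_id, modification_time) pairs, and treating a non-positive retained-jobs value as "no limit" is an assumption.

```python
def find_expired(previous_listing, current_listing):
    """Job IDs present at the previous check but missing from the current one.

    Mirrors the clean-expired-jobs idea: whatever disappeared from the
    archive directory between two checks should lose its local cache too.
    """
    return set(previous_listing) - set(current_listing)


def find_over_limit(archives, retained_jobs):
    """Job IDs that fall outside the newest `retained_jobs` archives.

    `archives` is an iterable of (job_id, modification_time) pairs.
    A non-positive limit is treated as "keep everything" (assumption).
    """
    if retained_jobs <= 0:
        return []
    newest_first = sorted(archives, key=lambda a: a[1], reverse=True)
    return [job_id for job_id, _ in newest_first[retained_jobs:]]
```

      For instance, find_over_limit([("a", 1), ("b", 3), ("c", 2)], 2) selects only "a", the oldest archive, for deletion.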

      However, a suitable retained-jobs number is difficult to determine.

      For example, users may want to check yesterday's history jobs, but if many jobs fail today and the total exceeds the retained-jobs number, yesterday's history jobs will be deleted. So what about adding a configuration option that contains a retention time, indicating the maximum time a history job is retained?
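
      The scenario above can be made concrete with a small sketch that compares count-based and time-based selection. Everything here is hypothetical: the function names, the job IDs, the timestamps, and the idea of passing the retention time as a timedelta. No such option exists in the history server yet; this only illustrates what the proposal would change.

```python
from datetime import datetime, timedelta


def drop_by_count(archives, retained_jobs):
    """Count-based retention: drop everything except the newest N archives."""
    newest_first = sorted(archives, key=lambda a: a[1], reverse=True)
    return [job_id for job_id, _ in newest_first[retained_jobs:]]


def drop_by_age(archives, now, retained_time):
    """Time-based retention (the proposal): drop archives older than the cutoff."""
    cutoff = now - retained_time
    return [job_id for job_id, finished in archives if finished < cutoff]


# Made-up data: one job from yesterday, five failed jobs from today.
now = datetime(2024, 1, 2, 18, 0)
archives = [("job-yesterday", datetime(2024, 1, 1, 10, 0))]
archives += [(f"job-today-{i}", datetime(2024, 1, 2, 9 + i, 0)) for i in range(5)]

# With a limit of 3 jobs, yesterday's job is evicted by today's failures.
dropped_by_count = drop_by_count(archives, 3)

# With a two-day retention time, nothing is old enough to be dropped.
dropped_by_age = drop_by_age(archives, now, timedelta(days=2))
```

      With a count limit, how long a job stays visible depends on how many jobs ran after it. With a retention time, it depends only on the job's age, which is usually what users have in mind when they want to look at yesterday's jobs.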

      Also, the history server can't clean job history files that are no longer in HDFS but are still cached in the local filesystem. These files are stored forever and can't be cleaned unless users delete them manually. Maybe we can add an option and perform this cleanup when the option is set to true.
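
      A minimal sketch of such an opt-in cleanup is shown below. It assumes a simplified local layout in which the cache directory holds only per-job entries, either a file named <job_id>.json or a directory named <job_id>. The function name, the `enabled` flag, and this layout are assumptions for illustration and do not describe the history server's real cache structure.

```python
import shutil
from pathlib import Path


def clean_orphaned_cache(local_jobs_dir, remote_job_ids, enabled=True):
    """Delete local cache entries whose job ID is no longer archived remotely.

    Returns the names of the removed entries. Does nothing when the
    (hypothetical) option is disabled.
    """
    if not enabled:
        return []
    removed = []
    for entry in sorted(Path(local_jobs_dir).iterdir()):
        # "<job_id>.json" files and "<job_id>" directories map to the same job.
        job_id = entry.stem if entry.is_file() else entry.name
        if job_id in remote_job_ids:
            continue
        if entry.is_dir():
            shutil.rmtree(entry)
        else:
            entry.unlink()
        removed.append(entry.name)
    return removed
```

      Running this once at startup, before the regular periodic checks, would cover archives that were deleted from HDFS while the history server was not running.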

            People

              Assignee: Unassigned
              Reporter: zlzhang0122
              Votes: 0
              Watchers: 4

              Dates

                Created:
                Updated: