Spark / SPARK-28294

Support `spark.history.fs.cleaner.maxNum` configuration

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.0.0
    • Component/s: Spark Core
    • Labels: None

    Description

      Until now, Apache Spark has maintained the event log directory only by a time-based policy, `spark.history.fs.cleaner.maxAge`. However, this has two issues.

      1. Some file systems limit the maximum number of files in a single directory. For example, HDFS `dfs.namenode.fs-limits.max-directory-items` is 1024 * 1024 by default.

      2. Spark is sometimes unable to clean up some old log files due to permission issues.

      To address both (1) and (2), this issue adds a count-based policy for the event log directory, `spark.history.fs.cleaner.maxNum`. With it, Spark tries to keep the number of files in the event log directory within the configured limit.
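
To illustrate how a time policy and a count policy could combine, here is a minimal sketch in Python. It is not Spark's implementation: the `EventLog` type, the `select_logs_to_delete` function, and its parameters are hypothetical names chosen for this example, and the sketch assumes that the oldest logs are removed first when the count limit is exceeded. It does not model deletions that fail because of permissions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventLog:
    """Hypothetical stand-in for one file in the event log directory."""
    path: str
    last_modified: float  # seconds since the epoch


def select_logs_to_delete(logs, now, max_age_seconds, max_num):
    """Return the paths a cleaner would remove under both policies."""
    # Time policy (analogous to maxAge): logs last modified before the
    # cutoff are considered expired.
    cutoff = now - max_age_seconds
    expired = [log for log in logs if log.last_modified < cutoff]
    remaining = [log for log in logs if log.last_modified >= cutoff]

    # Count policy (analogous to maxNum): if more than max_num logs
    # remain, remove the oldest ones until the limit is met.
    remaining.sort(key=lambda log: log.last_modified)
    excess = max(0, len(remaining) - max_num)
    over_limit = remaining[:excess]

    return [log.path for log in expired + over_limit]


logs = [
    EventLog("app-1", 100.0),
    EventLog("app-2", 200.0),
    EventLog("app-3", 300.0),
    EventLog("app-4", 400.0),
]
# app-1 is older than the age cutoff; app-2 is the oldest of the three
# remaining logs and is removed to get down to two files.
to_delete = select_logs_to_delete(logs, now=450.0, max_age_seconds=300.0, max_num=2)
```

In this sketch the count policy bounds the directory size even when every log is newer than the age cutoff, which is the gap described in issue (1).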

            People

              Assignee: Dongjoon Hyun (dongjoon)
              Reporter: Dongjoon Hyun (dongjoon)