Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Won't Fix
- Affects Version/s: 1.1.0
- Fix Version/s: None
- Component/s: None
Description
Currently, Spark uses the "java.io.tmpdir" system property to locate the /tmp/ directory.
The /tmp/ directory is then used to
1. set up the HTTP file server
2. hold broadcast data
3. fetch dependency files or jars on executors
As a result, the /tmp/ directory keeps growing, and free space on the system disk shrinks.
I think we could add a configuration "spark.tmp.dir" in conf/spark-env.sh or conf/spark-defaults.conf to set this directory explicitly. For example, it could be pointed at a data disk.
If "spark.tmp.dir" is not set, fall back to the default "java.io.tmpdir".
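The proposed fallback logic could be sketched as follows. This is an illustration only, not Spark code: the `TmpDirResolver` class and the `conf` map standing in for Spark's configuration are hypothetical, and the key name "spark.tmp.dir" is the one proposed above.

```java
import java.util.Map;

public class TmpDirResolver {
    // Prefer the user-set "spark.tmp.dir" if present; otherwise fall
    // back to the JVM's "java.io.tmpdir" system property.
    static String resolveTmpDir(Map<String, String> conf) {
        String configured = conf.get("spark.tmp.dir");
        return configured != null ? configured : System.getProperty("java.io.tmpdir");
    }

    public static void main(String[] args) {
        // Explicitly configured: points at a data disk (hypothetical path).
        System.out.println(resolveTmpDir(Map.of("spark.tmp.dir", "/data1/spark-tmp")));
        // Not configured: falls back to the JVM default, e.g. /tmp on Linux.
        System.out.println(resolveTmpDir(Map.of()));
    }
}
```

With this lookup order, existing deployments that never set "spark.tmp.dir" keep the current behavior unchanged.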