Details
- Improvement
- Status: Closed
- Major
- Resolution: Fixed
- 1.0.0
- None
Description
If Spark launches multiple executors on one host for a single application, every executor downloads its dependent files and jars independently (unless they use a local: URL). This can cause significant latency. In my case, downloading the dependent jars (about 17 MB) took 20 seconds when I launched 32 executors on one host (4 hosts in total).
This patch caches downloaded files and jars so that executors on the same host can share them, reducing network traffic and download latency. In my case, the latency dropped from 20 seconds to less than 1 second.
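The caching idea can be illustrated with a minimal sketch. This is not the code from the patch: it assumes a cache directory shared by all executors on a host, a cache key derived from the URL and a timestamp, and an exclusive file lock so that only the first executor downloads. The names `fetch_cached`, `cache_dir`, `target_dir`, and `download` are hypothetical, and `fcntl` limits the sketch to Unix-like systems.

```python
import fcntl
import hashlib
import os
import shutil


def fetch_cached(url, timestamp, cache_dir, target_dir, download):
    """Copy the file behind `url` into `target_dir`, downloading once per host.

    `download(url, dest_path)` is a hypothetical caller-supplied function
    that performs the actual network fetch.
    """
    os.makedirs(cache_dir, exist_ok=True)
    os.makedirs(target_dir, exist_ok=True)

    # Assumption: URL plus timestamp identifies one version of the file.
    key = hashlib.sha1(f"{url}@{timestamp}".encode("utf-8")).hexdigest()
    cached = os.path.join(cache_dir, key + "_cache")
    lock_path = os.path.join(cache_dir, key + "_lock")

    with open(lock_path, "w") as lock_file:
        # Block until this process holds the per-file lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            if not os.path.exists(cached):
                # First executor on this host: download to a temporary
                # name, then rename so readers never see a partial file.
                tmp = cached + ".tmp"
                download(url, tmp)
                os.replace(tmp, cached)
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)

    # Every executor gets its own copy from the host-local cache.
    target = os.path.join(target_dir, os.path.basename(url))
    shutil.copyfile(cached, target)
    return target
```

With this shape, 32 executors on one host would trigger a single network download followed by 32 local copies, which is consistent with the latency reduction described above.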
Issue Links
- is related to: SPARK-662 Executor should only download files & jars once (Resolved)
- relates to: SPARK-6619 Improve Jar caching on executors (Resolved)
- links to