  1. Spark
  2. SPARK-2713

Executors of same application in same host should only download files & jars once


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.2.0
    • Component/s: Spark Core
    • Labels: None

    Description

      If Spark launches multiple executors on one host for one application, every executor downloads its dependent files and jars independently (unless they use a local: URL). This can cause significant latency. In my case, launching 32 executors on one host (4 hosts in total) resulted in 20 seconds of latency to download the dependent jars (about 17 MB).

      This patch caches downloaded files and jars so that executors on the same host can share them, which reduces network traffic and download latency. In my case, the latency dropped from 20 seconds to less than 1 second.
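
      The idea of a per-host download cache can be sketched as follows. This is a minimal illustration in Python, not the code of the patch: the function name `fetch_cached`, its parameters, the `_cache` and `_lock` file suffixes, and the use of `fcntl.flock` (POSIX only) are all assumptions made for the example. The first executor to take the lock downloads the file into a shared cache directory, and every executor then copies it from the local cache into its own working directory.

```python
import fcntl
import hashlib
import os
import shutil


def fetch_cached(url, timestamp, cache_dir, work_dir, download):
    """Copy the file behind `url` into `work_dir`, downloading it at most once per host.

    `download(url, dest_path)` is a caller-supplied (hypothetical) function that
    performs the actual network transfer.
    """
    os.makedirs(cache_dir, exist_ok=True)
    os.makedirs(work_dir, exist_ok=True)

    # Key the cache entry on URL plus timestamp so a changed file is fetched again.
    key = hashlib.sha1(f"{url}@{timestamp}".encode("utf-8")).hexdigest()
    cached = os.path.join(cache_dir, key + "_cache")
    lock_path = os.path.join(cache_dir, key + "_lock")

    with open(lock_path, "w") as lock_file:
        # Exclusive lock: other executors on this host wait here while one downloads.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            if not os.path.exists(cached):
                tmp = cached + ".tmp"
                download(url, tmp)
                os.replace(tmp, cached)  # publish the finished file atomically
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)

    # Each executor still gets its own copy, but from local disk, not the network.
    target = os.path.join(work_dir, os.path.basename(url))
    shutil.copyfile(cached, target)
    return target
```

      With 32 executors on one host, only the first call would trigger `download`; the remaining 31 would block briefly on the lock and then copy the cached file locally.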


            People

              Assignee: li-zhihui Zhihui
              Reporter: li-zhihui Zhihui
              Votes: 0
              Watchers: 3

              Dates

                Created:
                Updated:
                Resolved: