SPARK-17574: Cache ShuffleExchange RDD when the exchange is reused

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels: None

    Description

      The ReuseExchange rule lets the physical plan reuse identical exchanges. However, when a ShuffleExchange is reused, each consumer still has to retrieve the remote shuffle blocks again. Caching the RDD of the reused ShuffleExchange would avoid transferring those remote blocks more than once.
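      The idea can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyShuffleExchange`, `cache_output`, and `consume_twice` are hypothetical names invented for illustration and are not part of Spark, and the "remote fetch" is simulated with a counter rather than real network transfer. It shows how two consumers of one reused exchange fetch every block twice without caching, and only once with it.

```python
class ToyShuffleExchange:
    """Toy stand-in for a shuffle exchange (hypothetical, not Spark's class)."""

    def __init__(self, remote_partitions, cache_output=False):
        self._remote_partitions = remote_partitions
        self.cache_output = cache_output
        self.remote_fetches = 0   # number of simulated remote block fetches
        self._cached = None       # materialized output, if caching is on

    def _fetch_remote_blocks(self):
        # Simulate pulling every shuffle block over the network.
        self.remote_fetches += len(self._remote_partitions)
        return [list(p) for p in self._remote_partitions]

    def execute(self):
        if not self.cache_output:
            # Without caching, every consumer triggers a fresh fetch.
            return self._fetch_remote_blocks()
        if self._cached is None:
            # With caching, only the first consumer pays for the fetch.
            self._cached = self._fetch_remote_blocks()
        return self._cached


def consume_twice(exchange):
    """Model a plan in which two operators read the same reused exchange."""
    left = exchange.execute()
    right = exchange.execute()
    return left, right


if __name__ == "__main__":
    partitions = [[1, 2], [3]]
    plain = ToyShuffleExchange(partitions)
    cached = ToyShuffleExchange(partitions, cache_output=True)
    consume_twice(plain)
    consume_twice(cached)
    print("fetches without caching:", plain.remote_fetches)
    print("fetches with caching:", cached.remote_fetches)
```

      The trade-off the toy model leaves out is the memory or disk needed to hold the cached output, which a real implementation would have to account for.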

          People

            Assignee: Unassigned
            Reporter: L. C. Hsieh (viirya)
            Votes: 0
            Watchers: 3
