  1. Spark
  2. SPARK-48261

RoundRobin based coalesce in spark

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Abandoned
    • Affects Version/s: 3.5.1
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels: None

    Description

      Currently, the default coalesce does not take partition size into account and simply merges partitions. This often results in non-uniform data distribution. There has been a proposal for size-based coalesce (https://github.com/apache/spark/pull/27248).
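
To illustrate the skew described above, here is a minimal sketch in plain Python. It is a simplified model, not Spark's actual coalescer (which also weighs locality); the function name `coalesce_by_count` and the sample sizes are hypothetical.

```python
# Simplified model: each number is the record count of one parent partition.
# Assumption: groups are formed from contiguous parent partitions, balanced
# only by how many partitions each group holds, never by their sizes.
def coalesce_by_count(partition_sizes, num_groups):
    """Merge parent partitions into num_groups groups, ignoring sizes."""
    n = len(partition_sizes)
    groups = []
    for g in range(num_groups):
        start = g * n // num_groups
        end = (g + 1) * n // num_groups
        groups.append(sum(partition_sizes[start:end]))
    return groups

sizes = [1000, 900, 10, 20, 15, 5]  # hypothetical, deliberately uneven
merged = coalesce_by_count(sizes, 2)
print(merged)  # [1910, 40]
```

Both output partitions merge three parent partitions, yet one ends up with almost all of the records.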

      I am proposing a custom round-robin coalesce that distributes data evenly across partitions within the same executor.
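
The issue does not spell out an implementation, so the following is only a sketch of the general idea under stated assumptions: the parent partitions are taken to sit on one executor, and their records are dealt out in turn to the output partitions. It uses plain Python lists and no Spark API; `round_robin_coalesce` is a hypothetical name.

```python
from itertools import chain

def round_robin_coalesce(parent_partitions, num_outputs):
    """Deal records from all parent partitions into num_outputs partitions in turn."""
    outputs = [[] for _ in range(num_outputs)]
    for i, record in enumerate(chain.from_iterable(parent_partitions)):
        outputs[i % num_outputs].append(record)
    return outputs

# Hypothetical parent partitions, assumed to live on the same executor.
parents = [list(range(100)), list(range(100, 103)), list(range(103, 105))]
balanced = round_robin_coalesce(parents, 2)
print([len(p) for p in balanced])  # [53, 52]
```

Output sizes differ by at most one record regardless of how uneven the parent partitions are. The trade-off in this sketch is that record order within a parent partition is not preserved in the outputs.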


          People

            Assignee: Unassigned
            Reporter: Subham Singhal (subham_singhal)
            Votes: 0
            Watchers: 1
