Spark / SPARK-33828 SQL Adaptive Query Execution QA / SPARK-35239

Coalesce shuffle partition should handle empty input RDD


    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.2.0
    • Fix Version/s: 3.2.0
    • Component/s: SQL
    • Labels:
      None

      Description

      If the input RDD of a shuffle stage has no partitions, the map output statistics for that stage will be null. If the input RDDs of all shuffle stages are empty, the coalesce rule skips them and we lose the chance to coalesce partitions.

      We can simply create an empty partition for these custom shuffle readers to reduce the partition number.
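
      The idea can be illustrated with a small model in plain Python. This is a sketch under stated assumptions, not Spark's implementation: the function name `coalesce_partitions`, the `(start, end)` tuple used as a partition spec, and the greedy packing by `target_size` are all hypothetical simplifications. A stage whose input RDD had no partitions is represented by `None` in place of its statistics.

```python
from typing import List, Optional, Tuple

# Hypothetical partition spec: a half-open range [start, end) of reducer ids.
Spec = Tuple[int, int]


def coalesce_partitions(
    stage_stats: List[Optional[List[int]]],
    target_size: int,
) -> List[Spec]:
    """Merge adjacent reducer partitions up to roughly `target_size` bytes.

    stage_stats[i] holds the bytes per reducer partition for shuffle stage i,
    or None when that stage's input RDD had no partitions (no statistics).
    """
    # Ignore stages without statistics instead of giving up on coalescing.
    valid = [stats for stats in stage_stats if stats is not None]

    if not valid:
        # Every input was empty: return a single empty partition so the
        # reader ends up with one partition instead of the full shuffle count.
        return [(0, 0)]

    num_reducers = len(valid[0])
    if any(len(stats) != num_reducers for stats in valid):
        raise ValueError("all stages must have the same number of reducers")

    specs: List[Spec] = []
    start = 0
    current = 0
    for i in range(num_reducers):
        size = sum(stats[i] for stats in valid)
        # Close the current group when adding this reducer would exceed the
        # target, as long as the group already contains at least one reducer.
        if i > start and current + size > target_size:
            specs.append((start, i))
            start = i
            current = 0
        current += size
    specs.append((start, num_reducers))
    return specs
```

      In this model, `coalesce_partitions([None, None], 100)` yields one empty spec, while a mix such as `[[10, 20, 30, 40], None]` is coalesced using only the stage that has statistics.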


            People

            • Assignee: ulysses XiDuo You
            • Reporter: ulysses XiDuo You