Spark / SPARK-46105

spark.emptyDataFrame shows 1 partition after repartition(1) in Spark 3.3.x and above


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version: 3.3.3
    • Fix Version: None
    • Component: Spark Core
    • Labels: None
    • Environment: EKS, EMR

    Description

      Version: 3.3.3

       

      scala> val df = spark.emptyDataFrame
      df: org.apache.spark.sql.DataFrame = []

      scala> df.rdd.getNumPartitions
      res0: Int = 0

      scala> df.repartition(1).rdd.getNumPartitions
      res1: Int = 1

      scala> df.repartition(1).rdd.isEmpty()
      res2: Boolean = true

      Version: 3.2.4

      scala> val df = spark.emptyDataFrame
      df: org.apache.spark.sql.DataFrame = []

      scala> df.rdd.getNumPartitions
      res0: Int = 0

      scala> df.repartition(1).rdd.getNumPartitions
      res1: Int = 0

      scala> df.repartition(1).rdd.isEmpty()
      res2: Boolean = true

       

      Version: 3.5.0

      scala> val df = spark.emptyDataFrame
      df: org.apache.spark.sql.DataFrame = []

      scala> df.rdd.getNumPartitions
      res0: Int = 0

      scala> df.repartition(1).rdd.getNumPartitions
      res1: Int = 1

      scala> df.repartition(1).rdd.isEmpty()
      res2: Boolean = true

       

      When we repartition an empty DataFrame to 1 partition, the resulting partition count is 1 in versions 3.3.x and 3.5.x, whereas in version 3.2.x it is 0. May I know why this behaviour changed from 3.2.x to the later versions?
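
      For callers that only need to know whether the result is empty, checking the data rather than the partition layout is stable across all three versions shown above. A minimal sketch, assuming an active `SparkSession` named `spark` (as in the REPL transcripts):

      ```scala
      // Emptiness should be tested on the contents, not on the partition count:
      // rdd.getNumPartitions differs between 3.2.x (0) and 3.3.x+ (1) for an
      // empty, repartitioned DataFrame, but these checks agree everywhere.
      val df = spark.emptyDataFrame.repartition(1)

      df.isEmpty       // true on 3.2.4, 3.3.3, and 3.5.0 per the transcripts above
      df.count() == 0L // likewise true on all three
      ```

      Both `Dataset.isEmpty` and `Dataset.count` are part of the public Dataset API, so they do not depend on how the planner materializes empty partitions.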

       

      The reason for raising this as a bug is that I have a scenario where my final DataFrame returns 0 records in EKS (local Spark) with a single node (driver and executor on the same node), but it returns 1 in EMR; both use the same Spark version, 3.3.3. I'm not sure why this behaves differently in the two environments. As an interim solution, I had to repartition the empty DataFrame whenever my final DataFrame is empty, which returns 1 on 3.3.3. I would like to know whether this is really a bug, or whether this behaviour will persist in future versions and cannot be changed.
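
      The interim workaround described above can be captured in a small helper, so the normalization is applied in one place rather than scattered through the job. This is a sketch, not an official Spark API; `normalizeEmpty` is a hypothetical name:

      ```scala
      import org.apache.spark.sql.DataFrame

      // Hypothetical helper: repartition empty results so that both
      // environments (EKS and EMR) on Spark 3.3.3 report the same
      // rdd.getNumPartitions for an empty final DataFrame.
      // Caveat: on 3.2.x, repartition(1) on an empty plan still yields
      // 0 partitions, so this only normalizes within 3.3.x and later.
      def normalizeEmpty(df: DataFrame): DataFrame =
        if (df.isEmpty) df.repartition(1) else df
      ```

      Downstream code that must not depend on this behaviour at all is safer testing `df.isEmpty` directly instead of comparing partition counts.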

       

      Because, if we go for a Spark upgrade and this behaviour changes again, we will face the issue again.

      Please confirm.

       


People

    Assignee: Unassigned
    Reporter: dharani_sugumar
