[SPARK-27191] union of dataframes depends on order of the columns in 2.4.0 - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Not A Bug
Affects Version/s: 2.4.0
Fix Version/s: 2.3.0
Component/s: SQL
Labels: None

Description

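I thought this issue was resolved in 2.3.0, according to https://issues.apache.org/jira/browse/SPARK-22335, but I still hit it in 2.4.0: union() matches columns by position, so the result depends on the column order of the two DataFrames.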

>>> df_1 = spark.createDataFrame([["1aa", "1bbbbbbb"]], ["col1", "col2"])
>>> df_1.show()
+----+--------+
|col1|    col2|
+----+--------+
| 1aa|1bbbbbbb|
+----+--------+

>>> df_2 = spark.createDataFrame([["2bbbbbbb", "2aa"]], ["col2", "col1"])
>>> df_2.show()
+--------+----+
|    col2|col1|
+--------+----+
|2bbbbbbb| 2aa|
+--------+----+

>>> df_u = df_1.union(df_2)
>>> df_u.show()
+--------+--------+
|    col1|    col2|
+--------+--------+
|     1aa|1bbbbbbb|
|2bbbbbbb|     2aa|
+--------+--------+

>>> spark.version
'2.4.0'
>>>
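The transcript above is consistent with union() resolving columns by position, which is likely why the ticket was closed as Not A Bug. As a minimal sketch of the difference, the plain-Python model below represents each frame as a hypothetical (columns, rows) pair; the helper names union_by_position and union_by_name are illustrative only and are not Spark APIs. In PySpark itself, DataFrame.unionByName (available since 2.3.0), or reordering with df_2.select(df_1.columns) before union(), should give the name-based result.

```python
# Plain-Python model of the two union semantics (no Spark required).
# A "frame" here is a hypothetical (columns, rows) pair, not a Spark DataFrame.

def union_by_position(left, right):
    """Model of DataFrame.union: keep left's column names, append right's rows unchanged."""
    cols, rows = left
    _, other_rows = right
    return cols, rows + other_rows


def union_by_name(left, right):
    """Model of DataFrame.unionByName: reorder right's values to match left's column names."""
    cols, rows = left
    other_cols, other_rows = right
    if sorted(cols) != sorted(other_cols):
        raise ValueError("both frames must have the same set of column names")
    # Position in `right` of each column, taken in left's order.
    idx = [other_cols.index(c) for c in cols]
    reordered = [tuple(row[i] for i in idx) for row in other_rows]
    return cols, rows + reordered


# Same data as in the report above.
df_1 = (["col1", "col2"], [("1aa", "1bbbbbbb")])
df_2 = (["col2", "col1"], [("2bbbbbbb", "2aa")])

print(union_by_position(df_1, df_2)[1])  # [('1aa', '1bbbbbbb'), ('2bbbbbbb', '2aa')]
print(union_by_name(df_1, df_2)[1])      # [('1aa', '1bbbbbbb'), ('2aa', '2bbbbbbb')]

# Assumed PySpark equivalents of the name-based variant:
#   df_1.unionByName(df_2)
#   df_1.union(df_2.select(df_1.columns))
```

The positional variant reproduces the mixed-up second row from the report; the name-based variant puts "2aa" under col1 and "2bbbbbbb" under col2.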

Issue Links

is related to: SPARK-22335 Union for DataSet uses column order instead of types for union (Resolved)

People

Assignee: Unassigned

Reporter: Mrinal Kanti Sardar

Votes: 0

Watchers: 4

Dates

Created: 18/Mar/19 13:24

Updated: 18/Mar/19 16:28

Resolved: 18/Mar/19 16:28