Description
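I thought this issue was resolved in 2.3.0, according to https://issues.apache.org/jira/browse/SPARK-22335, but I still see it in 2.4.0: union() matches columns by position rather than by name, so the values from df_2 end up under the wrong column names.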
>>> df_1 = spark.createDataFrame([["1aa", "1bbbbbbb"]], ["col1", "col2"])
>>> df_1.show()
+----+--------+
|col1|    col2|
+----+--------+
| 1aa|1bbbbbbb|
+----+--------+
>>> df_2 = spark.createDataFrame([["2bbbbbbb", "2aa"]], ["col2", "col1"])
>>> df_2.show()
+--------+----+
|    col2|col1|
+--------+----+
|2bbbbbbb| 2aa|
+--------+----+
>>> df_u = df_1.union(df_2)
>>> df_u.show()
+--------+--------+
|    col1|    col2|
+--------+--------+
|     1aa|1bbbbbbb|
|2bbbbbbb|     2aa|
+--------+--------+
>>> spark.version
'2.4.0'
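The difference between aligning by position and aligning by name can be sketched in plain Python, with no Spark session required. This is only a model of the two rules, not Spark's implementation: the `(column_names, rows)` tuples and the helper names `union_by_position` and `union_by_name` are hypothetical stand-ins introduced for illustration.

```python
# Plain-Python model of the two alignment rules; no Spark session needed.
# A "frame" here is just (column_names, rows), a hypothetical stand-in
# for a DataFrame.

def union_by_position(left, right):
    """Model of positional union: keep left's names, append right's rows as-is."""
    left_cols, left_rows = left
    _, right_rows = right
    return left_cols, left_rows + right_rows


def union_by_name(left, right):
    """Model of by-name union: reorder right's values to match left's names."""
    left_cols, left_rows = left
    right_cols, right_rows = right
    if sorted(left_cols) != sorted(right_cols):
        raise ValueError("column names differ")
    # For each of left's columns, find where that name sits in right.
    index = [right_cols.index(name) for name in left_cols]
    reordered = [tuple(row[i] for i in index) for row in right_rows]
    return left_cols, left_rows + reordered


# Same data as in the report above.
df_1 = (["col1", "col2"], [("1aa", "1bbbbbbb")])
df_2 = (["col2", "col1"], [("2bbbbbbb", "2aa")])

positional = union_by_position(df_1, df_2)  # second row lands under the wrong names
by_name = union_by_name(df_1, df_2)         # second row is realigned to col1, col2
```

In PySpark itself, the by-name behaviour should be available through `df_1.unionByName(df_2)` (added in 2.3.0), or by reordering explicitly with `df_1.union(df_2.select(df_1.columns))`. Plain `union` is documented to resolve columns by position, like SQL `UNION ALL`.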
Issue Links
- is related to SPARK-22335 Union for DataSet uses column order instead of types for union (Resolved)