Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
- Affects Version/s: 1.4.1
Description
I originally hit this problem with two dataframes that have 8 and 10 columns respectively. The following is a made-up example that reproduces what I observed going wrong.
Consider the two dataframes:
df1:
--------------
id | count |
--------------
--------------
df2:
----------------------
id | new_count | count |
----------------------
1 | 4 | 6 |
1 | 5 | 6 |
3 | 6 | 6 |
2 | 7 | 6 |
----------------------
The call:
df3 = df1.unionAll(df2)
returns successfully, with df3 containing 2 columns, even though df1 has 2 columns and df2 has 3. However, some columns now hold values swapped in from other columns. Based on my previous experience, I would say that df3's count column will actually contain the values of df2's new_count column.
I believe that this call should never complete successfully in the first place and should throw an exception instead.
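Until the call itself rejects mismatched inputs, a caller-side guard along the following lines might help. This is only a minimal sketch: the helper names assert_union_compatible and checked_union_all are hypothetical and not part of Spark, and it assumes nothing beyond the DataFrame.columns property and the unionAll method used above.

```python
def assert_union_compatible(left_columns, right_columns):
    """Raise ValueError unless both column lists match in length, names, and order."""
    left, right = list(left_columns), list(right_columns)
    if left != right:
        raise ValueError(
            "refusing positional union: left columns %s, right columns %s"
            % (left, right)
        )


def checked_union_all(df1, df2):
    """Hypothetical wrapper: validate the column lists, then delegate to unionAll."""
    # DataFrame.columns is a list of column names in schema order.
    assert_union_compatible(df1.columns, df2.columns)
    return df1.unionAll(df2)
```

With the two dataframes from the example, checked_union_all(df1, df2) would raise a ValueError because the column lists ['id', 'count'] and ['id', 'new_count', 'count'] differ, instead of returning a dataframe with misplaced values.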
Issue Links
- Is contained by: SPARK-9813 Incorrect UNION ALL behavior (Resolved)