Details
Type: Bug
Status: Resolved
Priority: Minor
Resolution: Cannot Reproduce
Affects Version: 2.2.1
-
None
-
None
Description
When two DataFrames differ only in the metadata of fields inside a struct, I cannot union them, yet the error message shows their types as identical:
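from pyspark.sql.functions import col, struct

df = spark.createDataFrame([{'a': 1}])
# a: struct column whose inner field 'a' has no metadata
a = df.select(struct('a').alias('x'))
# b: same struct, but the inner field 'a' carries metadata
b = df.select(col('a').alias('a', metadata={'description': 'xxx'})).select(struct(col('a')).alias('x'))
a.union(b).printSchema()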
gives:
An error occurred while calling o1076.union.
: org.apache.spark.sql.AnalysisException: Union can only be performed on tables with the compatible column types. struct<a:bigint> <> struct<a:bigint> at the first column of the second table
and this part:
struct<a:bigint> <> struct<a:bigint>
does not make any sense because those are the same.
Since the metadata must match for the union to succeed, it should be included in the error message.
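Until the message carries that detail, the difference can be located by hand. The following is only a rough sketch, not part of Spark: the helper names `collect_metadata` and `metadata_diff` are hypothetical, and the two dictionaries are hand-written stand-ins for what `a.schema.jsonValue()` and `b.schema.jsonValue()` are assumed to return for the DataFrames above. It descends into nested structs only, not into arrays or maps.

```python
def collect_metadata(schema_json, prefix=""):
    """Map each field path to its metadata dict, descending into nested structs."""
    found = {}
    for field in schema_json.get("fields", []):
        path = prefix + field["name"]
        found[path] = field.get("metadata") or {}
        field_type = field["type"]
        # Nested structs are serialized as dicts; simple types are plain strings.
        if isinstance(field_type, dict) and field_type.get("type") == "struct":
            found.update(collect_metadata(field_type, path + "."))
    return found


def metadata_diff(left_json, right_json):
    """Return {path: (left_metadata, right_metadata)} for paths whose metadata differ."""
    left = collect_metadata(left_json)
    right = collect_metadata(right_json)
    return {
        path: (left.get(path), right.get(path))
        for path in sorted(set(left) | set(right))
        if left.get(path) != right.get(path)
    }


def make_schema(inner_metadata):
    """Hand-written stand-in (assumption) for the JSON form of struct<a:bigint> in column x."""
    inner = {"name": "a", "type": "long", "nullable": True, "metadata": inner_metadata}
    return {
        "type": "struct",
        "fields": [
            {
                "name": "x",
                "type": {"type": "struct", "fields": [inner]},
                "nullable": False,
                "metadata": {},
            }
        ],
    }


schema_a = make_schema({})
schema_b = make_schema({"description": "xxx"})

print(metadata_diff(schema_a, schema_b))
```

With a live session, the same helper could be called as `metadata_diff(a.schema.jsonValue(), b.schema.jsonValue())`, which would point at `x.a` as the field whose metadata differs.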