Currently, unionByName works with two DataFrames whose schemas differ slightly. It would be good if it also worked with array-of-struct columns.
unionByName fails when we try to merge DataFrames that contain an array-of-struct column whose struct schemas differ slightly.
Below is an example.
Step 1: Create a DataFrame arrayStructDf1 with a column booksIntersted of type array of struct.
Step 2: Create another DataFrame arrayStructDf2 with a column booksIntersted of type array of struct, where the struct contains an extra field called "new_column".
Step 3: Merge arrayStructDf1 and arrayStructDf2 using unionByName.
The merge fails with the error org.apache.spark.sql.AnalysisException: Union can only be performed on tables with the compatible column types.
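The three steps can be sketched in PySpark roughly as follows. This is a reconstruction under assumptions, not the reporter's original code: the struct field names (name, author, pages), the sample rows, and the helper name reproduce are made up for illustration, and the union is assumed to be called with allowMissingColumns=True, the option that enables null-filling for missing columns.

```python
# Schemas as DDL strings. The struct field names (name, author, pages) are
# assumptions; only booksIntersted and new_column come from the report.
SCHEMA_1 = "name string, booksIntersted array<struct<name:string,author:string,pages:int>>"
SCHEMA_2 = (
    "name string, booksIntersted "
    "array<struct<name:string,author:string,pages:int,new_column:string>>"
)

# Illustrative sample rows; each struct is written as a plain tuple.
ROWS_1 = [("James", [("Java", "XX", 120), ("Scala", "XA", 300)])]
ROWS_2 = [("Michael", [("Java", "XY", 200, "x"), ("Scala", "XB", 500, "y")])]


def reproduce():
    """Hypothetical helper that walks through steps 1 to 3."""
    # Imported here so the definitions above can be inspected without Spark.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("unionByName-array-of-struct")
        .getOrCreate()
    )

    # Step 1: array of struct with three fields.
    arrayStructDf1 = spark.createDataFrame(ROWS_1, schema=SCHEMA_1)
    # Step 2: same column, but the struct has the extra field new_column.
    arrayStructDf2 = spark.createDataFrame(ROWS_2, schema=SCHEMA_2)
    # Step 3: on affected versions this call is expected to raise the
    # AnalysisException quoted above.
    return arrayStructDf1.unionByName(arrayStructDf2, allowMissingColumns=True)
```

Calling reproduce() in an environment with PySpark installed should either raise the exception or, if the requested behaviour is supported, return the merged DataFrame.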
unionByName should fill the missing fields with null, as it already does for columns of struct type.
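To make the requested behaviour concrete, here is a small plain-Python model of the expected result. It is not Spark code: structs are modelled as dicts, and the function name fill_missing_fields and the sample values are hypothetical.

```python
def fill_missing_fields(books, field_names):
    """Return copies of the struct-like dicts with every field present.

    Fields absent from a struct are filled with None (null).
    """
    return [{field: book.get(field) for field in field_names} for book in books]


# Struct fields after the union: the wider of the two schemas.
merged_fields = ["name", "author", "pages", "new_column"]

books_df1 = [{"name": "Java", "author": "XX", "pages": 120}]
books_df2 = [{"name": "Scala", "author": "XB", "pages": 500, "new_column": "y"}]

print(fill_missing_fields(books_df1, merged_fields))
# [{'name': 'Java', 'author': 'XX', 'pages': 120, 'new_column': None}]
print(fill_missing_fields(books_df2, merged_fields))
# [{'name': 'Scala', 'author': 'XB', 'pages': 500, 'new_column': 'y'}]
```

In this model, elements coming from arrayStructDf1 gain new_column with a null value, while elements from arrayStructDf2 are unchanged.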