Details
Type: Bug
Status: In Progress
Priority: Major
Resolution: Unresolved
Affects Version/s: 3.0.2, 3.1.2, 3.2.0
Description
After the work in https://github.com/apache/spark/pull/31368 to simplify Hive view resolution,
I found a bug that arises because Hive allows you to change the order of the fields inside a struct.
1) Create a table in Hive with a struct column:
CREATE TABLE test_struct (id INT, sub STRUCT<a:INT, b:STRING>);
2) Insert data into it:
INSERT INTO TABLE test_struct select 1, named_struct("a",1,"b","v1");
3) Create a view on top of the table:
CREATE VIEW test_view_struct AS SELECT id, sub FROM test_struct;
4) Alter the table, reordering the fields of the struct:
ALTER TABLE test_struct CHANGE COLUMN sub sub STRUCT<b:STRING, a:INT>;
5) Spark can no longer query the view, because Spark matches struct fields by position, not by field name.
If the reordered field types happen to be castable to each other, the failure can even be silent: the query succeeds but returns values under the wrong field names.
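To illustrate the difference between the two matching strategies, here is a minimal sketch in plain Python. It is a toy model only, not Spark or Hive code: the helper names read_by_position and read_by_name are hypothetical, Python's int and str stand in for INT and STRING casts, and the second scenario (a struct with two INT fields) is an assumption added to show the silent case.

```python
# Toy model only: plain Python standing in for struct field resolution.
# Nothing here calls Spark or Hive; all names are hypothetical.

def read_by_position(stored_fields, stored_values, view_fields):
    """Pair stored values with the view's fields by ordinal.

    stored_fields is accepted only to mirror read_by_name; the stored
    field names are deliberately ignored, which is the point of the bug.
    """
    result = {}
    for (view_name, view_type), value in zip(view_fields, stored_values):
        # A failing cast raises ValueError, modelling a query that fails loudly.
        result[view_name] = view_type(value)
    return result


def read_by_name(stored_fields, stored_values, view_fields):
    """Pair stored values with the view's fields by field name."""
    by_name = {name: value for (name, _), value in zip(stored_fields, stored_values)}
    return {name: view_type(by_name[name]) for name, view_type in view_fields}


# Scenario from the report: the view was created against STRUCT<a:INT, b:STRING>,
# and the table now exposes STRUCT<b:STRING, a:INT>.
view_fields = [("a", int), ("b", str)]
stored_fields = [("b", str), ("a", int)]
stored_values = ("v1", 1)

try:
    read_by_position(stored_fields, stored_values, view_fields)
    loud_failure = False
except ValueError:
    loud_failure = True  # "v1" cannot be cast to an integer

# Assumed variant: both fields are INT, so every positional cast succeeds.
castable_view = [("a", int), ("b", int)]
castable_stored = [("b", int), ("a", int)]
castable_values = (2, 1)  # b = 2, a = 1

swapped = read_by_position(castable_stored, castable_values, castable_view)
correct = read_by_name(castable_stored, castable_values, castable_view)
```

In this model the first scenario fails loudly, while the assumed all-INT variant quietly returns the values of a and b swapped under positional matching; matching by name returns them correctly in both cases.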