Details
Bug
Status: Resolved
Major
Resolution: Fixed
0.9.0
None
Description
I think https://issues.apache.org/jira/browse/SPARK-28008 is a good description of what is happening.
This can lead to a situation where the schema in the MOR log files is incompatible with the schema produced by RowBasedSchemaProvider, so compactions stall.
I have a fix, which is a bit hacky: post-process the schema produced by the converter so that
1) unions with null types have those null types at position 0
2) fields with such unions have their default value set to null
I couldn't find a clean fix, because some of the problematic classes come from Hive and are called from Spark.
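
For illustration only, here is a rough sketch of the kind of post-processing described above. It is not the actual patch: it works on an Avro schema as parsed JSON (plain Python dicts and lists) rather than on the Avro Java classes, and the names `nulls_first`, `fix_field` and `is_null` are made up for this example. It assumes that any record field whose union contains null should get a null default, replacing whatever default was there.

```python
def is_null(branch):
    # A null branch is either the bare name "null" or {"type": "null"}.
    return branch == "null" or (isinstance(branch, dict) and branch.get("type") == "null")


def nulls_first(schema):
    """Return a copy of the schema with null branches moved to position 0 of every union."""
    if isinstance(schema, list):  # an Avro union is a JSON array
        branches = [nulls_first(b) for b in schema]
        return [b for b in branches if is_null(b)] + [b for b in branches if not is_null(b)]
    if isinstance(schema, dict):
        out = dict(schema)
        kind = out.get("type")
        if kind == "record":
            out["fields"] = [fix_field(f) for f in out.get("fields", [])]
        elif kind == "array":
            out["items"] = nulls_first(out["items"])
        elif kind == "map":
            out["values"] = nulls_first(out["values"])
        elif isinstance(kind, (dict, list)):  # wrapped nested schema
            out["type"] = nulls_first(kind)
        return out
    return schema  # primitive name or reference to a named type


def fix_field(field):
    """Reorder the field's type and set default=null when the union starts with null."""
    out = dict(field)
    out["type"] = nulls_first(field["type"])
    if isinstance(out["type"], list) and out["type"] and is_null(out["type"][0]):
        out["default"] = None  # serialized as JSON null
    return out


schema = {
    "type": "record",
    "name": "example",
    "fields": [
        {"name": "a", "type": ["string", "null"]},
        {"name": "b", "type": "int"},
        {"name": "c", "type": {"type": "array", "items": ["int", "null"]}},
    ],
}
fixed = nulls_first(schema)
```

In this example, field `a` ends up with type `["null", "string"]` and a null default, field `b` is left alone, and the union inside the array items of `c` is reordered without adding a default, since defaults only apply to record fields.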