Description
InsertIntoHadoopFsRelation implements Hive compatible dynamic partitioning insertion, which uses String.valueOf to encode partition column values into dynamic partition directory names. This limits the data types that can be used in partition columns. For example, the string representation of StructType values is not well defined. However, this limitation is not explicitly enforced.
There are several things we can improve:
- Enforce dynamic partition column data type requirements by adding analysis rules that throw AnalysisException when a violation occurs.
- Abstract away string representation of various data types, so that we don't need to convert internal representation types (e.g. UTF8String) to external types (e.g. String). A set of Hive compatible implementations should be provided to ensure compatibility with Hive.
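
To make the two proposals concrete, here is a minimal, language-agnostic sketch in Python. It is not Spark code: `DataType`, `AnalysisError`, `ENCODERS`, `escape_path_name`, `validate_partition_columns` and `partition_path` are all hypothetical names, the set of escaped characters is a simplification, and the use of `__HIVE_DEFAULT_PARTITION__` for null values is an assumption about the desired Hive-compatible behaviour. It shows a per-type encoder table replacing a generic to-string call, and an up-front check that rejects partition columns whose type has no encoder.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class DataType:
    """Hypothetical stand-in for a SQL data type (not Spark's class)."""
    name: str


StringType = DataType("string")
IntegerType = DataType("int")
BooleanType = DataType("boolean")
StructType = DataType("struct")


class AnalysisError(Exception):
    """Hypothetical stand-in for AnalysisException."""


# Assumption: null partition values map to Hive's default partition name.
DEFAULT_PARTITION_NAME = "__HIVE_DEFAULT_PARTITION__"

# One encoder per supported type. A type without an entry (e.g. struct)
# has no well-defined string form and is rejected by the validator.
ENCODERS: Dict[DataType, Callable[[Any], str]] = {
    StringType: lambda v: v,
    IntegerType: lambda v: str(v),
    # Python's str(True) is "True"; the lowercase form is spelled out here.
    BooleanType: lambda v: "true" if v else "false",
}

# Simplified set of characters to percent-encode in directory names.
UNSAFE_CHARS = set('/=:%#?\\"\'*')


def escape_path_name(value: str) -> str:
    """Percent-encode characters that are unsafe in a directory name."""
    return "".join(
        "%{:02X}".format(ord(c)) if c in UNSAFE_CHARS or ord(c) < 0x20 else c
        for c in value
    )


def validate_partition_columns(schema: Dict[str, DataType],
                               partition_cols: List[str]) -> None:
    """Fail early if any partition column has an unsupported type."""
    for col in partition_cols:
        if schema[col] not in ENCODERS:
            raise AnalysisError(
                f"Cannot use {schema[col].name} column '{col}' as a partition column"
            )


def partition_path(schema: Dict[str, DataType],
                   partition_cols: List[str],
                   row: Dict[str, Any]) -> str:
    """Build a 'col=value/col=value' directory path for one row."""
    validate_partition_columns(schema, partition_cols)
    parts = []
    for col in partition_cols:
        value = row[col]
        if value is None:
            encoded = DEFAULT_PARTITION_NAME
        else:
            encoded = escape_path_name(ENCODERS[schema[col]](value))
        parts.append(f"{col}={encoded}")
    return "/".join(parts)
```

In this sketch the validation runs before any path is produced, which mirrors the idea of failing at analysis time rather than writing directories with ill-defined names. Keeping the encoders in a table keyed by data type also means an internal representation could be encoded directly, without first converting it to an external type.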