JsonSerDe is too strict about schema: it errors out if it finds a subfield whose key name does not map to a declared column of the table schema or of an inner-struct schema.
Thus, if a schema specifies "s:struct<a:int,b:string>,k:int" and we pass it data that contains additional keys not declared in the table schema or in the inner struct "s", this should still pass, and the record should be read as if the undeclared keys were not present.
This will allow the JsonSerDe to be used with a wider set of data, where the data does not map exactly onto the declared table schema.
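The lenient behavior described above can be sketched roughly as follows. This is a minimal Python illustration, not the JsonSerDe implementation: the `SCHEMA` dictionary, the `project` helper, and the sample record are all hypothetical, and reading a missing declared column as `None` is an assumption made for the sketch.

```python
import json

# Hypothetical stand-in for the declared schema "s:struct<a:int,b:string>,k:int".
SCHEMA = {"s": {"a": int, "b": str}, "k": int}


def project(value, schema):
    """Keep only the keys declared in schema, recursing into nested structs."""
    out = {}
    for name, declared in schema.items():
        field = value.get(name)
        if isinstance(declared, dict) and isinstance(field, dict):
            # Inner struct: apply the same rule to its subfields.
            out[name] = project(field, declared)
        else:
            # Assumption for this sketch: a missing declared column reads as None.
            out[name] = field
    return out


# Illustrative input only: "c" and "x" are not declared anywhere in SCHEMA.
raw = '{"s": {"a": 1, "b": "blah", "c": 3.5}, "k": 2, "x": [1, 2]}'
record = project(json.loads(raw), SCHEMA)
print(record)  # {'s': {'a': 1, 'b': 'blah'}, 'k': 2}
```

The undeclared keys are dropped at both the top level and inside the struct, so the record reads as if they had never been in the data.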
Note, we are still strict about a couple of things:
a) If a column is declared in the schema, its type cannot vary; that is still considered an error. For example, if the Hive table schema says k1 is a boolean, the data cannot supply an int or a struct for it.
b) The JsonSerDe still attempts to map Hive internal column names. That is, if the data contains a column named "_col2" and "_col2" is not declared directly in the schema, it maps to column position 2 in that schema/subschema rather than being ignored. This is so that tables created with CTAS still work.
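The two strict rules can be sketched in the same spirit. Again this is a hypothetical Python illustration rather than the actual JsonSerDe code: `resolve_column`, `read_record`, and `COLUMNS` are invented names, and treating the number in "_colN" as a zero-based position is an assumption of the sketch.

```python
import re

# Hypothetical declared columns as (name, type) pairs, in schema order.
COLUMNS = [("k1", bool), ("k2", str), ("k3", int)]


def resolve_column(key, columns):
    """Return the index of the declared column that key refers to, or None."""
    names = [name for name, _ in columns]
    if key in names:
        return names.index(key)  # declared directly in the schema
    match = re.fullmatch(r"_col(\d+)", key)
    if match and int(match.group(1)) < len(columns):
        # Internal name: map by position (assumed zero-based here).
        return int(match.group(1))
    return None  # undeclared key, to be ignored


def read_record(data, columns):
    """Build a row in schema order, enforcing rules (a) and (b)."""
    row = [None] * len(columns)
    for key, value in data.items():
        idx = resolve_column(key, columns)
        if idx is None:
            continue  # lenient part: unknown keys are skipped
        name, declared = columns[idx]
        # Rule (a): a declared column may not change type.
        if value is not None and type(value) is not declared:
            raise TypeError(f"column {name} expects {declared.__name__}")
        row[idx] = value
    return row


row = read_record({"k1": True, "_col2": 7, "junk": "ignored"}, COLUMNS)
print(row)  # [True, None, 7]
```

In this sketch, passing `{"k1": 1}` raises `TypeError`, because an int is supplied where the schema declares a boolean.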
|Status||Patch Available||Resolved|
|Fix Version/s||0.13.0|
|Resolution||Fixed|