Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Description
When a text table is read with vectorization enabled and hive.fetch.task.conversion set to none, delimiters are parsed incorrectly in nested complex types that contain a map. For example, if a column's schema is map<string,struct<id:string,name:string>>, a \u0004 character appears in the output. Here is an example:
Sample q file:
set hive.fetch.task.conversion=none;
set hive.vectorized.execution.enabled=true;
create EXTERNAL table `table4` as
select
  'bob' as name,
  map(
    "Map_Key1", named_struct('Id', 'Id_Value1', 'Name', 'Name_Value1'),
    "Map_Key2", named_struct('Id', 'Id_Value2', 'Name', 'Name_Value2')
  ) as testmarks;
select * from table4;
set hive.vectorized.execution.enabled=false;
select * from table4;
Output of 1st select statement:
bob· {"Map_Key1":{"id":"Id_Value1\u0004Name_Value1","name":null},"Map_Key2":{"id":"Id_Value2\u0004Name_Value2","name":null}}
Output of 2nd select statement:
bob· {"Map_Key1":{"id":"Id_Value1","name":"Name_Value1"},"Map_Key2":{"id":"Id_Value2","name":"Name_Value2"}}
The MAP complex type does not handle the case where it contains a nested complex type such as STRUCT, ARRAY, or UNION.
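One way to picture the symptom is the toy model below. It is only a sketch, not Hive's deserializer: it assumes the default text separators are \u0002 between map entries, \u0003 between a map key and its value, and \u0004 between struct fields nested inside a map value. The function name parse_map and the split_struct flag are hypothetical. When the nested separator is not applied, the whole struct lands in the first field and the second field is null, which matches the output of the 1st select statement.

```python
# Toy model of the text layout described above; NOT Hive's actual code.
# Separator values are assumptions about the default text format.
ENTRY_SEP = "\u0002"   # assumed: separates map entries
KV_SEP = "\u0003"      # assumed: separates a map key from its value
STRUCT_SEP = "\u0004"  # assumed: separates struct fields inside a map value

FIELDS = ("id", "name")


def parse_map(raw, split_struct):
    """Parse a serialized map<string,struct<id,name>> value.

    split_struct=False mimics a reader that ignores the nested separator.
    """
    result = {}
    for entry in raw.split(ENTRY_SEP):
        key, value = entry.split(KV_SEP, 1)
        parts = value.split(STRUCT_SEP) if split_struct else [value]
        # Fields that were not found are filled with None (null).
        parts += [None] * (len(FIELDS) - len(parts))
        result[key] = dict(zip(FIELDS, parts))
    return result


# Serialized form of the testmarks column from the sample q file.
raw = ENTRY_SEP.join(
    f"Map_Key{i}{KV_SEP}Id_Value{i}{STRUCT_SEP}Name_Value{i}" for i in (1, 2)
)

buggy = parse_map(raw, split_struct=False)  # shape of the vectorized output
fixed = parse_map(raw, split_struct=True)   # shape of the non-vectorized output
```

Here buggy["Map_Key1"]["id"] still carries the embedded \u0004 character, while fixed splits it into separate id and name values.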
To reproduce this issue:
mvn test -Dtest=TestCliDriver -Pitests -Dqfile=`qfile_name` -pl itests/qtest -Dtest.output.overwrite