Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
Currently Hive only supports serializing data to UTF-8 bytes and deserializing data from UTF-8 bytes, but real-world users may want to load data in other encodings into Hive directly. This JIRA adds support for serializing and deserializing data in other character encodings at the SerDe layer.
Users only need to configure the serialization encoding at the table level by setting the "serialization.encoding" SerDe property, for example:
CREATE TABLE person (id INT, name STRING, desc STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' WITH SERDEPROPERTIES ("serialization.encoding"='GBK');
or
ALTER TABLE person SET SERDEPROPERTIES ('serialization.encoding'='GBK');
LIMITATIONS: Only LazySimpleSerDe supports the "serialization.encoding" property in this patch.
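To illustrate the idea, here is a minimal Java sketch of the kind of charset conversion a SerDe would need to perform when the configured encoding is not UTF-8. It is not the code from this patch: the class name EncodingSketch and the helper methods toUtf8 and fromUtf8 are hypothetical, and the sketch assumes the JDK in use provides the GBK charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of Hive or of this patch.
class EncodingSketch {

    // Deserialization direction: bytes stored in the table's encoding
    // are decoded and re-encoded as UTF-8 for internal processing.
    static byte[] toUtf8(byte[] raw, Charset tableCharset) {
        String text = new String(raw, tableCharset);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Serialization direction: internal UTF-8 bytes are decoded and
    // re-encoded in the table's configured encoding before being written.
    static byte[] fromUtf8(byte[] utf8, Charset tableCharset) {
        String text = new String(utf8, StandardCharsets.UTF_8);
        return text.getBytes(tableCharset);
    }

    public static void main(String[] args) {
        // Assumes the GBK charset is available in this JDK.
        Charset gbk = Charset.forName("GBK");
        String sample = "\u4e2d\u6587"; // two Chinese characters

        byte[] stored = sample.getBytes(gbk);     // what a GBK data file would contain
        byte[] internal = toUtf8(stored, gbk);    // what the engine would work with
        byte[] written = fromUtf8(internal, gbk); // what would be written back

        System.out.println("GBK bytes:   " + stored.length);
        System.out.println("UTF-8 bytes: " + internal.length);
        System.out.println("Round trip intact: "
                + java.util.Arrays.equals(stored, written));
    }
}
```

In this sketch the charset is passed in explicitly; in a table configured as shown above, its name would come from the "serialization.encoding" property.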
Attachments
Issue Links
- is related to: HIVE-15826 Add 'serialization.encoding' To All SerDes (Resolved)
- links to