When I create a table with ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files containing Chinese text encoded in GBK:
create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr string,
num_jrn_no string, cod_trc_form_typ string, id_intl_ip string, name string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES ("field.delim"="|!", "serialization.encoding"="GBK");
load data local inpath '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-00000' overwrite into table PersonInfo;
The Chinese text in the table comes out garbled, and 'serialization.encoding' appears to have no effect. The garbled data looks like this:
���� 99999999�ϴ����������� 0624624002��ʱ���������� |
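The replacement characters above look like what you get when GBK bytes are decoded as UTF-8. That is only my guess at the cause, not something confirmed from the Hive side. A minimal Python sketch outside Hive shows the same symptom; the sample string and the decode_field helper are made up for illustration and are not taken from the data file:

```python
# Hypothetical sample row: two Chinese fields separated by the "|!" delimiter.
original = "中文|!测试"


def decode_field(raw: bytes, encoding: str) -> str:
    """Decode raw bytes, substituting U+FFFD for any invalid sequence."""
    return raw.decode(encoding, errors="replace")


# Bytes as they would sit in a GBK-encoded file.
raw = original.encode("gbk")

# Assumed failure mode: the reader ignores the configured encoding and uses UTF-8.
as_utf8 = decode_field(raw, "utf-8")

# Expected behaviour: the reader honours the configured encoding.
as_gbk = decode_field(raw, "gbk")

print(repr(as_utf8))  # Chinese characters become U+FFFD; ASCII bytes survive
print(as_gbk)         # round-trips to the original text
```

If this is what happens, the ASCII parts of each row (digits, the delimiter) stay readable while every Chinese character turns into replacement characters, which matches the output above.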