Hive / HIVE-12653

The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2.1
    • Fix Version/s: 2.1.0
    • Component/s: Contrib
    • Labels:
      None
    • Release Note:
      Add support for the 'serialization.encoding' property, including the GBK charset, to the class 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'. Please test it.

      Description

      When I create a table with ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files containing Chinese text encoded in GBK:
      create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr string,
      num_jrn_no string, cod_trc_form_typ string,id_intl_ip string, name string )
      ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
      WITH SERDEPROPERTIES ("field.delim"="|!","serialization.encoding"='GBK');

      load data local inpath '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-00000' overwrite into table PersonInfo;

      The Chinese text in the table comes out garbled, and 'serialization.encoding' has no effect. The garbled data looks like this:

      ���� 99999999�ϴ����������� 0624624002��ʱ����������
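
      The symptom above is consistent with GBK bytes being decoded under a different charset. The following is a minimal Python sketch of that effect, not the SerDe's actual code path. It assumes the reader falls back to UTF-8 when 'serialization.encoding' is ignored, and the sample row is hypothetical rather than taken from the reporter's file.

```python
# Hypothetical sample row using the "|!" field delimiter from the table definition.
original = "张三|!北京"

# Bytes as they would be stored in a GBK-encoded data file.
gbk_bytes = original.encode("gbk")

# Assumption: the reader ignores 'serialization.encoding' and decodes as UTF-8.
# Invalid byte sequences are replaced with U+FFFD, the replacement character.
wrong = gbk_bytes.decode("utf-8", errors="replace")

# Decoding with the charset the file was actually written in recovers the text.
right = gbk_bytes.decode("gbk")

# Splitting on the multi-character delimiter then yields the original fields.
fields = right.split("|!")

print(repr(wrong))
print(fields)
```

      In this sketch the ASCII delimiter survives the wrong decoding while every Chinese character is lost, which matches the mix of readable digits and replacement characters in the reported output.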

        Attachments

        1. HIVE-12653.2.patch
          3 kB
          yangfang
        2. HIVE-12653.3.patch
          4 kB
          yangfang
        3. HIVE-12653.patch
          3 kB
          yangfang
        4. HIVE-12653.patch
          3 kB
          yangfang

            People

            • Assignee: yangfang
            • Reporter: yangfang
            • Votes: 0
            • Watchers: 3
