Hive / HIVE-7142

Hive multi serialization encoding support


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.14.0
    • Component/s: None

    Description

      Currently, Hive can only serialize data into UTF-8 encoded bytes and deserialize data from UTF-8 bytes. Real-world users may want to load data in other encodings into Hive directly. This JIRA adds support for serializing and deserializing data in other encodings at the SerDe layer.

      Users only need to configure the serialization encoding at the table level by setting the "serialization.encoding" SerDe property, for example:

      CREATE TABLE person(id INT, name STRING, desc STRING)
      ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
      WITH SERDEPROPERTIES ("serialization.encoding"='GBK');
      

      or

      ALTER TABLE person SET SERDEPROPERTIES ('serialization.encoding'='GBK'); 
      

      LIMITATIONS: In this patch, only LazySimpleSerDe supports the "serialization.encoding" property.
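
      To illustrate the idea, here is a minimal Java sketch of the kind of charset conversion a SerDe would need to perform around its UTF-8 internals: decode stored bytes from the table's encoding on read, and encode back on write. This is not the code from the attached patches; the class name EncodingSketch and the helpers toUtf8 and fromUtf8 are hypothetical, and the sketch assumes the JDK in use ships the GBK charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodingSketch {

    // Hypothetical read-path helper: bytes stored in the table's encoding
    // are decoded and re-encoded as UTF-8.
    public static byte[] toUtf8(byte[] raw, Charset tableCharset) {
        return new String(raw, tableCharset).getBytes(StandardCharsets.UTF_8);
    }

    // Hypothetical write-path helper: UTF-8 bytes are decoded and re-encoded
    // in the table's encoding before being written out.
    public static byte[] fromUtf8(byte[] utf8, Charset tableCharset) {
        return new String(utf8, StandardCharsets.UTF_8).getBytes(tableCharset);
    }

    public static void main(String[] args) {
        // Assumption: the charset name matches the "serialization.encoding" value.
        Charset gbk = Charset.forName("GBK");

        String name = "\u4e2d\u6587";            // two CJK characters
        byte[] stored = name.getBytes(gbk);      // what a GBK data file would contain

        byte[] utf8 = toUtf8(stored, gbk);       // deserialize direction
        byte[] roundTrip = fromUtf8(utf8, gbk);  // serialize direction

        System.out.println(new String(utf8, StandardCharsets.UTF_8).equals(name));
        System.out.println(Arrays.equals(roundTrip, stored));
    }
}
```

      Going through an intermediate String keeps the sketch short; a real SerDe would more likely avoid the extra copy and skip the conversion entirely when the configured encoding is already UTF-8.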

      Attachments

        1. HIVE-7142.4.patch
          12 kB
          Chengxiang Li
        2. HIVE-7142.3.patch
          8 kB
          Chengxiang Li
        3. HIVE-7142.2.patch
          8 kB
          Chengxiang Li
        4. HIVE-7142.1.patch.txt
          8 kB
          Chengxiang Li


          People

            Assignee: Chengxiang Li
            Reporter: Chengxiang Li
            Votes: 0
            Watchers: 9
