Hive / HIVE-5871

Use multiple-characters as field delimiter

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version: 0.12.0
    • Fix Version: 0.14.0
    • Component: Contrib

    Description

      By default, Hive only allows a single character as the field delimiter. Although RegexSerDe can handle multiple-character delimiters, it can be daunting to use, especially for beginners.
      The patch adds a new SerDe named MultiDelimitSerDe. With MultiDelimitSerDe, users can specify a multiple-character field delimiter when creating tables, using syntax very close to an ordinary table creation. For example:

      create table test (
        id string,
        hivearray array<binary>,
        hivemap map<string,int>
      )
      ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
      WITH SERDEPROPERTIES ("field.delim"="[,]", "collection.delim"=":", "mapkey.delim"="@");
      

      where field.delim is the field delimiter, and collection.delim and mapkey.delim are the delimiters for collection items and key-value pairs, respectively. Of these, field.delim is mandatory and may be multiple characters long, while collection.delim and mapkey.delim are optional and support only a single character.
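
      To make the roles of the three delimiters concrete, here is a minimal Python sketch of how one text row for the table above could be broken apart. It is not the SerDe's actual implementation: the function name parse_row and the sample row are hypothetical, the field delimiter is assumed to be matched as a literal string rather than a regular expression, and the array<binary> items are left as raw strings.

```python
def parse_row(line, field_delim="[,]", collection_delim=":", mapkey_delim="@"):
    """Illustrative model only; parse_row is a hypothetical helper, not Hive code."""
    # Assumption: the multi-character field delimiter is a literal string.
    id_field, array_field, map_field = line.split(field_delim)

    # collection.delim separates the items of the array column.
    array_items = array_field.split(collection_delim)

    # collection.delim also separates map entries; mapkey.delim splits key from value.
    map_items = {}
    for entry in map_field.split(collection_delim):
        key, value = entry.split(mapkey_delim, 1)
        map_items[key] = int(value)  # hivemap is declared as map<string,int>

    return id_field, array_items, map_items


# Hypothetical sample row: id, then three array items, then two map entries.
row = parse_row("1[,]a:b:c[,]k1@10:k2@20")
print(row)
```

      In this sketch the row is first cut on "[,]", and only then are the single-character delimiters applied inside the collection columns, which mirrors the nesting described above.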

      To use MultiDelimitSerDe, you have to add the hive-contrib jar to the class path, e.g. with the add jar command.

      Attachments

        1. HIVE-5871.2.patch (16 kB, Rui Li)
        2. HIVE-5871.3.patch (16 kB, Rui Li)
        3. HIVE-5871.4.patch (16 kB, Rui Li)
        4. HIVE-5871.5.patch (17 kB, Rui Li)
        5. HIVE-5871.6.patch (17 kB, Rui Li)
        6. HIVE-5871.patch (16 kB, Rui Li)

            People

              Assignee: Rui Li (lirui)
              Reporter: Rui Li (lirui)
              Votes: 0
              Watchers: 9