Hive / HIVE-14867

"serialization.last.column.takes.rest" does not work for MultiDelimitSerDe


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.3.0
    • None
    • None

    Description

      Create a table with MultiDelimitSerDe:

      CREATE TABLE foo (a string, b string)
      ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
      WITH SERDEPROPERTIES ("field.delim"="|@|", "collection.delim"=":", "mapkey.delim"="@")
      STORED AS TEXTFILE;
      

      Load the following data into the table:

      1|@|Lily|@|HW|@|abc
      2|@|Lucy|@|LX|@|123
      3|@|Lilei|@|XX|@|3434
      

      Select data from the table:

      select * from foo;
      +---------+------------------+--+
      | foo.a   |      foo.b       |
      +---------+------------------+--+
      | 1       | Lily^AHW^Aabc    |
      | 2       | Lucy^ALX^A123    |
      | 3       | Lilei^AXX^A3434  |
      +---------+------------------+--+
      3 rows selected (0.905 seconds)
      

      The last column takes all the remaining data, and the "|@|" delimiter inside it is replaced with the default delimiter ^A.
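
To make the expected result concrete, here is a minimal standalone sketch of the two behaviours. It is not Hive code: splitRow is a hypothetical helper, and it assumes that with "serialization.last.column.takes.rest" unset the extra fields should simply be dropped, as is commonly described for LazySimpleSerDe.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class LastColumnSplitSketch {

    // Hypothetical helper (not part of Hive): splits one text row into numCols columns.
    // Assumes numCols >= 1.
    static String[] splitRow(String line, String delim, int numCols, boolean lastColumnTakesRest) {
        // Quote the delimiter so "|@|" is matched literally, not as a regex.
        String quoted = Pattern.quote(delim);
        // With a positive limit, String.split leaves the unsplit remainder in the last element.
        // With -1 it splits on every delimiter and keeps trailing empty fields.
        String[] parts = line.split(quoted, lastColumnTakesRest ? numCols : -1);
        // Truncate extra fields, or pad missing ones with null.
        return Arrays.copyOf(parts, numCols);
    }

    public static void main(String[] args) {
        String row = "1|@|Lily|@|HW|@|abc";
        // Assumed expected result for the table in this issue (flag unset).
        System.out.println(Arrays.toString(splitRow(row, "|@|", 2, false)));
        // Result if the flag were explicitly set to true.
        System.out.println(Arrays.toString(splitRow(row, "|@|", 2, true)));
    }
}
```

Under that assumption, the first row should come back as a = 1, b = Lily. The output above instead looks like the "takes rest" case, with the remaining delimiters additionally rewritten to ^A.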

      lastColumnTakesRest should be false by default, because the property is not set on this table:

          String lastColumnTakesRestString = tbl
              .getProperty(serdeConstants.SERIALIZATION_LAST_COLUMN_TAKES_REST);
          lastColumnTakesRest = (lastColumnTakesRestString != null && lastColumnTakesRestString
              .equalsIgnoreCase("true"));
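
The same check can be run in isolation to confirm the default. The sketch below uses a plain java.util.Properties in place of the table properties, and the literal key from the issue title in place of serdeConstants.SERIALIZATION_LAST_COLUMN_TAKES_REST. The class and method names are hypothetical.

```java
import java.util.Properties;

final class LastColumnTakesRestFlagSketch {

    // Literal key from the issue title; assumed to be the value of
    // serdeConstants.SERIALIZATION_LAST_COLUMN_TAKES_REST.
    static final String KEY = "serialization.last.column.takes.rest";

    // Mirrors the null-safe, case-insensitive check quoted above.
    static boolean lastColumnTakesRest(Properties tbl) {
        String value = tbl.getProperty(KEY);
        return value != null && value.equalsIgnoreCase("true");
    }

    public static void main(String[] args) {
        Properties tbl = new Properties();
        tbl.setProperty("field.delim", "|@|");
        // Property not set, as in the CREATE TABLE statement from this issue.
        System.out.println(lastColumnTakesRest(tbl)); // false

        tbl.setProperty(KEY, "TRUE");
        System.out.println(lastColumnTakesRest(tbl)); // true
    }
}
```

With the SERDEPROPERTIES from this issue the flag evaluates to false, so the "takes rest" result in the query output is not explained by this check alone.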
      

       


          People

            niklaus.xiao Niklaus Xiao