HIVE-15079

Hive cannot read Parquet string timestamps as TIMESTAMP data type


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • None
    • None
    • Hive
    • None

    Description

      The Hive wiki page on timestamps states that Hive can read string timestamps.

      https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-Timestamps

      Supported conversions:
      Integer numeric types: Interpreted as UNIX timestamp in seconds
      Floating point numeric types: Interpreted as UNIX timestamp in seconds with decimal precision
      Strings: JDBC compliant java.sql.Timestamp format "YYYY-MM-DD HH:MM:SS.fffffffff" (9 decimal place precision)
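
The string format quoted above is the one accepted by `java.sql.Timestamp.valueOf` in the JDK. As a minimal sketch (the class and method names `JdbcTimestampFormat`, `parse`, and `isJdbcTimestamp` are hypothetical and are not part of Hive), the following checks that the value used in the reproduction below is a valid JDBC timestamp string, so the failure is not caused by the string's format:

```java
import java.sql.Timestamp;

public class JdbcTimestampFormat {
    // Parses a string in the JDBC timestamp format "yyyy-mm-dd hh:mm:ss[.fffffffff]".
    // Timestamp.valueOf throws IllegalArgumentException for any other format.
    public static Timestamp parse(String s) {
        return Timestamp.valueOf(s);
    }

    // Returns true if Timestamp.valueOf accepts the string, false otherwise.
    public static boolean isJdbcTimestamp(String s) {
        try {
            Timestamp.valueOf(s);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The value inserted in the reproduction steps.
        System.out.println(parse("2016-07-17 14:42:18"));   // prints 2016-07-17 14:42:18.0

        // Up to nine fractional digits (nanoseconds) are accepted.
        System.out.println(parse("2016-07-17 14:42:18.123456789").getNanos());   // prints 123456789

        // An ISO-8601 style "T" separator is not part of the JDBC format.
        System.out.println(isJdbcTimestamp("2016-07-17T14:42:18"));   // prints false
    }
}
```

This only exercises the JDK parser; whether Hive applies the same parser on every code path is not something this sketch shows.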
      

      This works with text table formats, but when the table is stored as Parquet, the query throws the following exception:

      java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.hive.serde2.io.TimestampWritable
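
The message describes a failed downcast: a value that is a `Text` at runtime is being treated as a `TimestampWritable`. The plain-Java analogue below illustrates that kind of failure only. It does not reproduce Hive's code, and the nested `Text` and `TimestampWritable` classes are hypothetical stand-ins for the Hadoop and Hive classes named in the exception. It assumes, as the exception suggests, that the reader returns the type that was physically written while the consumer trusts the declared column type.

```java
public class CastMismatchSketch {
    // Hypothetical stand-in for org.apache.hadoop.io.Text.
    static class Text {
        final String value;
        Text(String value) { this.value = value; }
    }

    // Hypothetical stand-in for org.apache.hadoop.hive.serde2.io.TimestampWritable.
    static class TimestampWritable { }

    // Mimics a reader that returns the type that was physically stored (a string).
    static Object readStoredValue() {
        return new Text("2016-07-17 14:42:18");
    }

    // Mimics a consumer that trusts the declared column type (timestamp)
    // and casts without converting. Returns true if the cast fails.
    public static boolean castFails() {
        Object stored = readStoredValue();
        try {
            TimestampWritable ts = (TimestampWritable) stored;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(castFails());   // prints true
    }
}
```

Under that assumption, the string value would need an explicit conversion step rather than a cast before it could be used as a timestamp.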
      

      How to reproduce

      > create table t1 (id int, time string) stored as parquet;
      > insert into table t1 values (1,'2016-07-17 14:42:18');
      > alter table t1 replace columns (id int, time timestamp);
      > select * from t1;
      

      The above example will run fine if you use a TEXT format instead of PARQUET.

      This issue was raised on PARQUET-723.


          People

            Assignee: Unassigned
            Reporter: Sergio Peña (spena)
            Votes: 0
            Watchers: 13