Details
- Type: Bug
- Status: Closed
- Priority: Critical
- Resolution: Fixed
- 1.1.0
- None
Description
Drill fails to read the "timestamp" type in Parquet files generated by Hive.
How to reproduce:
1. Create an external Hive CSV table in Hive 1.0:
create external table type_test_csv ( id1 int, id2 string, id3 timestamp, id4 double ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE LOCATION '/xxx/testcsv';
2. Put sample data in the location of the external table above:
1,One,2015-01-01 00:01:00,1.0
2,Two,2015-01-02 00:02:00,2.0
3. Create a Parquet Hive table and populate it from the CSV table:
create external table type_test ( id1 int, id2 string, id3 timestamp, id4 double ) STORED AS PARQUET LOCATION '/xxx/type_test';
INSERT OVERWRITE TABLE type_test SELECT * FROM type_test_csv;
4. Query the Parquet files directly through the filesystem storage plugin:
> select * from dfs.`xxx/type_test`;
Error: SYSTEM ERROR: UnsupportedOperationException: Unable to get value vector class for minor type [FIXEDBINARY] and mode [OPTIONAL]
Fragment 0:0
[Error Id: fccfe8b2-6427-46e5-8bfd-cac639e526e8 on h3.poc.com:31010] (state=,code=0)
5. If the sample data contains only one row:
1,One,2015-01-01 00:01:00,1.0
then the error message becomes:
> select * from dfs.`xxx/type_test`;
Error: SYSTEM ERROR: UnsupportedOperationException: Unsupported type:INT96
[Error Id: b52b5d46-63a8-4be6-a11d-999a1b46c7c2 on h3.poc.com:31010] (state=,code=0)
Querying the same table through the Hive storage plugin works fine. The issue only affects the filesystem storage plugin.
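For context on the INT96 error: Hive writes "timestamp" columns to Parquet using the INT96 physical type, a 12-byte value that by convention holds 8 little-endian bytes of nanoseconds within the day followed by 4 little-endian bytes of Julian day number. The Python sketch below only illustrates that layout; it is not Drill's reader code. The function names are hypothetical, and it ignores any time zone adjustment the writer may have applied.

```python
import struct
from datetime import datetime, timedelta

# Julian day number that corresponds to 1970-01-01.
JULIAN_DAY_OF_UNIX_EPOCH = 2440588
UNIX_EPOCH = datetime(1970, 1, 1)


def decode_int96_timestamp(raw: bytes) -> datetime:
    """Decode a 12-byte INT96 timestamp (hypothetical helper, for illustration)."""
    if len(raw) != 12:
        raise ValueError("an INT96 value must be exactly 12 bytes")
    # "<qi": little-endian int64 (nanoseconds of day) + int32 (Julian day).
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    days = julian_day - JULIAN_DAY_OF_UNIX_EPOCH
    # datetime has microsecond resolution, so sub-microsecond digits are dropped.
    return UNIX_EPOCH + timedelta(days=days, microseconds=nanos_of_day // 1000)


def encode_int96_timestamp(value: datetime) -> bytes:
    """Inverse of decode_int96_timestamp, used here to build sample bytes."""
    delta = value - UNIX_EPOCH
    julian_day = delta.days + JULIAN_DAY_OF_UNIX_EPOCH
    nanos_of_day = (delta.seconds * 1_000_000 + delta.microseconds) * 1000
    return struct.pack("<qi", nanos_of_day, julian_day)


# Round-trip the first sample row's timestamp from step 2.
sample = encode_int96_timestamp(datetime(2015, 1, 1, 0, 1, 0))
decoded = decode_int96_timestamp(sample)
```

A reader that has no mapping for this physical type has no value vector to place the 12 bytes into, which is consistent with the two errors reported above.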
Issue Links
- relates to DRILL-2908 Support reading the Parquet int 96 type (Closed)