Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 1.8.0, 1.9.0, 1.10.0
Description
In drill-1.10, setting `store.parquet.reader.int96_as_timestamp`=true returns the expected data:
select voter_id,create_timestamp from dfs.`/user/hive/warehouse/voter_hive_parquet` limit 15;
+----------+-----------------------+
| voter_id | create_timestamp      |
+----------+-----------------------+
| 1        | 2016-10-23 20:03:58.0 |
| 2        | null                  |
| 3        | 2016-09-09 12:01:18.0 |
| 4        | 2017-03-06 20:35:55.0 |
| 5        | 2017-01-20 22:32:43.0 |
| 6        | 2016-10-22 05:46:12.0 |
| 7        | 2016-09-19 10:21:29.0 |
| 8        | null                  |
| 9        | 2016-07-23 13:39:02.0 |
| 10       | 2017-01-28 17:27:19.0 |
| 11       | 2016-10-23 10:55:44.0 |
| 12       | 2016-06-07 22:44:03.0 |
| 13       | 2016-05-04 13:59:20.0 |
| 14       | 2016-11-08 17:20:14.0 |
| 15       | 2016-05-14 11:23:53.0 |
+----------+-----------------------+
However, setting `store.parquet.reader.int96_as_timestamp`=false and converting the column with convert_from returns incorrect timestamps from the second "null" value onward:
select voter_id,convert_from(create_timestamp, 'TIMESTAMP_IMPALA') from dfs.`/user/hive/warehouse/voter_hive_parquet` limit 15;
+----------+-----------------------+
| voter_id | EXPR$1                |
+----------+-----------------------+
| 1        | 2016-10-23 20:03:58.0 |
| 2        | null                  |
| 3        | 2016-09-09 12:01:18.0 |
| 4        | 2017-03-06 20:35:55.0 |
| 5        | 2017-01-20 22:32:43.0 |
| 6        | 2016-10-22 05:46:12.0 |
| 7        | 2016-09-19 10:21:29.0 |
| 8        | 2016-07-23 13:39:02.0 |
| 9        | 2016-10-23 10:55:44.0 |
| 10       | 2016-06-07 22:44:03.0 |
| 11       | 2016-05-04 13:59:20.0 |
| 12       | 2016-11-08 17:20:14.0 |
| 13       | 2016-05-14 11:23:53.0 |
| 14       | 2016-06-20 16:18:51.0 |
| 15       | 2016-09-09 10:02:28.0 |
+----------+-----------------------+
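Notice that the timestamp for voter_id=9 shifts to voter_id=8, which is supposed to have a value of "null". All timestamps after voter_id=7 are incorrect: from voter_id=9 onward, each row shows the timestamp that belongs two rows later in the expected output (for example, voter_id=9 shows the value expected for voter_id=11). This issue is also reproducible on both drill-1.8.0 and drill-1.9.0.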
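For anyone reproducing this, the two runs can be sketched as a single session. This is only a sketch: it assumes the option is changed at session level with ALTER SESSION, which the report does not state, and it reuses the table path from the queries above.

```sql
-- Case 1: let the Parquet reader return INT96 columns as TIMESTAMP.
-- Assumption: the option is toggled per session; the report does not say how it was set.
ALTER SESSION SET `store.parquet.reader.int96_as_timestamp` = true;

SELECT voter_id, create_timestamp
FROM dfs.`/user/hive/warehouse/voter_hive_parquet`
LIMIT 15;

-- Case 2: read INT96 as raw binary and convert it explicitly.
-- This is the path that returns shifted timestamps after the second null.
ALTER SESSION SET `store.parquet.reader.int96_as_timestamp` = false;

SELECT voter_id, convert_from(create_timestamp, 'TIMESTAMP_IMPALA')
FROM dfs.`/user/hive/warehouse/voter_hive_parquet`
LIMIT 15;
```

Comparing the two result sets row by row on voter_id should show the first divergence at voter_id=8.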