Description
When the reader and writer timezones differ, PPD (predicate pushdown) evaluation does not apply the timezone offset when reading the min and max values. This can result in wrong PPD evaluation and hence incorrect query results.
Example:
Table written in US/Eastern timezone. All values in this table are "2007-08-01 00:00:00.0".
PPD disabled
hive> set hive.optimize.index.filter=false;
hive> select ORDER_DATE from ORDER_FACT_small where ORDER_DATE='2007-08-01 00:00:00.0' limit 1;
2007-08-01 00:00:00.0
OK
PPD enabled
set hive.optimize.index.filter=true;
select ORDER_DATE from ORDER_FACT_small where ORDER_DATE='2007-08-01 00:00:00.0' limit 1;
OK
No rows are returned when PPD is enabled (the reader timezone is UTC).
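The mismatch can be illustrated with a small, simplified model. This is only a sketch and not ORC's actual code: it assumes the min/max statistics hold the epoch milliseconds of the wall-clock value as interpreted in the writer's timezone, and that the reader converts the predicate literal using its own timezone. The helper names (to_epoch_millis, stripe_may_match) are hypothetical, and US/Eastern is modeled as a fixed UTC-4 offset because daylight time was in effect on 2007-08-01.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed offsets stand in for the real zones on 2007-08-01.
WRITER_TZ = timezone(timedelta(hours=-4))  # US/Eastern (EDT) on that date
READER_TZ = timezone.utc                   # reader runs in UTC


def to_epoch_millis(wall_clock: str, tz: timezone) -> int:
    """Interpret a wall-clock timestamp in tz and return epoch milliseconds."""
    naive = datetime.strptime(wall_clock, "%Y-%m-%d %H:%M:%S")
    return int(naive.replace(tzinfo=tz).timestamp() * 1000)


def stripe_may_match(stat_min: int, stat_max: int, literal: int) -> bool:
    """Hypothetical min/max check: keep the stripe only if literal is in range."""
    return stat_min <= literal <= stat_max


VALUE = "2007-08-01 00:00:00"

# Every row holds the same value, so min == max in the statistics.
stat_min = stat_max = to_epoch_millis(VALUE, WRITER_TZ)

# Reader converts the literal in its own timezone without any offset.
naive_literal = to_epoch_millis(VALUE, READER_TZ)
naive_result = stripe_may_match(stat_min, stat_max, naive_literal)

# Reader accounts for the writer timezone before comparing.
adjusted_literal = to_epoch_millis(VALUE, WRITER_TZ)
adjusted_result = stripe_may_match(stat_min, stat_max, adjusted_literal)

print("without offset:", naive_result)
print("with offset:   ", adjusted_result)
```

In this model the unadjusted literal falls four hours below the stored minimum, so the stripe is skipped and the query returns no rows, which matches the behavior reported above.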
Issue Links
- blocks
  - ORC-137 Disable bloomfilter PPD for timestamps for files created before ORC-135 (Closed)