ORC-135

PPD for timestamp is wrong when reader and writer timezones are different

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.0.0, 1.1.0, 1.2.0, 1.3.0
    • Fix Version/s: 1.3.2, 1.4.0
    • Component/s: None
    • Labels: None

    Description

      When the reader and writer timezones differ, predicate pushdown (PPD) evaluation does not apply the timezone offset when reading the min and max values. This can result in wrong PPD evaluation and hence incorrect query results.

      Example:
      A table is written in the US/Eastern timezone. All values in this table are "2007-08-01 00:00:00.0".

      PPD disabled
      hive> set hive.optimize.index.filter=false;
      hive> select ORDER_DATE from ORDER_FACT_small where ORDER_DATE='2007-08-01 00:00:00.0' limit 1;
      2007-08-01 00:00:00.0
      OK
      
      PPD enabled
      hive> set hive.optimize.index.filter=true;
      hive> select ORDER_DATE from ORDER_FACT_small where ORDER_DATE='2007-08-01 00:00:00.0' limit 1;
      OK

      No rows are returned when PPD is enabled (the reader timezone is UTC).
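
The mismatch can be illustrated with a small, simplified model. This is a sketch under stated assumptions, not ORC's actual reader code: it assumes the writer records min/max as epoch milliseconds of the wall-clock value interpreted in the writer's zone, and that the reader converts the predicate literal using its own zone. The class `PpdTimezoneSketch` and its methods `toEpochMillis`, `adjustToReaderZone`, and `mightContain` are hypothetical names introduced only for this illustration.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

// Hypothetical, simplified model of min/max pruning across timezones.
final class PpdTimezoneSketch {

    // Epoch millis of a wall-clock timestamp interpreted in the given zone.
    static long toEpochMillis(LocalDateTime wallClock, ZoneId zone) {
        return wallClock.atZone(zone).toInstant().toEpochMilli();
    }

    // Re-express a stored value so that the same wall-clock time is
    // represented in the reader's zone instead of the writer's zone.
    static long adjustToReaderZone(long writerMillis, ZoneId writerZone, ZoneId readerZone) {
        LocalDateTime wallClock =
            Instant.ofEpochMilli(writerMillis).atZone(writerZone).toLocalDateTime();
        return toEpochMillis(wallClock, readerZone);
    }

    // Equality predicate against min/max: may the row group contain the literal?
    static boolean mightContain(long min, long max, long literal) {
        return min <= literal && literal <= max;
    }

    public static void main(String[] args) {
        ZoneId writerZone = ZoneId.of("US/Eastern");
        ZoneId readerZone = ZoneId.of("UTC");
        LocalDateTime value = LocalDateTime.of(2007, 8, 1, 0, 0, 0);

        // Every row holds the same value, so min == max (assumed stats model).
        long statMin = toEpochMillis(value, writerZone);
        long statMax = statMin;

        // The reader interprets the predicate literal in its own zone.
        long literal = toEpochMillis(value, readerZone);

        // Without the offset, the literal falls below min and the row group is skipped.
        boolean naive = mightContain(statMin, statMax, literal);

        // With the offset applied to the stats, the row group is kept.
        boolean adjusted = mightContain(
            adjustToReaderZone(statMin, writerZone, readerZone),
            adjustToReaderZone(statMax, writerZone, readerZone),
            literal);

        System.out.println("naive=" + naive + " adjusted=" + adjusted);
        // prints "naive=false adjusted=true"
    }
}
```

In this model the unadjusted comparison is off by the four-hour difference between US/Eastern (daylight time in August 2007) and UTC, which is enough to exclude the only matching value, consistent with the empty result shown above.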

People

    • Assignee: Prasanth Jayachandran (prasanth_j)
    • Reporter: Prasanth Jayachandran (prasanth_j)