  1. ORC
  2. ORC-135

PPD for timestamp is wrong when reader and writer timezones are different


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.0.0, 1.1.0, 1.2.0, 1.3.0
    • Fix Version/s: 1.3.2, 1.4.0
    • Component/s: None
    • Labels: None

    Description

      When the reader and writer timezones differ, predicate pushdown (PPD) evaluation does not apply the timezone offset when reading the min and max values. This can result in wrong PPD evaluation and hence incorrect query results.

      Example:
      Table written in US/Eastern timezone. All values in this table are "2007-08-01 00:00:00.0".

      PPD disabled
      hive> set hive.optimize.index.filter=false;
      hive> select ORDER_DATE from ORDER_FACT_small where ORDER_DATE='2007-08-01 00:00:00.0' limit 1;
      2007-08-01 00:00:00.0
      OK
      
      PPD enabled
      hive> set hive.optimize.index.filter=true;
      hive> select ORDER_DATE from ORDER_FACT_small where ORDER_DATE='2007-08-01 00:00:00.0' limit 1;
      OK
      

      With PPD enabled, no rows are returned even though the matching row exists (the reader timezone is UTC).
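
      The mismatch described above can be illustrated with a small, simplified model. This is only a sketch and not ORC's actual implementation: it assumes the min/max values are kept as epoch milliseconds derived from the writer's wall-clock time, that the reader converts the predicate literal in its own zone, and it uses fixed UTC offsets in place of real timezone rules (US/Eastern is UTC-4 on 2007-08-01). The names `to_epoch_millis` and `may_contain` are hypothetical helpers.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed offsets stand in for the real zones on 2007-08-01.
WRITER_TZ = timezone(timedelta(hours=-4))  # US/Eastern (EDT) on that date
READER_TZ = timezone.utc                   # reader runs in UTC


def to_epoch_millis(wall_clock: str, tz: timezone) -> int:
    """Interpret a wall-clock timestamp string in tz and return epoch millis."""
    local = datetime.strptime(wall_clock, "%Y-%m-%d %H:%M:%S").replace(tzinfo=tz)
    return int(local.timestamp() * 1000)


def may_contain(literal_ms: int, min_ms: int, max_ms: int) -> bool:
    """Equality-predicate check against min/max values (hypothetical helper)."""
    return min_ms <= literal_ms <= max_ms


value = "2007-08-01 00:00:00"

# Writer side: every row holds the same value, so min == max.
stat_min = stat_max = to_epoch_millis(value, WRITER_TZ)

# Reader side: the predicate literal is converted in the reader's zone.
literal = to_epoch_millis(value, READER_TZ)

# Without any timezone adjustment the literal falls outside [min, max],
# so the data would be skipped and the query would return no rows.
naive = may_contain(literal, stat_min, stat_max)

# Shifting min/max by the writer/reader offset difference realigns them.
offset_ms = int(
    (WRITER_TZ.utcoffset(None) - READER_TZ.utcoffset(None)).total_seconds() * 1000
)
adjusted = may_contain(literal, stat_min + offset_ms, stat_max + offset_ms)

print("naive check keeps the data:   ", naive)
print("adjusted check keeps the data:", adjusted)
```

      In this model the unadjusted comparison is off by exactly the four-hour gap between the two zones, which is enough to exclude data in which every row matches the predicate.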

          People

            Assignee: Prasanth Jayachandran (prasanth_j)
            Reporter: Prasanth Jayachandran (prasanth_j)