Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
None
Description
Currently, when comparing the values of two timestamp arrays, we ignore whether the types have a timezone set or not. For example:
>>> import pyarrow as pa
>>> import pyarrow.compute as pc
>>> arr1 = pa.array([1], pa.timestamp("s"))
>>> arr2 = pa.array([1], pa.timestamp("s", tz="UTC"))
>>> pc.equal(arr1, arr2)
<pyarrow.lib.BooleanArray object at 0x7ff9ec897f40>
[
  true
]
>>> pc.greater(arr1, arr2)
<pyarrow.lib.BooleanArray object at 0x7f3fbc09bd60>
[
  false
]
However, in the absence of a timezone for the tz-naive/local timestamp, the two values cannot be meaningfully compared. So I think we should raise an error in this case instead.
(Comparing two timestamps that both have a timezone, even if the timezones differ, should be fine: the comparison is done on the underlying UTC values, which gives the correct result.)
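For reference, Python's standard-library datetime already follows a similar rule for ordering comparisons, which may be a useful model for the proposed behavior. The snippet below is only an illustration using the stdlib, not pyarrow code; the variable names and the example dates are made up for the demonstration.

```python
from datetime import datetime, timedelta, timezone

# The same instant expressed in two different fixed-offset timezones.
utc_noon = datetime(2021, 6, 1, 12, 0, tzinfo=timezone.utc)
plus2_two_pm = datetime(2021, 6, 1, 14, 0, tzinfo=timezone(timedelta(hours=2)))

# tz-aware vs tz-aware: compared by the underlying UTC instant,
# so differing timezones are not a problem.
same_instant = utc_noon == plus2_two_pm
later = utc_noon > plus2_two_pm

# tz-naive vs tz-aware: the naive value has no defined UTC instant,
# so an ordering comparison raises TypeError instead of guessing.
naive_noon = datetime(2021, 6, 1, 12, 0)
try:
    naive_noon > utc_noon
    ordering_error = None
except TypeError as exc:
    ordering_error = str(exc)

print(same_instant, later, ordering_error)
```

Note that the stdlib only raises for ordering (`<`, `>`, etc.); an equality check between a naive and an aware datetime returns False rather than raising. Whether Arrow should also raise for `equal` is the stricter option suggested above.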
—
Note: this will probably depend on the outcome of the discussion on the mailing list about the interpretation of timezone-less timestamps: https://lists.apache.org/thread.html/r8216e5de3efd2935e3907ad9bd20ce07e430952f84de69b36337e5eb%40%3Cdev.arrow.apache.org%3E