See below for original description and reports. In sum, there is a mismatch between how the C++ library and R interpret data without a timezone, and it turns out that we're not passing the timezone to R if it is set in Arrow C++ anyway.
The C++ library docs say "If a timezone-aware field contains a recognized timezone, its values may be localized to that locale upon display; the values of timezone-naive fields must always be displayed “as is”, with no localization performed on them." But R's print default, as well as the parsing default, is the current time zone: https://stat.ethz.ch/R-manual/R-devel/library/base/html/strptime.html
The C++ library seems to parse timestamp strings that don't have timezone information as if they are UTC, so when you read timezone-naive timestamps from Arrow and print them in R, they are shifted to be localized to the current timezone. If you print timestamp data from Arrow with print(timestamp_var, tz="GMT") it would look as you expect.
On further inspection, the arrow-to-vector code for timestamp doesn't seem to consider time zone information even if it does exist. So we don't have the means currently in R to display timestamp data faithfully, whether or not it is timezone-aware.
Among the tasks here:
- Include the timezone attribute in the POSIXct R vector that gets created from a timestamp Arrow array
- Ensure that timezone-naive data from Arrow is printed in R "as is" with no localization
Hello the dream team,
Pasting from https://github.com/wesm/feather/issues/351
Thanks for this wonderful package. I was playing with feather and some timestamps and I noticed some dangerous behavior. Maybe it is a bug.
Here I create the corresponding `EST` timestamp of my original timestamps (in `UTC` time).
Now saving the dataframe to `csv` or to `feather` will generate two completely different results.
Switching to R.
Using the good old `csv` gives me something a bit annoying, but expected. R thinks my timezone is `UTC` by default, and wrongly attached this timezone to `timestamp_est`. No big deal, I can always use `with_tz` or even better: import as character and process as timestamp while in R.
My timestamps have been converted!!! pure insanity.
Am I missing something here?