Details
Type: Task
Status: Resolved
Priority: Major
Resolution: Fixed
Version: 4.0.1
Description
Here's the problem I detected while triaging tickets.
This was run locally after merging from apache/arrow at commit 8773b9d and rebuilding both the Arrow library and the Arrow R package.
    library(arrow)
    #> See arrow_info() for available features
    #>
    #> Attaching package: 'arrow'
    #> The following object is masked from 'package:utils':
    #>
    #>     timestamp
    library(dplyr)
    #>
    #> Attaching package: 'dplyr'
    #> The following objects are masked from 'package:stats':
    #>
    #>     filter, lag
    #> The following objects are masked from 'package:base':
    #>
    #>     intersect, setdiff, setequal, union
    library(testthat)
    #>
    #> Attaching package: 'testthat'
    #> The following object is masked from 'package:dplyr':
    #>
    #>     matches
    #> The following object is masked from 'package:arrow':
    #>
    #>     matches

    tstring <- tibble(x = c("08-05-2008", NA))
    tstamp <- tibble(x = c(strptime("08-05-2008", format = "%m-%d-%Y"), NA))

    expect_equal(
      tstring %>%
        Table$create() %>%
        mutate(
          x = strptime(x, format = "%m-%d-%Y")
        ) %>%
        collect(),
      tstamp,
      check.tzone = FALSE
    )
    #> Error: `%>%`(...) not equal to `tstamp`.
    #> Component "x": Mean absolute difference: 14400
Removing the expectation shows that the two timestamps differ by exactly 4 hours (the 14400 seconds reported above):
    library(arrow)
    #> See arrow_info() for available features
    #>
    #> Attaching package: 'arrow'
    #> The following object is masked from 'package:utils':
    #>
    #>     timestamp
    library(dplyr)
    #>
    #> Attaching package: 'dplyr'
    #> The following objects are masked from 'package:stats':
    #>
    #>     filter, lag
    #> The following objects are masked from 'package:base':
    #>
    #>     intersect, setdiff, setequal, union
    library(testthat)
    #>
    #> Attaching package: 'testthat'
    #> The following object is masked from 'package:dplyr':
    #>
    #>     matches
    #> The following object is masked from 'package:arrow':
    #>
    #>     matches

    tstring <- tibble(x = c("08-05-2008", NA))
    tstamp <- tibble(x = c(strptime("08-05-2008", format = "%m-%d-%Y"), NA))

    tstring %>%
      Table$create() %>%
      mutate(
        x = strptime(x, format = "%m-%d-%Y")
      ) %>%
      collect()
    #> # A tibble: 2 x 1
    #>   x
    #>   <dttm>
    #> 1 2008-08-04 20:00:00
    #> 2 NA

    tstamp
    #> # A tibble: 2 x 1
    #>   x
    #>   <dttm>
    #> 1 2008-08-05 00:00:00
    #> 2 NA
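One possible reading of the 4-hour gap, which this report does not confirm: the Arrow path may be parsing the string as UTC midnight, while base R's strptime() parses it as midnight in the session's local time zone. The sketch below uses only base R and assumes the session ran in US Eastern time (UTC-4 in August); the time zone name is an assumption for illustration, not something stated in the report.

```r
# Assumption: the local time zone was "America/New_York" (not stated in the report).
fmt <- "%m-%d-%Y"

# Midnight on 2008-08-05 interpreted in the assumed local zone and in UTC.
local_midnight <- as.POSIXct("08-05-2008", format = fmt, tz = "America/New_York")
utc_midnight   <- as.POSIXct("08-05-2008", format = fmt, tz = "UTC")

# Distance between the two instants, in seconds.
offset_secs <- as.numeric(difftime(local_midnight, utc_midnight, units = "secs"))
offset_secs

# The UTC instant rendered in the assumed local zone.
utc_as_local <- format(utc_midnight, "%Y-%m-%d %H:%M:%S", tz = "America/New_York")
utc_as_local
```

Under that assumption, offset_secs matches the 14400-second difference reported by expect_equal(), and utc_as_local matches the 2008-08-04 20:00:00 value returned after collect(). With check.tzone = FALSE only the tzone attribute is ignored, so the underlying instants are still compared and the expectation fails.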
Created on 2021-06-07 by the reprex package (v2.0.0)