Details
Bug
Status: Resolved
Minor
Resolution: Fixed
6.0.1
None
x86_64-pc-linux-gnu (64-bit) via rocker/docker rocker/r-base:4.1.2
Description
Hello,
I would like to report a possible issue (or I did not grasp the documentation, in which case I apologize in advance).
I'm trying to use R with arrow in Docker to read Parquet files from S3, using this Dockerfile:
FROM rocker/r-base:4.1.2

# TO READ FROM S3
RUN apt update -qq \
    && apt install -t unstable -y --no-install-recommends \
    libcurl4-openssl-dev

ENV LIBARROW_MINIMAL false

RUN apt update && \
    apt install -y -V ca-certificates lsb-release wget && \
    wget "https://apache.jfrog.io/artifactory/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb" && \
    apt install -y -V ./apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb

RUN apt update && \
    apt install -y -V -f \
    libarrow-dev \
    libarrow-dataset-dev \
    libarrow-glib-dev \
    libarrow-flight-dev \
    libparquet-dev \
    libparquet-glib-dev

RUN install2.r --error \
    arrow
This is the output of sessionInfo() from the container running R:
sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 11 (bullseye)

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.18.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=en_US.UTF-8
[9] LC_ADDRESS=en_US.UTF-8 LC_TELEPHONE=en_US.UTF-8
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] arrow_6.0.1 DBI_1.1.1

loaded via a namespace (and not attached):
[1] tidyselect_1.1.1 bit_4.0.4 compiler_4.1.2 magrittr_2.0.1
[5] assertthat_0.2.1 R6_2.5.1 tools_4.1.2 glue_1.5.1
[9] bit64_4.0.5 vctrs_0.3.8 RJDBC_0.2-8 rlang_0.4.12
[13] rJava_1.0-5 AWR.Athena_2.0.7-0 purrr_0.3.4
As far as I understand, all requirements for using Datasets are fulfilled:
R version 4.1.2
Platform: x86_64-pc-linux-gnu (64-bit)
arrow_6.0.1
> .Machine$sizeof.pointer < 8
[1] FALSE
> getRversion() < "4.0.0"
[1] FALSE
> tolower(Sys.info()[["sysname"]]) == "windows"
[1] FALSE
Nevertheless, I get
Error: This build of the arrow package does not support Datasets
when calling
arrow::open_dataset(sources = path)
Appreciate any help!
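The three platform checks above only rule out the 32-bit Windows / old R case. Whether Datasets are available also depends on how the underlying C++ library was built. A minimal sketch for inspecting that, assuming the arrow package is installed and provides the `arrow_with_dataset()`, `arrow_with_s3()` and `arrow_info()` helpers (the wrapper name `arrow_capabilities` is hypothetical and not part of arrow):

```r
# Hypothetical helper (not part of arrow): summarise optional build features.
arrow_capabilities <- function() {
  if (!requireNamespace("arrow", quietly = TRUE)) {
    return(NULL)  # arrow is not installed at all
  }
  c(
    dataset = arrow::arrow_with_dataset(),  # Datasets compiled in?
    s3      = arrow::arrow_with_s3()        # S3 filesystem compiled in?
  )
}

caps <- arrow_capabilities()
print(caps)

# arrow_info() additionally reports the package and C++ library versions,
# which helps spot a mismatch between the R package and system libraries.
if (!is.null(caps)) print(arrow::arrow_info())
```

If `dataset` comes back FALSE here, the error from `open_dataset()` would be expected for this build regardless of the platform checks.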