Uploaded image for project: 'Apache Arrow'
  1. Apache Arrow
  2. ARROW-6793

[R] Arrow C++ binary packaging for Linux

    XMLWordPrintableJSON

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.16.0
    • Component/s: R

      Description

      Our current installation experience on Linux isn't ideal. Unless you've already installed the Arrow C++ library, when you install the R package, you get a shell that tells you to install the C++ library. That was a useful approach to allow us to get the package on CRAN, which makes it easy for macOS and Windows users to install, but it doesn't improve the installation experience for Linux users. This is an impediment to adoption of arrow not only by users but also by package maintainers who might want to depend on arrow. 

      macOS and Windows have a better experience because at installation time, the configure scripts download and statically link a prebuilt C++ library. CRAN bundles the whole thing up and delivers that as a binary R package. 

      Python wheels do a similar thing: they're binaries that contain all external dependencies. And there are pyarrow wheels for Linux. This suggests that we could do something similar for R: build a generic Linux binary of the C++ library and download it in the R package configure script at install time.

      I experimented with using the Arrow C++ binaries included in the Python wheels in R. See discussion at the end of ARROW-5956. This worked on macOS (not useful for R, but it proved the concept) and almost worked on Linux, but it turned out that the "manylinux2010" standard is too archaic to work with contemporary Rcpp. 

      Proposal: do a similar workflow to what the manylinux2010 pyarrow build does, just with slightly more modern compiler/settings. Publish that C++ binary package to bintray. Then download it in the R configure script if a local/system package isn't found.

      Once we have a basic version working, test against various distros on R-hub to make sure we're solid everywhere and/or ensure the current fallback behavior when we encounter a distro that this doesn't work for. If necessary, we can make multiple flavors of this C++ binary for debian, centos, etc.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                npr Neal Richardson
                Reporter:
                npr Neal Richardson
              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 13h 50m
                  13h 50m