Apache Arrow / ARROW-5956

[R] Ability for R to link to C++ libraries from pyarrow Wheel

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: R
    • Labels:
      None
    • Environment:
      Ubuntu 16.04, R 3.4.4, python 3.6.5

      Description

      I have installed pyarrow 0.14.0 and also want to use the R arrow package. In my work I use rpy2 heavily to exchange Python data structures with R data structures, so I would like R arrow to link against the exact same .so files that ship with pyarrow.

       

       

      When I pass INCLUDE_DIR and LIB_DIR to the R package's configure script, pointing at pyarrow's include directory and pyarrow's root directory respectively, R's arrow.so compiles. However, I am unable to load it in an R session and get this error:

       

      > dyn.load('arrow.so')
      Error in dyn.load("arrow.so") :
       unable to load shared object '/tmp/arrow2/r/src/arrow.so':
       /tmp/arrow2/r/src/arrow.so: undefined symbol: _ZNK5arrow11StructArray14GetFieldByNameERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
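The missing symbol's mangled name contains `NSt7__cxx11...basic_string`, which means the code referencing it was compiled against libstdc++'s `std::__cxx11::basic_string` (the newer `std::string` ABI). One possible explanation, stated here as an assumption rather than a confirmed diagnosis, is that the library it is being resolved against exports the same function under the older `std::string` mangling. A minimal sketch of checking a symbol name for that tag follows; `string_abi_of` is a hypothetical helper name, not part of any Arrow tooling.

```shell
# Hypothetical helper: report whether a mangled C++ symbol name refers to
# libstdc++'s std::__cxx11 namespace (the newer std::string ABI).
string_abi_of() {
  case "$1" in
    *__cxx11*) echo "cxx11" ;;          # name mentions std::__cxx11::...
    *)         echo "no-cxx11-tag" ;;   # older ABI, or no std::string involved
  esac
}

# The symbol from the dyn.load error above.
missing='_ZNK5arrow11StructArray14GetFieldByNameERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE'
string_abi_of "$missing"
```

To compare against what the wheel actually exports, one could run `nm -D --defined-only "$PY_ARROW_PATH/libarrow.so" | grep GetFieldByName` and pass each resulting name to the helper. If the exported name lacks the `__cxx11` tag, the two sides were likely compiled with different `_GLIBCXX_USE_CXX11_ABI` settings.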

       

       

      Steps to reproduce:

       

      Install pyarrow, which also ships libarrow.so and libparquet.so

       

      pip3 install pyarrow --upgrade --user
      PY_ARROW_PATH=$(python3 -c "import pyarrow, os; print(os.path.dirname(pyarrow.__file__))")
      PY_ARROW_VERSION=$(python3 -c "import pyarrow; print(pyarrow.__version__)")
      ln -s "$PY_ARROW_PATH/libarrow.so.14" "$PY_ARROW_PATH/libarrow.so"
      ln -s "$PY_ARROW_PATH/libparquet.so.14" "$PY_ARROW_PATH/libparquet.so"
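The symlink step above hardcodes the `.14` suffix and fails if it is run a second time. A rough sketch of a repeatable variant is shown below, assuming the versioned file sits directly in the given directory; `link_unversioned` is a hypothetical helper introduced only for illustration.

```shell
# Hypothetical helper: create <dir>/<name>.so as a symlink to the first
# <dir>/<name>.so.* file found (in glob sort order). Safe to re-run.
link_unversioned() {
  dir=$1
  name=$2
  for f in "$dir/$name".so.*; do
    [ -e "$f" ] || continue                      # glob matched nothing
    ln -sf "$(basename "$f")" "$dir/$name.so"    # relative link, overwrite if present
    return 0
  done
  return 1                                       # no versioned library found
}

# Example usage (assumes PY_ARROW_PATH is set as in the steps above):
#   link_unversioned "$PY_ARROW_PATH" libarrow
#   link_unversioned "$PY_ARROW_PATH" libparquet
```

The function returns non-zero when no versioned library exists, so a caller can stop early instead of configuring the R package against a missing file.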
      

       

       

      Add to LD_LIBRARY_PATH

       

      sudo tee -a /usr/lib/R/etc/ldpaths <<LINES
      LD_LIBRARY_PATH="\${LD_LIBRARY_PATH}:$PY_ARROW_PATH"
      export LD_LIBRARY_PATH
      LINES
      sudo tee -a /usr/lib/rstudio-server/bin/r-ldpath <<LINES
      LD_LIBRARY_PATH="\${LD_LIBRARY_PATH}:$PY_ARROW_PATH"
      export LD_LIBRARY_PATH
      LINES
      export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:$PY_ARROW_PATH"
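If `LD_LIBRARY_PATH` starts out unset or empty, the `export` above produces a value with a leading colon, and the dynamic loader typically treats an empty entry as the current working directory. A small sketch of appending a directory without creating an empty or duplicate entry follows; `append_path` is a hypothetical helper, not something R or pyarrow provides.

```shell
# Hypothetical helper: print search path $1 with directory $2 appended,
# avoiding an empty entry when $1 is empty and a duplicate when $2 is present.
append_path() {
  current=$1
  new=$2
  case ":$current:" in
    *":$new:"*) printf '%s\n' "$current" ;;        # already listed
    "::")       printf '%s\n' "$new" ;;            # current path was empty
    *)          printf '%s\n' "$current:$new" ;;   # normal append
  esac
}

# Example usage (assumes PY_ARROW_PATH is set as in the steps above):
#   LD_LIBRARY_PATH=$(append_path "${LD_LIBRARY_PATH:-}" "$PY_ARROW_PATH")
#   export LD_LIBRARY_PATH
```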
      

       

       

      Install the R arrow package from source

      git clone https://github.com/apache/arrow.git /tmp/arrow2
      cd /tmp/arrow2/r
      git checkout tags/apache-arrow-0.14.0
      R CMD INSTALL ./ --configure-vars="INCLUDE_DIR=$PY_ARROW_PATH/include LIB_DIR=$PY_ARROW_PATH"

       

      I have noticed that the R arrow package no longer has an RcppExports file and has an arrowExports file instead. Could the lack of RcppExports be what makes GetFieldByName impossible to find?

              People

              • Assignee:
                Unassigned
                Reporter:
                jeffreyw Jeffrey Wong