Apache Arrow / ARROW-6960

[R] Add support for more compression codecs in Windows build

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.15.0
    • Fix Version/s: 0.16.0
    • Component/s: R
    • Environment:
      Windows 10

      Description

      When I try to write a Parquet file with lz4, zstd, or brotli compression using R arrow 0.15.0, the write fails because support for those codecs was not built (example below).

      > arrow::write_parquet(payout_strategy, sink = "records_test_lz4.parquet", compression = "lz4")
      Error in parquet___arrow___FileWriter__WriteTable(self, table, chunk_size) : 
       Arrow error: IOError: Arrow error: NotImplemented: LZ4 codec support not built
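
One way to cope with a build that lacks a codec is to catch this specific error and retry with a codec the build does include. The sketch below is only an illustration: `write_with_fallback` and its `writer` argument are hypothetical names, not part of arrow. Matching on the text "codec support not built" is an assumption based on the error shown above, and "snappy" as the fallback assumes that codec is present in the installed build.

```r
# Hypothetical helper (not part of arrow): try the requested codec and,
# if the build reports it as missing, retry with a fallback codec.
# Returns the name of the codec that was actually used.
write_with_fallback <- function(x, sink, compression, fallback = "snappy",
                                writer = arrow::write_parquet) {
  tryCatch(
    {
      writer(x, sink = sink, compression = compression)
      compression
    },
    error = function(e) {
      # Only handle the "not built" case; re-raise any other error.
      if (!grepl("codec support not built", conditionMessage(e), fixed = TRUE)) {
        stop(e)
      }
      warning("Codec '", compression, "' is not built; using '", fallback, "'")
      writer(x, sink = sink, compression = fallback)
      fallback
    }
  )
}
```

With arrow installed, a call such as `write_with_fallback(payout_strategy, "records_test_lz4.parquet", "lz4")` would attempt lz4 first and fall back only if that codec is missing. This works around the problem at write time; it does not add the missing codec to the build.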

      I believe the error is raised by the code at https://github.com/apache/arrow/blob/master/cpp/src/arrow/util/compression.cc#L124-L145, but I am not sure how to call

      install.packages("arrow")

      in R to enable the ARROW_WITH_ZSTD/LZ4/BROTLI flags, or whether I should install zstd separately from arrow and then do something pre- or post-install to link zstd with arrow. From https://github.com/apache/arrow/issues/1209, it appears that zstd support has been added to arrow and parquet in general, and the R package README (https://github.com/apache/arrow/tree/master/r) notes "On macOS and Windows, installing a binary package from CRAN will handle Arrow's C++ dependencies for you", but I get the sense that this does not cover zstd.

      Is there guidance as to how to enable zstd and other compression codecs prior to or after downloading the R arrow package? Could this be added to the R documentation somewhere for future reference?
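
As a rough aid for the question above, the following sketch reports which codecs an installed build supports. It only reports; it does not enable anything. `codec_report` is a hypothetical helper name. The default check uses `arrow::codec_is_available()`, which exists in recent releases of the R package; whether it is present in 0.15.0 is an assumption to verify against the installed version.

```r
# Hypothetical helper (not part of arrow): return a named logical vector
# saying which compression codecs the installed build supports.
codec_report <- function(codecs = c("snappy", "gzip", "brotli", "zstd", "lz4"),
                         is_available = arrow::codec_is_available) {
  vapply(codecs, function(codec) isTRUE(is_available(codec)), logical(1))
}

# Example use, run only when arrow is installed:
if (requireNamespace("arrow", quietly = TRUE)) {
  print(codec_report())
}
```

The `is_available` argument is there so the check can be swapped out or exercised without arrow installed.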

              People

              • Assignee: gngu Grant Nguyen
              • Reporter: gngu Grant Nguyen
              • Votes: 0
              • Watchers: 2

                  Time Tracking

                  • Estimated: Not Specified
                  • Remaining: 0h
                  • Logged: 1h 50m