  1. Apache Arrow
  2. ARROW-14902

[R] Update write_csv_arrow() to support all args of readr::write_csv()



    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: R
    • Labels: None


      Currently (arrow version 6.0.1 and readr version 2.1.0) write_csv_arrow() supports only a few of the readr::write_csv() arguments. Once ARROW-13623 is fixed, write_csv_arrow() will error if the user passes unsupported readr arguments.

      The following arguments need CsvWriteOptions (see linked issues) in order to be exposed to R users:

      • na: string used for missing values. Defaults to NA. Missing values are never quoted; strings with the same value as na will always be quoted.
      • append: boolean. If FALSE, will overwrite the existing file. If TRUE, will append to the existing file. In both cases, if the file doesn't exist, a new file is created.
      • quote: how to handle fields which contain characters that need to be quoted:
        • needed: only quote fields which need them
        • all: quote all fields - I think this might be the implicit default behaviour for `write_csv_arrow()`
        • none: never quote fields
      • escape: the type of escape to use when quotes are in the data:
        • double: quotes are escaped by doubling them
        • backslash: quotes are escaped by a preceding backslash
        • none: quotes are not escaped
      • eol: the end of line character to use. Most commonly either "\n" for Unix style newlines, or "\r\n" for Windows style newlines.
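
      To make the intended semantics of the arguments above concrete, here is a minimal sketch in plain Python (standard library only). It is not Arrow's CsvWriteOptions and not readr's implementation; the function names format_field, format_csv and write_csv are hypothetical. It also assumes that escaping is applied only inside quoted fields and that a string equal to na is quoted even when quote is "none", which is one reading of the descriptions above.

```python
# Illustrative sketch only: hypothetical helpers that mimic the readr-style
# options described above (na, append, quote, escape, eol).

SPECIAL_CHARS = ',"\n\r'  # characters that force quoting when quote="needed"


def format_field(value, na="NA", quote="needed", escape="double"):
    """Render one field; None stands in for a missing value."""
    if value is None:
        return na  # missing values are never quoted

    text = str(value)

    if quote == "all":
        quoted = True
    elif quote == "needed":
        quoted = any(ch in text for ch in SPECIAL_CHARS)
    elif quote == "none":
        quoted = False
    else:
        raise ValueError(f"unknown quote mode: {quote!r}")

    # Assumption: a string identical to `na` is always quoted so that it
    # cannot be confused with a missing value.
    if text == na:
        quoted = True

    if not quoted:
        return text

    # Assumption: escaping applies only inside quoted fields.
    if escape == "double":
        text = text.replace('"', '""')
    elif escape == "backslash":
        text = text.replace('"', '\\"')
    elif escape != "none":
        raise ValueError(f"unknown escape mode: {escape!r}")
    return f'"{text}"'


def format_csv(rows, na="NA", quote="needed", escape="double", eol="\n"):
    """Render rows (iterables of values) as CSV text, one eol per row."""
    lines = (
        ",".join(format_field(v, na=na, quote=quote, escape=escape) for v in row)
        for row in rows
    )
    return "".join(line + eol for line in lines)


def write_csv(path, rows, append=False, **options):
    """Write rows to path; both modes create the file if it is missing."""
    mode = "a" if append else "w"
    # newline="" stops Python from translating the chosen eol on write.
    with open(path, mode, newline="", encoding="utf-8") as handle:
        handle.write(format_csv(rows, **options))
```

      With rows = [["a", None, 'say "hi"']], format_csv(rows) leaves a unquoted, writes the missing value as a bare NA, and doubles the inner quotes of the third field. Passing quote="all", escape="backslash", eol="\r\n" quotes every non-missing field, backslash-escapes the inner quotes, and ends the row with a Windows-style newline.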

      Once these are enabled, update the signature of `write_csv_arrow()` and compare written files.
      From ARROW-13623: "I noticed we had a difference in quoting: readr doesn't quote strings by default but we do." Once we have more control over quoting, we could write tests to make sure the default behaviours of write_csv_arrow() and readr::write_csv() match.





              Assignee: Unassigned
              Reporter: Dragoș Moldovan-Grünfeld (dragosmg)



                Time Tracking

                  Original Estimate - Not Specified
                  Remaining Estimate - 0h
                  Time Spent - 15h 40m