Apache Arrow / ARROW-10463

[R] Better messaging for currently unsupported CSV options in open_dataset


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 3.0.0
    • Component/s: R

      Description

      While read_csv_arrow()'s signature matches readr, the readr_to_csv_parse_options() function (called by way of open_dataset()) appears to capture only a subset of those options:

      (https://github.com/apache/arrow/blob/883eb572bc64430307112895976ba79df10c8c7d/r/R/csv.R#L464)

      readr_to_csv_parse_options <- function(delim = ",",
       quote = '"',
       escape_double = TRUE,
       escape_backslash = FALSE,
       skip_empty_rows = TRUE)

      I ran into this while trying to use a non-standard 'na' value:

       

      open_dataset("/path/to/csv/directory/", schema = sch, partitioning = NULL, format = "csv", delim = ";", na = "\\N", escape_backslash = TRUE, escape_double = FALSE)
      Error in readr_to_csv_parse_options(...) : unused argument (na = "\\N")
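
      To illustrate the kind of messaging this issue asks for, here is a minimal sketch in base R. It is not the fix that was applied in the arrow package: check_csv_parse_options() and supported_parse_opts are hypothetical names, and the supported list is just the five arguments from the signature quoted above. The idea is to inspect the names passed through ... and name the unsupported options explicitly rather than surfacing a bare "unused argument" error.

```r
# Hypothetical helper, not part of the arrow package. The supported names are
# taken from the readr_to_csv_parse_options() signature quoted above.
supported_parse_opts <- c(
  "delim", "quote", "escape_double", "escape_backslash", "skip_empty_rows"
)

check_csv_parse_options <- function(...) {
  opts <- list(...)
  opt_names <- names(opts)
  # names() is NULL when no named arguments were supplied
  if (is.null(opt_names)) opt_names <- character(0)

  # Named options that the parse-options function would not accept
  unsupported <- setdiff(opt_names[nzchar(opt_names)], supported_parse_opts)

  if (length(unsupported) > 0) {
    stop(
      "The following CSV options are not yet supported when opening a dataset: ",
      paste0("\"", unsupported, "\"", collapse = ", "),
      call. = FALSE
    )
  }
  invisible(opts)
}

# The CSV options from the failing call above; `na` is reported by name.
result <- tryCatch(
  check_csv_parse_options(
    delim = ";", na = "\\N", escape_backslash = TRUE, escape_double = FALSE
  ),
  error = function(e) conditionMessage(e)
)
print(result)
```

      With a check like this, the error names the offending option and says that it is not yet supported, which is clearer to a user than an internal function's "unused argument" message.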
      

       


              People

              • Assignee: Ian Cook (icook)
              • Reporter: Gabriel Bassett (GabeTheEngineer)
              • Votes: 0
              • Watchers: 3

                  Time Tracking

                  • Original Estimate: Not Specified
                  • Remaining Estimate: 0h
                  • Time Spent: 5h 20m