Apache Arrow / ARROW-10463

[R] Better messaging for currently unsupported CSV options in open_dataset

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 3.0.0
    • Component/s: R

    Description

      While read_csv_arrow()'s signature matches readr, the readr_to_csv_parse_options() function (called by way of open_dataset()) appears to capture only a subset of those options:

      (https://github.com/apache/arrow/blob/883eb572bc64430307112895976ba79df10c8c7d/r/R/csv.R#L464)

      readr_to_csv_parse_options <- function(delim = ",",
       quote = '"',
       escape_double = TRUE,
       escape_backslash = FALSE,
       skip_empty_rows = TRUE)

      I ran into this trying to use a non-standard 'na' value:

       

      open_dataset("/path/to/csv/directory/", schema = sch, partitioning=NULL, format="csv", delim=";", na="\\N", escape_backslash=TRUE, escape_double=FALSE)
      Error in readr_to_csv_parse_options(...) : unused argument (na = "\\N")
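
      One way the messaging could be improved is to check the options passed through `...` against the arguments the parser is known to accept, and to name the unsupported ones explicitly. The following is only a sketch in base R: check_csv_parse_args() is a hypothetical helper, not a function in the arrow package, and its list of supported options is copied from the signature quoted above rather than from the fix that was actually merged.

```r
# Hypothetical helper (NOT part of the arrow package): validates readr-style
# options before they would be forwarded to readr_to_csv_parse_options().
check_csv_parse_args <- function(...,
                                 supported = c("delim", "quote", "escape_double",
                                               "escape_backslash", "skip_empty_rows")) {
  args <- list(...)
  # Ignore unnamed arguments (their name is ""); flag every other unknown name.
  unsupported <- setdiff(names(args), c(supported, ""))
  if (length(unsupported) > 0) {
    stop(
      "The following option(s) are not yet supported for CSV datasets: ",
      paste(unsupported, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(args)
}

# Usage: capture the message instead of aborting the script.
res <- tryCatch(
  check_csv_parse_args(delim = ";", na = "\\N", escape_backslash = TRUE),
  error = function(e) conditionMessage(e)
)
print(res)
```

      With a check like this, the caller sees which option was rejected and why, rather than a generic "unused argument" error raised from an internal function.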
      

       

            People

              Assignee: Ian Cook (icook)
              Reporter: Gabriel Bassett (GabeTheEngineer)
              Votes: 0
              Watchers: 4

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 5h 20m