Apache Arrow / ARROW-18352 [R] Datasets API interface improvements / ARROW-18236

[R] Improve error message when providing a mix of readr and Arrow options

Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: R
    • Labels: None

    Description

      I was trying to solve a user issue today and tried to run the following code:

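      library(arrow)
      library(dplyr)  # provides tibble() and %>%
      library(readr)  # provides write_tsv()

      df <- tibble(x = c("a", "b", "", "d"))
      write_tsv(df, "data.tsv")
      open_dataset("data.tsv", format = "tsv", skip_rows = 1, schema = schema(x = string()), skip_empty_rows = TRUE) %>%
        collect()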
      

      which gives me the error

      Error: Use either Arrow parse options or readr parse options, not both
      

      which is unhelpful: the message gives no context about which of the supplied options are the readr ones and which are the Arrow ones, or what the valid options are. (In the call above, skip_rows is the Arrow-style option and skip_empty_rows is the readr-style one.)

      Also, why can't we have a mix of both? This is a totally valid use case. I think both a code update and a more informative error message are needed here.
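
To illustrate the error-message half of this, here is a rough sketch of a message that names the offending options. Everything in it is an assumption for illustration: `describe_option_mix()` is a hypothetical helper, not a function in the arrow package, and the two option-name vectors are deliberately incomplete stand-ins rather than the lists arrow actually checks against.

```r
# Hypothetical helper, NOT part of the arrow package.
# Illustrative, incomplete option-name lists (assumptions for this sketch).
arrow_style_opts <- c("delimiter", "quote_char", "escape_char",
                      "ignore_empty_lines", "skip_rows")
readr_style_opts <- c("delim", "quote", "escape_backslash",
                      "skip_empty_rows", "skip")

describe_option_mix <- function(opt_names) {
  # Sort the supplied argument names into the two naming styles.
  arrow_used <- intersect(opt_names, arrow_style_opts)
  readr_used <- intersect(opt_names, readr_style_opts)

  # Only one style (or neither) was used: nothing to report.
  if (length(arrow_used) == 0 || length(readr_used) == 0) {
    return(NULL)
  }

  # Both styles were used: say exactly which arguments belong to which.
  paste0(
    "Use either Arrow parse options or readr parse options, not both.\n",
    "Arrow-style options supplied: ", paste(arrow_used, collapse = ", "), "\n",
    "readr-style options supplied: ", paste(readr_used, collapse = ", ")
  )
}

# Argument names from the open_dataset() call in this report.
msg <- describe_option_mix(c("format", "skip_rows", "schema", "skip_empty_rows"))
cat(msg, "\n")
```

With the argument names from the call above, the message would point at skip_rows on the Arrow side and skip_empty_rows on the readr side, which is the context the current error lacks. It does not address the second half of the request (actually allowing a mix).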

          People

            Assignee: Unassigned
            Reporter: Nicola Crane (thisisnic)