Apache Arrow / ARROW-18352

[R] Datasets API interface improvements

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: R
    • Labels: None

    Description

      Umbrella ticket for improvements to our interface to the datasets API, and for making the experience more consistent between open_dataset() and the read_*() functions.

      The parameters currently supported in read_delim_arrow() but not in open_dataset() are:

      • file
      • col_names
      • col_select
      • na
      • quoted_na
      • parse_options/convert_options/read_options*
      • as_data_frame

      Subtasks 2, 5, 6, 7, 8, and 10 below allow us to support all of the read_csv_arrow() options in open_dataset() for CSVs, or give helpful error messages when options aren't supported.
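
The gap described above can be illustrated with a minimal sketch. It assumes the arrow and dplyr packages are installed; the sample data, paths, and variable names are hypothetical and not part of this ticket. The single-file reader takes options such as col_select, na, and as_data_frame as arguments, while with a dataset the equivalent column selection is typically expressed with dplyr verbs after opening:

```r
library(arrow)
library(dplyr)

# Hypothetical sample data written to a temporary directory.
tmp_dir <- tempfile()
dir.create(tmp_dir)
csv_path <- file.path(tmp_dir, "sample.csv")
write.csv(
  data.frame(x = c(1, 2, NA), y = c("a", "b", "c")),
  csv_path,
  row.names = FALSE
)

# Single-file reader: selection, NA strings, and return type are arguments.
tbl <- read_csv_arrow(
  csv_path,
  col_select = c("x"),
  na = c("", "NA"),
  as_data_frame = FALSE  # return an Arrow Table rather than a data frame
)

# Dataset interface: open the directory, then select and collect with dplyr.
ds <- open_dataset(tmp_dir, format = "csv")
df <- collect(select(ds, x))
```

Both calls end up with only column x, but the way to ask for it differs, which is the kind of inconsistency this ticket aims to reduce.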

          People

            Assignee: Unassigned
            Reporter: Nicola Crane (thisisnic)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 6h 40m