  1. Apache Arrow
  2. ARROW-18181 [R] read_csv_arrow() Improvements
  3. ARROW-18049

[R] Support column renaming in col_select argument to file reading functions


Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: R

    Description

      We should support the ability to rename columns when reading in data via the CSV/Parquet/Feather/JSON file readers.

      We currently have an argument col_select, which allows users to choose which columns to read in, but renaming columns through it has no effect: the columns come back with their original names.

      To implement this, we'd need to check whether col_select renames any columns and, if so, update the schema of the returned object once the file has been read.

      
      library(readr)
      library(arrow)
      readr::read_csv(readr_example("mtcars.csv"), col_select = c(not_hp = hp))
      #> # A tibble: 32 × 1
      #>    not_hp
      #>     <dbl>
      #>  1    110
      #>  2    110
      #>  3     93
      #>  4    110
      #>  5    175
      #>  6    105
      #>  7    245
      #>  8     62
      #>  9     95
      #> 10    123
      #> # … with 22 more rows
      arrow::read_csv_arrow(readr_example("mtcars.csv"), col_select = c(not_hp = hp))
      #> # A tibble: 32 × 1
      #>       hp
      #>    <int>
      #>  1   110
      #>  2   110
      #>  3    93
      #>  4   110
      #>  5   175
      #>  6   105
      #>  7   245
      #>  8    62
      #>  9    95
      #> 10   123
      #> # … with 22 more rows
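
The approach described above (detect renames in col_select, then rename after reading) could be sketched roughly as follows. This is a minimal base-R illustration, not arrow's implementation: resolve_renames() and apply_renames() are hypothetical helper names, a named character vector stands in for the tidyselect expression that col_select accepts, and the built-in mtcars data frame stands in for the object returned by the reader.

```r
# Hypothetical helper: split a (possibly named) character selection into the
# source columns to read and the names they should carry afterwards.
resolve_renames <- function(col_select) {
  new_names <- names(col_select)
  if (is.null(new_names)) {
    new_names <- rep("", length(col_select))
  }
  # Unnamed entries keep their original column name.
  unnamed <- is.na(new_names) | new_names == ""
  new_names[unnamed] <- col_select[unnamed]
  list(read = unname(col_select), rename_to = new_names)
}

# Hypothetical helper: select the columns, then rename only if a rename
# was actually requested.
apply_renames <- function(df, sel) {
  out <- df[, sel$read, drop = FALSE]
  if (!identical(sel$rename_to, sel$read)) {
    names(out) <- sel$rename_to
  }
  out
}

# Assumption: mtcars stands in for the data read from the file.
sel <- resolve_renames(c(not_hp = "hp", "mpg"))
res <- apply_renames(mtcars, sel)
names(res)
```

In the readers themselves, the rename step would presumably act on the schema of the Arrow object rather than on a data frame's names, as the description suggests.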
      

          People

            Assignee: Unassigned
            Reporter: Nicola Crane (thisisnic)