  1. Apache Arrow
  2. ARROW-14919

[R] write_parquet() drops attributes for grouped dataframes

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 6.0.1
    • 7.0.0
    • R
    • Linux

    Description

      Reprex:

      library(dplyr)
      #>
      #> Attaching package: 'dplyr'
      #> The following objects are masked from 'package:stats':
      #>
      #>     filter, lag
      #> The following objects are masked from 'package:base':
      #>
      #>     intersect, setdiff, setequal, union
      library(purrr)
      library(arrow)
      #>
      #> Attaching package: 'arrow'
      #> The following object is masked from 'package:utils':
      #>
      #>     timestamp
      attr(mtcars, "etag") <- "test"
      mtcars_grouped <-
        mtcars |>
        group_by("cyl")

      write_parquet(mtcars, "mtcars.parquet")
      read_parquet("mtcars.parquet") |>
        attributes() |>
        pluck("etag")
      #> [1] "test"

      write_parquet(mtcars_grouped, "mtcars_grouped.parquet")
      read_parquet("mtcars_grouped.parquet") |>
        attributes() |>
        pluck("etag")
      #> NULL

      unlink("mtcars_grouped.parquet")
      unlink("mtcars.parquet")

      Created on 2021-11-30 by the [reprex package](https://reprex.tidyverse.org) (v2.0.1)

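      Custom attributes (here, "etag") survive a write_parquet()/read_parquet() round trip for an ungrouped data frame but are unexpectedly dropped when the data frame is grouped. Other read/write formats may be affected as well.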
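
To check whether other formats or other attributes are affected, the comparison in the reprex can be generalised. The following is a minimal sketch, not part of arrow: `lost_attributes()` and `check_round_trip()` are hypothetical helper names, and the writer/reader pair is passed in by the caller. The demo at the end uses base R's `saveRDS()`/`readRDS()` only so that the block runs without any packages.

```r
# Hypothetical helper (not part of arrow): names of attributes that are
# present on `before` but missing from `after`.
lost_attributes <- function(before, after) {
  setdiff(names(attributes(before)), names(attributes(after)))
}

# Hypothetical helper: write `x` to a temporary file with `write_fun`,
# read it back with `read_fun`, and report which attributes were dropped.
# Both functions are assumed to take the file path as shown below.
check_round_trip <- function(x, write_fun, read_fun) {
  path <- tempfile()
  on.exit(unlink(path), add = TRUE)
  write_fun(x, path)
  lost_attributes(x, read_fun(path))
}

# Base R demo: an RDS round trip keeps every attribute.
df <- data.frame(a = 1:3)
attr(df, "etag") <- "test"
check_round_trip(df, saveRDS, readRDS)
```

With arrow loaded, the same check could be run as `check_round_trip(mtcars_grouped, write_parquet, read_parquet)` or with `write_feather`/`read_feather`. The result lists every attribute name that did not survive, so structural attributes may appear alongside custom ones such as "etag".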

            People

              Assignee: Nicola Crane (thisisnic)
              Reporter: Miles McBain (milesmcbain)
              Votes: 0
              Watchers: 3

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 2h 10m