Apache Arrow / ARROW-13860

[R] arrow 5.0.0 write_parquet throws error writing grouped data.frame


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 5.0.0
    • Fix Version/s: 6.0.0
    • Component/s: R
    • Environment: macOS 11.1 Big Sur

    Description

      In arrow 5.0.0, write_parquet() throws an error when writing a grouped data.frame.

      To reproduce:

      library(dplyr)
      arrow::write_parquet(mtcars %>% group_by(am), "/tmp/mtcars_test.parquet")
      # Error: x must be an object of class 'data.frame', 'RecordBatch', or 'Table', not 'arrow_dplyr_query'.

       

      With arrow 4.0.1, the same call worked, and the grouping was preserved when the file was read back:

      library(dplyr)
      arrow::write_parquet(mtcars %>% group_by(am),"/tmp/mtcars_test.parquet")
      x <- arrow::read_parquet("/tmp/mtcars_test.parquet")
      x
      # A tibble: 32 x 11
      # Groups:   am [2]
      #     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
      # * <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
      # 1  21       6  160    110  3.9   2.62  16.5     0     1     4     4
      # 2  21       6  160    110  3.9   2.88  17.0     0     1     4     4
      # 3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1
      # 4  21.4     6  258    110  3.08  3.22  19.4     1     0     3     1
      # 5  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2
      # 6  18.1     6  225    105  2.76  3.46  20.2     1     0     3     1
      # 7  14.3     8  360    245  3.21  3.57  15.8     0     0     3     4
      # …
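
A possible workaround on affected versions, sketched here under assumptions and not verified against arrow 5.0.0: since the error complains about the 'arrow_dplyr_query' class, dropping the grouping with dplyr::ungroup() before the write should hand write_parquet() a plain tibble. The grouping is then not stored in the file, so this sketch records the grouping columns and reapplies them after reading. The names `grouped`, `group_cols`, `out_path`, and `restored` are illustrative and do not come from the original report.

```r
library(dplyr)

# Build the same grouped data frame as in the report.
grouped <- mtcars %>% group_by(am)

# Remember which columns define the groups so they can be reapplied later.
group_cols <- group_vars(grouped)

# Hypothetical output path; the report used "/tmp/mtcars_test.parquet".
out_path <- tempfile(fileext = ".parquet")

# Assumption: an ungrouped tibble is accepted by write_parquet() on 5.0.0.
arrow::write_parquet(ungroup(grouped), out_path)

# The grouping is not written in this approach, so reapply it after reading.
restored <- arrow::read_parquet(out_path) %>%
  group_by(across(all_of(group_cols)))
```

The trade-off is that the grouping lives in the calling code rather than in the Parquet file, so any other reader of the file sees ungrouped data.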

       

       

            People

              Assignee: Neal Richardson (npr)
              Reporter: Hideaki Hayashi (hideaki)
              Votes: 0
              Watchers: 6

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 1h 40m