Apache Arrow / ARROW-17738

[R] dplyr::compute should convert from grouped arrow_dplyr_query to arrow Table

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 9.0.0
    • Fix Version/s: 10.0.0
    • Component/s: R

    Description

      dplyr::compute() is expected to execute an arrow_dplyr_query and return the result as an arrow Table. For a grouped arrow_dplyr_query, however, it returns an arrow_dplyr_query instead of a Table; once the query is ungrouped, it returns a Table as expected.

      mtcars |> arrow::arrow_table() |> dplyr::group_by(cyl) |> dplyr::compute() |> class()
      #> [1] "arrow_dplyr_query"
      mtcars |> arrow::arrow_table() |> dplyr::group_by(cyl) |> dplyr::ungroup() |> dplyr::compute() |> class()
      #> [1] "Table"        "ArrowTabular" "ArrowObject"  "R6"
      

      as_arrow_table() works fine: of the calls below, it is the only one that returns a Table for the grouped query.

      mtcars |> arrow::arrow_table() |> dplyr::group_by(cyl) |> class()
      #> [1] "arrow_dplyr_query"
      mtcars |> arrow::arrow_table() |> dplyr::group_by(cyl) |> dplyr::compute() |> class()
      #> [1] "arrow_dplyr_query"
      mtcars |> arrow::arrow_table() |> dplyr::group_by(cyl) |> dplyr::collect(FALSE) |> class()
      #> [1] "arrow_dplyr_query"
      mtcars |> arrow::arrow_table() |> dplyr::group_by(cyl) |> arrow::as_arrow_table() |> class()
      #> [1] "Table"        "ArrowTabular" "ArrowObject"  "R6"
      

      The result seems to be converted back to an arrow_dplyr_query in the following lines:
      https://github.com/apache/arrow/blob/7cfdfbb0d5472f8f8893398b51042a3ca1dd0adf/r/R/dplyr-collect.R#L73-L75
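
A possible workaround on affected versions, sketched under the assumption that as_arrow_table() keeps behaving as shown in the output above. The helper name materialize_query is hypothetical and not part of arrow or dplyr; it calls compute() first and only falls back to as_arrow_table() when the result is not a Table.

```r
library(arrow)
library(dplyr)

# Hypothetical helper (not part of arrow or dplyr): always hand back a Table.
materialize_query <- function(query) {
  result <- dplyr::compute(query)
  if (!inherits(result, "Table")) {
    # Per this report, compute() on a grouped query can return an
    # arrow_dplyr_query, while as_arrow_table() returns a Table.
    result <- arrow::as_arrow_table(result)
  }
  result
}

grouped_query <- arrow_table(mtcars) |> group_by(cyl)
tbl <- materialize_query(grouped_query)
class(tbl)
```

Whether the grouping information survives the conversion is not covered by this report, so code that relies on the groups may need to call group_by() again on the resulting Table.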

       

            People

              Assignee: SHIMA Tatsuya (eitsupi)
              Reporter: SHIMA Tatsuya (eitsupi)
              Votes: 0
              Watchers: 4

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1.5h