Details
New Feature
Status: Open
Major
Resolution: Unresolved
Description
First of all, many thanks for your hard work! I was quite excited when you implemented some basic functions of the dplyr package. Is there a way to combine two or more Arrow tables into one, by rows or by columns? At the moment my workaround looks like this:
dplyr::bind_rows(
  "a" = arrow.table.1 %>% dplyr::collect(),
  "b" = arrow.table.2 %>% dplyr::collect(),
  "c" = arrow.table.3 %>% dplyr::collect(),
  "d" = arrow.table.4 %>% dplyr::collect(),
  .id = "ID"
) %>%
  arrow::write_ipc_stream(sink = "file_name_combined_tables.arrow")
But this is not really a viable solution, because it pulls the data back into the R session as data frames/tibbles, which may exhaust the available RAM. Perhaps you have a better workaround at hand. It would be great if you could implement the bind_rows and bind_cols methods provided by dplyr, so that the tables can be combined without collecting them first:
dplyr::bind_rows(
  "a" = arrow.table.1,
  "b" = arrow.table.2,
  "c" = arrow.table.3,
  "d" = arrow.table.4,
  .id = "ID"
) %>%
  arrow::write_ipc_stream(sink = "file_name_combined_tables.arrow")