Description
gapply() applies an R function to each group of a DataFrame, where groups are defined by one or more columns, and returns a DataFrame. It is similar to GroupedDataSet.flatMapGroups() in the Dataset API.
Two API styles are supported:
1.
gd <- groupBy(df, col1, ...)
gapply(gd, function(grouping_key, group) {}, schema)
2.
gapply(df, grouping_columns, function(grouping_key, group) {}, schema)
R function input: the grouping key value(s), and a local data.frame holding the rows of that group
R function output: a local data.frame
Schema specifies the Row format of the output of the R function. It must match the R function's output.
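As an illustration of both API styles and of how the schema relates to the function's return value, here is a minimal sketch. It assumes a Spark 2.x or later SparkR installation with a local master; the sample data, the column names (dept, salary, avg_salary) and the helper avg_by_group are hypothetical and are not part of the description above.

```r
library(SparkR)
# Assumption: Spark is installed locally and SparkR can start a local session.
sparkR.session(master = "local[1]")

# Hypothetical sample data: three rows, two distinct values of `dept`.
df <- createDataFrame(data.frame(
  dept = c("a", "a", "b"),
  salary = c(10, 20, 30),
  stringsAsFactors = FALSE
))

# The schema lists exactly the columns (and types) that the R function returns.
schema <- structType(
  structField("dept", "string"),
  structField("avg_salary", "double")
)

# Called once per group: `key` is a list with the grouping key value(s),
# `group` is a local data.frame containing the rows of that group.
avg_by_group <- function(key, group) {
  data.frame(dept = key[[1]],
             avg_salary = mean(group$salary),
             stringsAsFactors = FALSE)
}

# Style 1: group first, then apply.
gd <- groupBy(df, "dept")
result1 <- collect(gapply(gd, avg_by_group, schema))

# Style 2: pass the grouping column(s) directly.
result2 <- collect(gapply(df, "dept", avg_by_group, schema))

sparkR.session.stop()
```

Both calls should produce one output row per distinct dept value. If the data.frame returned by the function has different columns or types than the schema declares, the job is expected to fail or yield wrong values, which is why the two are defined side by side here.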
Note that map-side combination (partial aggregation) is not supported; users can perform map-side combination themselves via dapply().
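One possible way to do the map-side combination by hand is sketched below: dapply() computes partial sums and counts within each partition, and gapply() then merges those partials per key. This is only a sketch under the same assumptions as before (Spark 2.x or later, local master); the data, the column names (dept, salary, total, n, avg_salary) and both schemas are hypothetical.

```r
library(SparkR)
# Assumption: Spark is installed locally and SparkR can start a local session.
sparkR.session(master = "local[2]")

# Hypothetical sample data, spread over two partitions.
df <- repartition(createDataFrame(data.frame(
  dept = c("a", "a", "b", "a"),
  salary = c(10, 20, 30, 30),
  stringsAsFactors = FALSE
)), numPartitions = 2L)

partial_schema <- structType(
  structField("dept", "string"),
  structField("total", "double"),
  structField("n", "integer")
)

# Map side: dapply() runs the function once per partition, so each partition
# is reduced to one (dept, total, n) row per key before any shuffle.
partials <- dapply(df, function(part) {
  totals <- tapply(part$salary, part$dept, sum)
  counts <- tapply(part$salary, part$dept, length)
  data.frame(dept = names(totals),
             total = as.numeric(totals),
             n = as.integer(counts),
             stringsAsFactors = FALSE)
}, partial_schema)

final_schema <- structType(
  structField("dept", "string"),
  structField("avg_salary", "double")
)

# Reduce side: gapply() merges the per-partition partials for each key.
result <- collect(gapply(partials, "dept", function(key, group) {
  data.frame(dept = key[[1]],
             avg_salary = sum(group$total) / sum(group$n),
             stringsAsFactors = FALSE)
}, final_schema))

sparkR.session.stop()
```

Carrying a sum and a count rather than a per-partition mean keeps the final average correct regardless of how rows are distributed across partitions.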