Description
Currently, the error message is confusing when the declared output schema does not match the types in the R data.frame actually returned by the function in gapply:
./bin/sparkR --conf spark.sql.execution.arrow.sparkr.enabled=true
df <- createDataFrame(list(list(a=1L, b="2")))
count(gapply(df, "a", function(key, group) { group }, structType("a int, b int")))
org.apache.spark.SparkException: Job aborted due to stage failure: Task 43 in stage 2.0 failed 1 times, most recent failure: Lost task 43.0 in stage 2.0 (TID 2, 192.168.35.193, executor driver): java.lang.UnsupportedOperationException at org.apache.spark.sql.vectorized.ArrowColumnVector$ArrowVectorAccessor.getInt(ArrowColumnVector.java:212) ...
We should probably also document that the declared schema must always match the types of the R data.frame returned by the function.
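For comparison, a minimal sketch of the same repro with a matching schema: the function returns the group unchanged, so b is a character column and should be declared as string rather than int (assuming the same session as above):

df <- createDataFrame(list(list(a=1L, b="2")))
# "b string" matches the character column in the returned data.frame
count(gapply(df, "a", function(key, group) { group }, structType("a int, b string")))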