In SPARK-27463, some refactoring was done, and two common abstract base classes were introduced.
The problem is that the R code path is matched with the Python side.
I would like to match the hierarchy and decouple the rest for now. Ideally, we should deduplicate both code paths. The internal implementations are also intentionally similar.
The issue is that R (with Arrow optimization in particular) has some code duplicated with Pandas UDFs:
FlatMapGroupsInRWithArrowExec <> FlatMapGroupsInPandasExec
MapPartitionsInRWithArrowExec <> ArrowEvalPythonExec
To prepare for deduplication here as well, it might be better to avoid changing the hierarchy on the Python side alone, and instead just decouple it.