Details
Type: Sub-task
Status: Open
Priority: Minor
Resolution: Unresolved
3.2.0
None
None
Description
Per the discussion in https://github.com/apache/spark/pull/32210#issuecomment-823503243 , we can introduce a HybridJoin operator in AQE that chooses between shuffled hash join and sort merge join for each task independently. For example, based on partition size, task 1 could do a shuffled hash join while task 2 does a sort merge join.
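
To make the per-task choice concrete, here is a minimal plain-Python sketch of the idea. It is not Spark code and does not reflect how such an operator would actually be implemented in AQE. The names `hybrid_join`, `choose_strategy`, and `HASH_JOIN_MAX_BUILD_ROWS` are hypothetical, and the row count of the build side stands in for partition size as an assumption.

```python
from collections import defaultdict

# Hypothetical threshold: a task whose build side has at most this many rows
# uses a hash join; larger build sides fall back to sort merge join.
HASH_JOIN_MAX_BUILD_ROWS = 3


def hash_join(left, right):
    """Inner join of (key, value) rows, building a hash table on `right`."""
    table = defaultdict(list)
    for key, value in right:
        table[key].append(value)
    return [(k, lv, rv) for k, lv in left for rv in table.get(k, [])]


def sort_merge_join(left, right):
    """Inner join of (key, value) rows by sorting both sides and merging."""
    left = sorted(left, key=lambda row: row[0])
    right = sorted(right, key=lambda row: row[0])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of rows sharing this key on each side.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            for _, lv in left[i:i_end]:
                for _, rv in right[j:j_end]:
                    out.append((lk, lv, rv))
            i, j = i_end, j_end
    return out


def choose_strategy(build_rows):
    """Pick a join strategy for one task from its build-side row count."""
    if build_rows <= HASH_JOIN_MAX_BUILD_ROWS:
        return "shuffled_hash"
    return "sort_merge"


def hybrid_join(partitions):
    """Join each (left_rows, right_rows) pair with its own strategy."""
    results, strategies = [], []
    for left, right in partitions:
        strategy = choose_strategy(len(right))
        join = hash_join if strategy == "shuffled_hash" else sort_merge_join
        strategies.append(strategy)
        results.append(join(left, right))
    return results, strategies


partitions = [
    # Task 1: small build side, so a hash join is chosen.
    ([(1, "a"), (2, "b")], [(1, "x"), (2, "y")]),
    # Task 2: larger build side, so a sort merge join is chosen.
    ([(3, "c"), (4, "d")], [(3, "p"), (3, "q"), (4, "r"), (5, "s")]),
]
results, strategies = hybrid_join(partitions)
print(strategies)
```

Both strategies produce the same joined rows for a given partition; only the cost profile differs, which is why the decision can be made independently per task.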