Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
1.19.0
None
Description
At present, Hybrid Shuffle and Adaptive Query Execution (AQE), which includes features such as Dynamic Partition Pruning (DPP), Runtime Filter, and Adaptive Batch Scheduler, are not fully compatible. They can be enabled together, but enabling AQE disables the key capability of Hybrid Shuffle: letting downstream tasks read data while upstream tasks are still writing it. This limitation arises because AQE requires that downstream tasks start only after their upstream tasks have finished, which conflicts with the simultaneous read-write process that Hybrid Shuffle provides. In addition, Hybrid Shuffle restarts the whole job on failover, which is also a significant issue for production use.
To get the full benefit of both Hybrid Shuffle and AQE, their integration needs to be refined so that each feature's distinct advantages can be used together and overall system performance improves.
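The scheduling conflict described above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not Flink code: the function name `two_stage_makespan` and its parameters are hypothetical, the job has exactly two stages, and the overlapping case is idealized (no backpressure or resource contention, and the downstream stage cannot finish before the upstream stage does).

```python
def two_stage_makespan(upstream_secs, downstream_secs, overlap):
    """Idealized completion time of a two-stage batch job (hypothetical model).

    overlap=False models the AQE-style constraint: the downstream stage
    starts only after the upstream stage has finished.
    overlap=True models Hybrid-Shuffle-style simultaneous read and write:
    the downstream stage consumes data while the upstream stage produces it.
    """
    if upstream_secs < 0 or downstream_secs < 0:
        raise ValueError("durations must be non-negative")
    if overlap:
        # Best case: both stages run side by side, so the job ends when
        # the slower of the two stages ends.
        return max(upstream_secs, downstream_secs)
    # Stage-by-stage execution: the durations simply add up.
    return upstream_secs + downstream_secs


if __name__ == "__main__":
    blocking = two_stage_makespan(10, 6, overlap=False)
    pipelined = two_stage_makespan(10, 6, overlap=True)
    print(f"stage-by-stage: {blocking}s, overlapping: {pipelined}s")
```

In this simplified model, the gap between the two results is the time lost when AQE forces the downstream stage to wait, which is the benefit this ticket aims to preserve.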
Attachments
1. Support consuming multiple subpartitions on a single channel | Closed | Yunfeng Zhou
2. Hybrid shuffle avoids restarting the whole job when failover | Open | Unassigned
3. Dynamically choose hybrid shuffle or AQE in a job level | Open | Unassigned
4. More precise dynamic selection of Hybrid Shuffle or AQE | Open | Unassigned