Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
4.0.0
Description
AQE can materialize both ShuffleQueryStage and BroadcastQueryStage on the cancellation. This causes unnecessary stage materialization by submitting Shuffle Job and Broadcast Job. Under normal circumstances, if the stage is already non-materialized (a.k.a ShuffleQueryStage.shuffleFuture orĀ BroadcastQueryStage.broadcastFuture is not initialized yet), it should just be skipped without materializing it.
Please find sample use-case:
1- Stage Materialization Steps:
When stage materialization is failed:
1.1- ShuffleQueryStage1 - is materialized successfully, 1.2- ShuffleQueryStage2 - materialization is failed, 1.3- ShuffleQueryStage3 - Not materialized yet so ShuffleQueryStage3.shuffleFuture is not initialized yet
2- Stage Cancellation Steps:
2.1- ShuffleQueryStage1 - is canceled due to already materialized, 2.2- ShuffleQueryStage2 - is earlyFailedStage so currently, it is skipped as default by AQE because it could not be materialized, 2.3- ShuffleQueryStage3 - Problem is here: This stage is not materialized yet but currently, it is also tried to cancel and this stage requires to be materialized first.