Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- 3.4.1
Description
When two tasks try to compute the same RDD with a replication level of 2 while running on only 2 executors, a deadlock occurs.
A task releases its write lock only after it has written the block locally and replicated it to the remote executor.
Time | Exe 1 (Task Thread T1) | Exe 1 (Shuffle Server Thread T2) | Exe 2 (Task Thread T3) | Exe 2 (Shuffle Server Thread T4) |
---|---|---|---|---|
T0 | write lock of rdd | | | |
T1 | | | write lock of rdd | |
T2 | replicate -> UploadBlockSync (blocked by T4) | | | |
T3 | | | | Received UploadBlock request from T1 (blocked by T3) |
T4 | | | replicate -> UploadBlockSync (blocked by T2) | |
T5 | | Received UploadBlock request from T3 (blocked by T1) | | |
T6 | Deadlock | Deadlock | Deadlock | Deadlock |
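
The circular wait in the timeline can be checked with a small wait-for graph. This is only an illustrative sketch, not Spark code: the thread names T1 to T4 come from the table, while `WAITS_FOR` and `find_cycle` are hypothetical names. It assumes each shuffle server thread is blocked on the write lock held by the task thread on its own executor.

```python
# Wait-for graph: each thread maps to the thread it is blocked on.
# The edges are an interpretation of the timeline, not taken from Spark.
WAITS_FOR = {
    "T1": "T4",  # Exe 1 task thread waits for Exe 2's shuffle server to finish the upload
    "T4": "T3",  # Exe 2 shuffle server needs the write lock held by Exe 2's task thread
    "T3": "T2",  # Exe 2 task thread waits for Exe 1's shuffle server to finish the upload
    "T2": "T1",  # Exe 1 shuffle server needs the write lock held by Exe 1's task thread
}


def find_cycle(waits_for, start):
    """Follow blocked-on edges from start; return the cycle as a list, or None."""
    path = []
    seen = set()
    node = start
    while node in waits_for and node not in seen:
        seen.add(node)
        path.append(node)
        node = waits_for[node]
    if node in seen:
        # Close the loop by repeating the first thread of the cycle.
        return path[path.index(node):] + [node]
    return None


cycle = find_cycle(WAITS_FOR, "T1")
print(" -> ".join(cycle))  # T1 -> T4 -> T3 -> T2 -> T1
```

Every thread in the cycle waits on another thread in the same cycle, so none of them can make progress.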