Details
- Type: Sub-task
- Status: Resolved
- Priority: Major
- Resolution: Won't Fix
Description
Today, task attempt events must pass through VertexImpl before reaching the task, in order to maintain the ordering guarantees that recovery relies on. As a result, each of these events is routed through the dispatcher twice, which can add overhead and delay in large jobs. It also builds in assumptions about event ordering that make the system fragile. Recovery should work independently of other system interactions, so that other components can evolve without being constrained by recovery unless a change logically affects recovery.
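To make the double routing concrete, here is a minimal sketch of a toy single-queue dispatcher. It is not Tez code: `RoutingSketch`, `Dispatcher`, `Event`, and the `"vertex"` / `"task"` target labels are all hypothetical names chosen for illustration, and the event name is only a string label. The sketch counts how many times the dispatcher handles an event when it is first addressed to the vertex and then forwarded, compared with when it is addressed to the task directly.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative only: a toy dispatcher, not the real Tez or YARN classes.
class RoutingSketch {

    /** Hypothetical event: a name plus the handler it is addressed to. */
    static final class Event {
        final String name;
        final String target;

        Event(String name, String target) {
            this.name = name;
            this.target = target;
        }
    }

    /** Toy single-queue dispatcher that counts the events it processes. */
    static final class Dispatcher {
        private final Queue<Event> queue = new ArrayDeque<>();
        int dispatched = 0;

        void post(Event e) {
            queue.add(e);
        }

        void drain() {
            while (!queue.isEmpty()) {
                Event e = queue.poll();
                dispatched++;
                // Assumed behavior: the vertex handler re-posts the event
                // to the task, so the same event is queued a second time.
                if (e.target.equals("vertex")) {
                    post(new Event(e.name, "task"));
                }
            }
        }
    }

    /** Event goes to the vertex first, then on to the task. */
    static int hopsViaVertex() {
        Dispatcher d = new Dispatcher();
        d.post(new Event("TASK_ATTEMPT_FAILED_EVENT", "vertex"));
        d.drain();
        return d.dispatched;
    }

    /** Event is addressed straight to the task. */
    static int hopsDirect() {
        Dispatcher d = new Dispatcher();
        d.post(new Event("TASK_ATTEMPT_FAILED_EVENT", "task"));
        d.drain();
        return d.dispatched;
    }
}
```

In this toy model `hopsViaVertex()` counts two dispatcher passes for one logical event, while `hopsDirect()` counts one. The sketch says nothing about how recovery ordering would be preserved under direct routing, which is the harder part of the issue.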
Issue Links
- blocks: TEZ-2418 TASK_ATTEMPT_FAILED_EVENT and TASK_COMPLETED_EVENT should move back to direct routing to attempt (Resolved)