Details
-
Sub-task
-
Status: Closed
-
Major
-
Resolution: Fixed
-
None
Description
Currently, if one output of an upstream job vertex is consumed by multiple downstream job vertices, the upstream vertex will produce multiple dataset. For blocking shuffle, it means serialize and persist the same data multiple times. This ticket aims to optimize this behavior and make the upstream job vertex produce one dataset which will be read by multiple downstream vertex.
Attachments
Issue Links
- links to