Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
4.0.0
Description
`AccumulableInfo` is one of the top heap consumers in driver's heap dumps for stages with many tasks. For a stage with a large number of tasks (O(100k)), we saw {}30%{} of the heap usage stemming from `TaskInfo.accumulables()`.
The `TaskSetManager` today keeps around the TaskInfo objects (ref1, ref2)) and in turn the task metrics (`AccumulableInfo`) for every task attempt until the stage is completed. This means that for stages with a large number of tasks, we keep metrics for all the tasks (`AccumulableInfo`) around even when the task has completed and its metrics have been aggregated. Given a task has a large number of metrics, stages with many tasks end up with a large heap usage in the form of task metrics.
Ideally, we should clear up a task's TaskInfo upon the task's completion, thereby reducing the driver's heap usage.
Attachments
Attachments
Issue Links
- links to