Details
- Type: Bug
- Status: Open
- Priority: Not a Priority
- Resolution: Unresolved
Description
We've seen timeouts that appear to be caused by large accumulator payloads. Removing the accumulators stabilized the cluster.
IMHO the heartbeat should not contain the accumulator payload. Accumulators should be handled separately.
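To illustrate the separation proposed above, here is a minimal sketch in Java. It is not taken from the Flink code base: `Heartbeat`, `AccumulatorReport`, and `JobManagerStub` are hypothetical names, and the timeout values are made up for the example. The only point it makes is that liveness tracking can depend on a small fixed-size message while accumulator data travels in its own message.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; none of these types are actual Flink classes. */
public class HeartbeatSketch {

    /** Liveness-only message: its size does not depend on job state. */
    static final class Heartbeat {
        final String taskManagerId;
        final long timestampMillis;

        Heartbeat(String taskManagerId, long timestampMillis) {
            this.taskManagerId = taskManagerId;
            this.timestampMillis = timestampMillis;
        }
    }

    /** Accumulator snapshot, sent as a separate (assumed) message type. */
    static final class AccumulatorReport {
        final String taskManagerId;
        final Map<String, byte[]> serializedAccumulators;

        AccumulatorReport(String taskManagerId, Map<String, byte[]> serializedAccumulators) {
            this.taskManagerId = taskManagerId;
            this.serializedAccumulators = serializedAccumulators;
        }

        /** Total size of all serialized accumulator values in bytes. */
        int payloadBytes() {
            int total = 0;
            for (byte[] value : serializedAccumulators.values()) {
                total += value.length;
            }
            return total;
        }
    }

    /** Receiver side: heartbeat handling never touches accumulator data. */
    static final class JobManagerStub {
        final Map<String, Long> lastSeen = new HashMap<>();
        final Map<String, Integer> accumulatorBytes = new HashMap<>();

        void onHeartbeat(Heartbeat hb) {
            lastSeen.put(hb.taskManagerId, hb.timestampMillis);
        }

        void onAccumulatorReport(AccumulatorReport report) {
            accumulatorBytes.put(report.taskManagerId, report.payloadBytes());
        }

        /** A sender is alive if its last heartbeat is within the timeout. */
        boolean isAlive(String taskManagerId, long nowMillis, long timeoutMillis) {
            Long seen = lastSeen.get(taskManagerId);
            return seen != null && nowMillis - seen <= timeoutMillis;
        }
    }

    public static void main(String[] args) {
        JobManagerStub jobManager = new JobManagerStub();

        // The heartbeat stays tiny no matter how large the accumulators are.
        jobManager.onHeartbeat(new Heartbeat("tm-1", 1_000L));

        // A large accumulator payload is delivered independently of liveness.
        Map<String, byte[]> accumulators = new HashMap<>();
        accumulators.put("records-seen", new byte[8 * 1024 * 1024]);
        jobManager.onAccumulatorReport(new AccumulatorReport("tm-1", accumulators));

        System.out.println("alive: " + jobManager.isAlive("tm-1", 1_500L, 10_000L));
        System.out.println("accumulator bytes: " + jobManager.accumulatorBytes.get("tm-1"));
    }
}
```

With this split, a slow or oversized accumulator report could delay metrics but would not by itself make a task manager look dead. Whether that matches how the real heartbeat path should be changed is left open here.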
Issue Links
- causes
  - BEAM-8962 FlinkMetricContainer causes churn in the JobManager and lets the web frontend malfunction (Triage Needed)
- is related to
  - FLINK-15253 Accumulators are not checkpointed (Open)