Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- 1.15.0
Description
Quick note: CheckpointCleaner is not involved here.
When a checkpoint is subsumed, SharedStateRegistry schedules its unused shared state for async deletion. It uses the common IO pool for this and adds one Runnable per state handle (see SharedStateRegistryImpl.scheduleAsyncDelete).
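For illustration, a minimal sketch of the per-handle scheduling pattern described above; the class and interface names here are made up, the real logic lives in SharedStateRegistryImpl.scheduleAsyncDelete:

```java
// Hypothetical sketch (not the actual Flink code): each no-longer-referenced
// state handle becomes its own Runnable submitted to the shared JobManager IO executor.
import java.util.List;
import java.util.concurrent.Executor;

class AsyncDeleteSketch {
    interface StateHandle {
        void discardState() throws Exception; // e.g. a slow remote-filesystem delete
    }

    static void scheduleAsyncDelete(List<StateHandle> unusedHandles, Executor ioExecutor) {
        for (StateHandle handle : unusedHandles) {
            // One task per handle ends up in the executor's FIFO queue.
            ioExecutor.execute(() -> {
                try {
                    handle.discardState();
                } catch (Exception e) {
                    // The sketch just prints; real code would log and continue.
                    System.err.println("Could not discard state: " + e);
                }
            });
        }
    }
}
```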
When a checkpoint is started, CheckpointCoordinator uses the same thread pool to initialize the location for it. (see CheckpointCoordinator.initializeCheckpoint)
The thread pool has a fixed size (jobmanager.io-pool.size; by default the number of CPU cores) and uses a FIFO queue for tasks.
When there is a spike in state deletion, the next checkpoint is delayed waiting for an available IO thread.
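To make the queuing effect concrete, here is a small, self-contained toy reproduction (not Flink code; the pool size, task count, and sleep time are arbitrary assumptions): a fixed-size executor with a FIFO queue first receives a burst of slow "delete" tasks, so a task submitted afterwards, standing in for checkpoint location initialization, has to wait behind the whole backlog.

```java
// Toy reproduction of a deletion spike delaying a later task on a shared,
// fixed-size, FIFO-queued executor. All numbers are illustrative.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class IoPoolContentionDemo {
    public static void main(String[] args) throws Exception {
        int poolSize = Runtime.getRuntime().availableProcessors(); // analogous to the default pool size
        ExecutorService ioPool = Executors.newFixedThreadPool(poolSize); // unbounded FIFO queue

        // Spike of "shared state deletions": many more tasks than threads,
        // each simulating a slow remote delete.
        int deleteTasks = poolSize * 50;
        for (int i = 0; i < deleteTasks; i++) {
            ioPool.execute(() -> sleep(100)); // stand-in for discarding one state handle
        }

        // The next checkpoint's "initialize location" task is queued behind all of them.
        long submitted = System.nanoTime();
        ioPool.execute(() -> System.out.printf(
                "checkpoint location initialized after %d ms in queue%n",
                TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - submitted)));

        ioPool.shutdown();
        ioPool.awaitTermination(1, TimeUnit.MINUTES);
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With the numbers above, the last task typically waits on the order of several seconds, even though it is itself cheap; that is the shape of the delay described in this report.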
Back-pressure seems reasonable here (similar to CheckpointCleaner); however, this shared state deletion could be spread across multiple subsequent checkpoints, not necessarily just the next one.
----
I believe the issue is a pre-existing one, but it particularly affects the changelog state backend because 1) such spikes are likely there, and 2) its workloads are latency-sensitive.
In the tests, checkpoint duration grows from seconds to minutes immediately after materialization.
Attachments
Issue Links
- Discovered while testing
  - FLINK-25144 Test Changelog State Backend manually (Resolved)
- is cloned by
  - FLINK-26590 Triggered checkpoints can be delayed by discarding shared state (Open)
- is related to
  - FLINK-13698 Rework threading model of CheckpointCoordinator (Reopened)
- links to