Details

Type: Improvement
Status: Closed
Priority: Minor
Resolution: Invalid
Affects Version/s: 1.13.0
Fix Version/s: None
Component/s: Runtime / State Backends
Labels:
- auto-deprioritized-major

Description

1. Problem introduction and cause analysis

Problem description: Restoring UnionListState at high parallelism can take more than 2 minutes.

Cause:

During a checkpoint, 2000 subtasks write 2000 files, and during restore each subtask needs to read all 2000 files.
2000 * 2000 = 4 million, so restore issues 4 million small-file reads against HDFS. HDFS becomes the bottleneck, which makes restore particularly slow.
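The arithmetic above can be written out as a small sketch. The class and method names are hypothetical and are not part of Flink; the only assumption is the one stated in this issue, namely that every subtask writes one file and every subtask reads every file.

```java
// Hypothetical helper, not Flink code: counts small-file reads during restore.
final class RestoreReadCountSketch {

    // Every one of `parallelism` subtasks reads the files written by all
    // `parallelism` subtasks, so the number of reads grows quadratically.
    static long readsWhenEverySubtaskReadsAll(int parallelism) {
        return (long) parallelism * parallelism;
    }

    public static void main(String[] args) {
        // Parallelism taken from the issue description.
        System.out.println(readsWhenEverySubtaskReadsAll(2000)); // prints 4000000
    }
}
```

The cast to `long` avoids integer overflow for parallelism values above roughly 46,000.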

2. Optimization idea

UnionListState is usually small. A typical use is storing Kafka offsets.
During restore, the JM can read all 2000 small files itself, merge the UnionListState into a single byte array, and send it to all TMs, so that the TMs do not each have to access HDFS.
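A minimal sketch of the merge step follows. It is not the Flink implementation: the class name, the method names, and the length-prefixed layout are assumptions made for illustration. The idea is only that one reader concatenates the per-subtask state into one byte array that a receiver can split back into its parts.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Flink code: merge per-subtask state into one array.
final class UnionStateMergeSketch {

    // Layout (an assumption): [count][len_0][bytes_0][len_1][bytes_1]...
    static byte[] merge(List<byte[]> perSubtaskState) {
        int size = Integer.BYTES;
        for (byte[] part : perSubtaskState) {
            size += Integer.BYTES + part.length;
        }
        ByteBuffer buffer = ByteBuffer.allocate(size);
        buffer.putInt(perSubtaskState.size());
        for (byte[] part : perSubtaskState) {
            buffer.putInt(part.length);
            buffer.put(part);
        }
        return buffer.array();
    }

    // Inverse of merge: what a receiver would do to recover the parts.
    static List<byte[]> split(byte[] merged) {
        ByteBuffer buffer = ByteBuffer.wrap(merged);
        int count = buffer.getInt();
        List<byte[]> parts = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            byte[] part = new byte[buffer.getInt()];
            buffer.get(part);
            parts.add(part);
        }
        return parts;
    }
}
```

With this layout the single reader performs one read per checkpoint file (2000 in the example above), and each receiver only needs the merged array.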

3. Benefits after optimization

Before optimization: at parallelism 2000, restoring Kafka offsets takes 90~130 s.
After optimization: at parallelism 2000, restoring Kafka offsets takes less than 1 s.

4. Risks

A UnionListState that is too large puts too much pressure on the JM.

Solution 1:
Add a configuration option that controls whether this feature is enabled. The default is false, which keeps the existing behavior. When the user sets it to true, the JM performs the merge.

Solution 2:
No configuration option is needed; merging is effectively enabled by default.
Before merging, the JM checks the size of the state. If it is below the threshold, the state is considered small and is sent to all TMs through a ByteStreamStateHandle.
If it exceeds the threshold, the state is considered large. In that case the JM writes a single HDFS file and sends a FileStateHandle to all TMs, and each TM reads that file.
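The size check in Solution 2 can be sketched as a small decision function. Everything here is hypothetical: the enum, the method name, and the threshold passed in by the caller are illustrative, since the issue does not name a threshold value. The description also leaves the case where the size equals the threshold unspecified; this sketch treats it as large.

```java
// Hypothetical sketch, not Flink code: choose how merged state reaches the TMs.
final class MergedStateDispatchSketch {

    enum Transport {
        INLINE_BYTES, // small state: ship the bytes directly (ByteStreamStateHandle in the proposal)
        SHARED_FILE   // large state: write one file and ship a reference (FileStateHandle in the proposal)
    }

    // Assumption: a size equal to the threshold counts as large.
    static Transport choose(long mergedSizeBytes, long thresholdBytes) {
        return mergedSizeBytes < thresholdBytes
                ? Transport.INLINE_BYTES
                : Transport.SHARED_FILE;
    }
}
```

For example, `choose(4_096, 1_048_576)` selects `INLINE_BYTES`, while a merged state of several hundred megabytes against the same threshold selects `SHARED_FILE`, which bounds how much data the JM sends inline.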

Note: Flink mostly uses UnionListState for Kafka offsets (small state), so in theory most jobs are not at risk.

Attachments

- JM 启动火焰图.svg (JM startup flame graph), 03/Mar/21 07:41, 465 kB, Rui Fan
- akka timeout Exception.png, 03/Mar/21 07:41, 336 kB, Rui Fan

Issue Links

is related to

FLINK-18203 Reduce objects usage in redistributing union states

Closed

People

Assignee: Unassigned

Reporter: Rui Fan

Votes: 0

Watchers: 7

Dates

Created: 22/Feb/21 11:02

Updated: 16/Nov/22 08:30

Resolved: 20/May/21 10:51