[FLINK-11937] Resolve small file problem in RocksDB incremental checkpoint - ASF JIRA

Details

Type: New Feature
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: 2.0.0
Component/s: Runtime / Checkpointing
Labels:
- auto-unassigned
- pull-request-available

Description

Currently, when incremental checkpointing is enabled in RocksDBStateBackend, a separate file is generated on the DFS for each sst file. This can cause a “file flood” when running intensive workloads (many jobs with high parallelism) in a big cluster. According to our observations in production at Alibaba, such a file flood has at least two drawbacks when HDFS is used as the checkpoint storage FileSystem: 1) a huge number of RPC requests are issued to the NameNode (NN), which may overwhelm its response queue; 2) the huge number of files puts heavy pressure on the NN’s on-heap memory.

Flink has previously encountered a similar small-file flood problem and tried to resolve it by introducing ByteStreamStateHandle (FLINK-2808). That solution has a limitation: if the threshold is configured too low, there will still be too many small files, while if it is too high, the JM will eventually run out of memory (OOM). It can therefore hardly resolve the issue when RocksDBStateBackend is used with the incremental snapshot strategy.
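The threshold trade-off described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the class `ThresholdTradeoff`, its `simulate` method, and the rule "state no larger than the threshold is kept inline, anything larger becomes a DFS file" are hypothetical simplifications for illustration, not Flink's actual implementation.

```java
/** Hypothetical illustration of the inline-vs-file threshold trade-off (not Flink code). */
class ThresholdTradeoff {

    /** Outcome of checkpointing a set of state blobs under one threshold. */
    static final class Result {
        final int filesOnDfs;                 // blobs that became separate DFS files
        final long bytesInlinedToJobManager;  // blobs shipped inline and held in JM memory

        Result(int filesOnDfs, long bytesInlinedToJobManager) {
            this.filesOnDfs = filesOnDfs;
            this.bytesInlinedToJobManager = bytesInlinedToJobManager;
        }
    }

    /**
     * Assumed rule: a blob whose size is at most thresholdBytes is kept inline
     * (as a byte-array handle); a larger blob is written as its own file.
     */
    static Result simulate(int[] blobSizes, int thresholdBytes) {
        int files = 0;
        long inlined = 0;
        for (int size : blobSizes) {
            if (size <= thresholdBytes) {
                inlined += size;   // grows JM memory usage
            } else {
                files++;           // adds one more file for the NN to track
            }
        }
        return new Result(files, inlined);
    }

    public static void main(String[] args) {
        int[] sstSizes = {100, 2000, 50000};  // made-up blob sizes in bytes
        for (int threshold : new int[] {1024, 100000}) {
            Result r = simulate(sstSizes, threshold);
            System.out.println("threshold=" + threshold
                    + " files=" + r.filesOnDfs
                    + " inlinedBytes=" + r.bytesInlinedToJobManager);
        }
    }
}
```

Raising the threshold in this model moves bytes from DFS files into JM memory and lowering it does the opposite, so neither setting removes both problems at once.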

We propose a new OutputStream called FileSegmentCheckpointStateOutputStream (FSCSOS) to fix the problem. FSCSOS will reuse the same underlying distributed file until its size exceeds a preset threshold. We plan to complete the work in three steps: first, introduce FSCSOS; second, resolve the storage amplification issue specific to FSCSOS; and last, add an option to reuse FSCSOS across multiple checkpoints to further reduce the number of DFS files.
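The file-reuse idea can be sketched roughly as follows. This is a minimal illustration, not the design from the attached document: `SegmentedFileWriter`, `SegmentHandle`, and the `chk-file-N` naming are hypothetical, and an in-memory map stands in for the DFS. Each logical state blob is appended to the current physical file and addressed by (path, offset, length); a new physical file is opened only once the current one has exceeded the threshold.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: packs many small state blobs into a few physical files. */
class SegmentedFileWriter {

    /** Locates one logical blob inside a shared physical file. */
    static final class SegmentHandle {
        final String path;
        final long offset;
        final int length;

        SegmentHandle(String path, long offset, int length) {
            this.path = path;
            this.offset = offset;
            this.length = length;
        }
    }

    // In-memory stand-in for the DFS: physical file name -> file contents.
    private final Map<String, ByteArrayOutputStream> files = new LinkedHashMap<>();
    private final long maxFileSize;
    private String currentPath;
    private int fileCounter = 0;

    SegmentedFileWriter(long maxFileSize) {
        this.maxFileSize = maxFileSize;
    }

    /** Appends a blob, rolling to a new file once the current one exceeds the threshold. */
    SegmentHandle write(byte[] data) {
        ByteArrayOutputStream current = currentPath == null ? null : files.get(currentPath);
        if (current == null || current.size() > maxFileSize) {
            currentPath = "chk-file-" + (fileCounter++);
            current = new ByteArrayOutputStream();
            files.put(currentPath, current);
        }
        long offset = current.size();
        current.write(data, 0, data.length);
        return new SegmentHandle(currentPath, offset, data.length);
    }

    /** Reads a blob back using only its segment handle. */
    byte[] read(SegmentHandle handle) {
        byte[] all = files.get(handle.path).toByteArray();
        int start = (int) handle.offset;
        return Arrays.copyOfRange(all, start, start + handle.length);
    }

    int physicalFileCount() {
        return files.size();
    }

    public static void main(String[] args) {
        SegmentedFileWriter writer = new SegmentedFileWriter(1024);
        List<SegmentHandle> handles = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            byte[] blob = new byte[64];   // stands in for one small sst file
            Arrays.fill(blob, (byte) i);
            handles.add(writer.write(blob));
        }
        System.out.println("logical blobs: " + handles.size()
                + ", physical files: " + writer.physicalFileCount());
    }
}
```

In this sketch a physical file can only be deleted once every segment in it is obsolete, which hints at why the plan lists storage amplification as a separate step.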

For more details, please refer to the attached design doc.

Attachments

Issue Links

links to

Design doc

GitHub Pull Request #8751

Sub-Tasks

1.	Introduce all the new components to support FSCSOS	Open	Unassigned
2.	Adjust components that need be modified to support FSCSOS	Open	Unassigned
3.	Reduce storage amplification for FSCSOS	Open	Unassigned
4.	Reuse single FileSegmentCheckpointStateOutputStream for multiple checkpoints	Open	Unassigned

People

Assignee: Unassigned

Reporter: Congxian Qiu

Votes: 0

Watchers: 29

Dates

Created: 16/Mar/19 08:01

Updated: 10/Jul/24 06:57

Time Tracking

Estimated: Not Specified

Remaining:

Logged: 0.5h
