Details
- Type: Improvement
- Status: Reopened
- Priority: Not a Priority
- Resolution: Unresolved
Description
A customer runs a Flink job with the RocksDB state backend. Checkpoints are retained and taken incrementally, and the state size is several TB. When they restore and downscale from a retained checkpoint, downloading the checkpoint files takes only ~20 minutes, but job throughput returns to the expected level only after roughly 3 hours.
I do not have RocksDB logs. The suspicion is that those 3 hours are spent on heavy RocksDB compaction and/or flushes, since it was observed that checkpoints could not finish fast enough due to a long synchronous checkpoint phase. How can we make this restore phase shorter?
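Since the RocksDB logs are missing, it may be worth confirming the suspicion first. As a minimal sketch (assuming Flink's built-in RocksDB native metrics options; key availability depends on the Flink version), the following flink-conf.yaml entries would expose compaction and flush activity after the restore:

state.backend.rocksdb.metrics.compaction-pending: true
state.backend.rocksdb.metrics.num-running-compactions: true
state.backend.rocksdb.metrics.num-running-flushes: true
state.backend.rocksdb.metrics.mem-table-flush-pending: true

If these climb and stay high for hours after the download finishes, that would support the compaction/flush theory.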
For compaction, I think it is worth checking the improvement from:
CompactionPri compaction_pri = kMinOverlappingRatio;
which has been made the default in RocksDB 6.x:
// In Level-based compaction, it determines which file from a level to be
// picked to merge to the next level. We suggest people try
// kMinOverlappingRatio first when you tune your database.
enum CompactionPri : char {
  // Slightly prioritize larger files by size compensated by #deletes
  kByCompensatedSize = 0x0,
  // First compact files whose data's latest update time is oldest.
  // Try this if you only update some hot keys in small ranges.
  kOldestLargestSeqFirst = 0x1,
  // First compact files whose range hasn't been compacted to the next level
  // for the longest. If your updates are random across the key space,
  // write amplification is slightly better with this option.
  kOldestSmallestSeqFirst = 0x2,
  // First compact files whose ratio between overlapping size in next level
  // and its size is the smallest. It in many cases can optimize write
  // amplification.
  kMinOverlappingRatio = 0x3,
};
...
// Default: kMinOverlappingRatio
CompactionPri compaction_pri = kMinOverlappingRatio;
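On the Flink side, column-family options can be overridden with a custom options factory. Below is a minimal sketch, assuming the RocksDBOptionsFactory interface from recent Flink releases and the RocksJava CompactionPriority enum; pinning the priority explicitly would also help on Flink versions that still bundle RocksDB 5.x, where the default is kByCompensatedSize. The class name is hypothetical.

import java.util.Collection;

import org.apache.flink.contrib.streaming.state.RocksDBOptionsFactory;
import org.rocksdb.ColumnFamilyOptions;
import org.rocksdb.CompactionPriority;
import org.rocksdb.DBOptions;

public class CompactionPriOptionsFactory implements RocksDBOptionsFactory {

    @Override
    public DBOptions createDBOptions(DBOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
        // compaction_pri is a column-family option, so no DB-level changes here.
        return currentOptions;
    }

    @Override
    public ColumnFamilyOptions createColumnOptions(ColumnFamilyOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
        // Java equivalent of "compaction_pri = kMinOverlappingRatio" in the C++ options above.
        return currentOptions.setCompactionPriority(CompactionPriority.MinOverlappingRatio);
    }
}

The factory would then be registered on the state backend, e.g.:

EmbeddedRocksDBStateBackend backend = new EmbeddedRocksDBStateBackend(true);
backend.setRocksDBOptions(new CompactionPriOptionsFactory());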