FLINK-19008

Flink Job runs slow after restore + downscale from an incremental checkpoint (rocksdb)


Details

    Description

      A customer runs a Flink job with the RocksDB state backend. Checkpoints are retained and taken incrementally, and the state size is several TB. When they restore and downscale from a retained checkpoint, downloading the checkpoint files takes only ~20 min, but the job throughput returns to the expected level only after 3 hours.

      I do not have RocksDB logs. My suspicion is that those 3 hours are spent on heavy RocksDB compaction and/or flushing, because checkpoints were observed to not finish fast enough due to a long synchronous checkpoint phase (sync duration). How can we make this restore phase shorter?

      For compaction, I think it is worth checking whether the following setting improves things:

      CompactionPri compaction_pri = kMinOverlappingRatio;

      which is the default in RocksDB 6.x:

      // In Level-based compaction, it Determines which file from a level to be
      // picked to merge to the next level. We suggest people try
      // kMinOverlappingRatio first when you tune your database.
      enum CompactionPri : char {
        // Slightly prioritize larger files by size compensated by #deletes
        kByCompensatedSize = 0x0,
        // First compact files whose data's latest update time is oldest.
        // Try this if you only update some hot keys in small ranges.
        kOldestLargestSeqFirst = 0x1,
        // First compact files whose range hasn't been compacted to the next level
        // for the longest. If your updates are random across the key space,
        // write amplification is slightly better with this option.
        kOldestSmallestSeqFirst = 0x2,
        // First compact files whose ratio between overlapping size in next level
        // and its size is the smallest. It in many cases can optimize write
        // amplification.
        kMinOverlappingRatio = 0x3,
      };
      ...
      // Default: kMinOverlappingRatio  
      CompactionPri compaction_pri = kMinOverlappingRatio;
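
To make the difference between the priorities concrete, here is a toy sketch of the selection rule described in the header comments above. It is not RocksDB code: `ToyFile`, `PickMinOverlappingRatio` and `PickLargest` are hypothetical names, and the model ignores everything else the real compaction picker takes into account (delete compensation, files already being compacted, and so on).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for an SST file in one level.
// This is NOT a RocksDB type.
struct ToyFile {
  std::string name;
  std::uint64_t size_bytes;          // size of the file itself
  std::uint64_t overlap_bytes_next;  // bytes in the next level that overlap its key range
};

// Toy version of the kMinOverlappingRatio idea: pick the file whose
// (overlapping bytes in next level) / (own size) ratio is smallest.
// Returns files.size() if no file qualifies.
std::size_t PickMinOverlappingRatio(const std::vector<ToyFile>& files) {
  std::size_t best = files.size();
  double best_ratio = 0.0;
  for (std::size_t i = 0; i < files.size(); ++i) {
    if (files[i].size_bytes == 0) continue;  // avoid division by zero
    const double ratio = static_cast<double>(files[i].overlap_bytes_next) /
                         static_cast<double>(files[i].size_bytes);
    if (best == files.size() || ratio < best_ratio) {
      best = i;
      best_ratio = ratio;
    }
  }
  return best;
}

// Toy contrast for kByCompensatedSize: simply pick the largest file
// (the real option also compensates the size by the number of deletes).
// Returns files.size() if the input is empty.
std::size_t PickLargest(const std::vector<ToyFile>& files) {
  std::size_t best = files.size();
  for (std::size_t i = 0; i < files.size(); ++i) {
    if (best == files.size() || files[i].size_bytes > files[best].size_bytes) {
      best = i;
    }
  }
  return best;
}
```

In this model, a 64-byte file overlapping 32 bytes in the next level (ratio 0.5) is picked ahead of a 256-byte file overlapping 1024 bytes (ratio 4), because merging it rewrites less data per byte moved down. That is the intuition behind the "can optimize write amplification" remark in the header.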

          People

            ym Yuan Mei
            qinjunjerry Jun Qin