  1. Kudu
  2. KUDU-2047

Lazy cfile open and maintenance op stat caching cause fruitful delta compaction ops to never run



    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.0
    • Fix Version/s: 1.12.0
    • Component/s: compaction, perf, tablet


      I was looking at a cluster with a large amount of REDO data on some of its tablets and wasn't sure why that data was never being compacted. The issue appears to be the following:

      • in DiskRowSet::DeltaStoresCompactionPerfImprovementScore(), we call through to GetColumnIdsWithUpdates() to see which columns may need compaction
        • if a REDO delta block is not open (e.g. when the server has recently started), this skips the stats for the unopened delta file and does not include them in the result
        • we thus determine that the compaction is not fruitful

      This was a conscious decision to keep the MM (maintenance manager) from eagerly opening every delta file on its first pass through computing compaction stats. We figured that if the data were worth compacting, someone would probably scan it, forcing the deltas to be opened and thus made eligible for compaction.

      However, the MM tries to be smart about caching the statistics (see e7fe0c1a94cac364522c09b8208c98480947d794). In particular, if it sees that the tablet has not run any flushes or compactions, it won't bother to recalculate the stats, assuming they haven't changed.

      So, if you have a completely read-only tablet with some uncompacted deltas, the MM op will never run.
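
      To make the interaction concrete, here is a minimal, self-contained sketch of the two behaviors described above. It is a toy model, not Kudu's code: DeltaFile, Tablet, CachedCompactionStats, and every member name are hypothetical stand-ins, and the "score" is just a count of updated cells.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins; these are NOT Kudu's real classes.
struct DeltaFile {
  bool opened = false;        // lazily opened on first read
  int64_t updated_cells = 0;  // only knowable once the file is open
};

struct Tablet {
  std::vector<DeltaFile> redo_deltas;
  int64_t flushes_and_compactions = 0;  // bumped by any flush or compaction

  // Models the lazy-open behavior: stats of unopened delta files are skipped.
  double CompactionScore() const {
    int64_t updates = 0;
    for (const DeltaFile& f : redo_deltas) {
      if (!f.opened) continue;  // unopened file contributes nothing
      updates += f.updated_cells;
    }
    return static_cast<double>(updates);
  }

  // A scan forces every delta file open, but is neither a flush nor a compaction.
  void Scan() {
    for (DeltaFile& f : redo_deltas) f.opened = true;
  }
};

// Models the stat caching: recompute only if a flush or compaction happened.
class CachedCompactionStats {
 public:
  double Score(const Tablet& t) {
    if (valid_ && t.flushes_and_compactions == last_seen_) {
      return cached_score_;  // assumed unchanged, so reuse the old value
    }
    cached_score_ = t.CompactionScore();
    last_seen_ = t.flushes_and_compactions;
    valid_ = true;
    return cached_score_;
  }

 private:
  bool valid_ = false;
  int64_t last_seen_ = 0;
  double cached_score_ = 0.0;
};
```

      In this model, a freshly restarted tablet holding one unopened delta file with updates gets a score of 0 on the MM's first pass. A later Scan() opens the file, so Tablet::CompactionScore() would now be positive, but CachedCompactionStats::Score() keeps returning the cached 0 because no flush or compaction has invalidated it.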




            • Assignee:
              awong Andrew Wong
              tlipcon Todd Lipcon


              • Created: