Description
Kudu has a (misleadingly named) metric called on_disk_size defined in tablet.cc with the metric description "Tablet size on disk".
As of 1.3.1, this metric counts only the bytes in the base data and the REDO deltas of the DiskRowSets, plus the data in the MemRowSet. It does not include UNDO deltas, nor does it include data in the WALs or other metadata files.
The easy way to improve this situation is to change the description of the metric to "Space used by this tablet's data blocks" and add UNDO deltas to the count. However, that would be a 2-step process.
The metric is currently tied to Tablet::EstimateOnDiskSize(). If you trace that down to the DiskRowSet you will end up at a function in DiskRowSet:
uint64_t DiskRowSet::EstimateOnDiskSize() const {
  DCHECK(open_);
  shared_lock<rw_spinlock> l(component_lock_);
  return base_data_->EstimateOnDiskSize() + delta_tracker_->EstimateOnDiskSize();
}
In the DeltaTracker, you can see that we are only counting REDO deltas, not UNDO deltas:
uint64_t DeltaTracker::EstimateOnDiskSize() const {
  shared_lock<rw_spinlock> lock(component_lock_);
  uint64_t size = 0;
  for (const shared_ptr<DeltaStore>& ds : redo_delta_stores_) {
    size += ds->EstimateSize();
  }
  return size;
}
However, DeltaTracker::EstimateOnDiskSize() is also used by the MM op MajorDeltaCompactionOp::UpdateStats(), which eventually calls into DiskRowSet::DeltaStoresCompactionPerfImprovementScore(). That function calls EstimateDeltaDiskSize(), which has the following implementation:
uint64_t DiskRowSet::EstimateDeltaDiskSize() const {
  DCHECK(open_);
  shared_lock<rw_spinlock> l(component_lock_);
  return delta_tracker_->EstimateOnDiskSize();
}
So, in order not to break that estimation, we will need to separate the two, providing a way to estimate the REDO delta size separately from the size of all of the deltas in a RowSet.
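As a rough illustration of that separation, here is a minimal standalone sketch. It is not Kudu code: FakeDeltaStore, FakeDeltaTracker, FakeDiskRowSet, the method names EstimateRedoDeltaOnDiskSize() and EstimateRedoDeltaDiskSize(), and the undo_delta_stores_ member are all hypothetical stand-ins, and locking and DCHECKs are left out. The idea is only that the compaction score keeps reading a REDO-only estimate while the metric reads an estimate covering both REDO and UNDO deltas.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a delta store; only its size estimate is modeled.
struct FakeDeltaStore {
  uint64_t size;
  uint64_t EstimateSize() const { return size; }
};

// Simplified model of a delta tracker with separate REDO and UNDO store lists.
// Locking is omitted; undo_delta_stores_ is an assumed member name.
class FakeDeltaTracker {
 public:
  void AddRedo(uint64_t bytes) { redo_delta_stores_.push_back(FakeDeltaStore{bytes}); }
  void AddUndo(uint64_t bytes) { undo_delta_stores_.push_back(FakeDeltaStore{bytes}); }

  // REDO deltas only: the quantity the compaction score is based on today.
  uint64_t EstimateRedoDeltaOnDiskSize() const { return SumSizes(redo_delta_stores_); }

  // All deltas (REDO + UNDO): the quantity the size metric should report.
  uint64_t EstimateOnDiskSize() const {
    return SumSizes(redo_delta_stores_) + SumSizes(undo_delta_stores_);
  }

 private:
  static uint64_t SumSizes(const std::vector<FakeDeltaStore>& stores) {
    uint64_t size = 0;
    for (const FakeDeltaStore& ds : stores) {
      size += ds.EstimateSize();
    }
    return size;
  }

  std::vector<FakeDeltaStore> redo_delta_stores_;
  std::vector<FakeDeltaStore> undo_delta_stores_;
};

// Simplified model of a rowset: base data size plus its delta tracker.
class FakeDiskRowSet {
 public:
  FakeDiskRowSet(uint64_t base_data_bytes, FakeDeltaTracker tracker)
      : base_data_bytes_(base_data_bytes), delta_tracker_(tracker) {}

  // Used by the metric: base data plus all deltas.
  uint64_t EstimateOnDiskSize() const {
    return base_data_bytes_ + delta_tracker_.EstimateOnDiskSize();
  }

  // Used by compaction scoring: REDO deltas only, so the score is unchanged.
  uint64_t EstimateRedoDeltaDiskSize() const {
    return delta_tracker_.EstimateRedoDeltaOnDiskSize();
  }

 private:
  uint64_t base_data_bytes_;
  FakeDeltaTracker delta_tracker_;
};
```

With this split, a rowset with 1000 bytes of base data, 150 bytes of REDO deltas and 30 bytes of UNDO deltas would report 1180 bytes for the metric, while the compaction score would still see only the 150 bytes of REDO deltas.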
Issue Links
- blocks KUDU-1755 Improve tablet disk space estimation (Resolved)