Details
- Bug
- Status: Resolved
- Critical
- Resolution: Fixed
- 1.1.0
- None
Description
(Prompted by this user post)
The on-disk size of tablets as reported by the Kudu web UI omits some minor as well as some major sources of space consumption. I'm listing them all here for posterity.
- Bloom file and composite index file usage. According to this gerrit (warning: internal link), these are omitted because the rowset estimate is also used to determine how much IO compacting that rowset would generate, and bloom/composite index files aren't touched in compaction.
- UNDO file usage. This seems like a more glaring omission, especially for mutation-heavy workloads like the one reported on the mailing list. However, the current REDO-only estimate feeds into the maintenance manager's major delta compaction decisions, so there may be a good reason for it too.
- Log block manager block size rounding. The LBM rounds up Kudu blocks to the nearest filesystem block size to improve hole punching space reclamation. A side effect is that some space is lost to external fragmentation.
- Log block manager metadata overhead. Every container has a .metadata file, and we don't factor that into space utilization.
- Other files, such as the tablet superblock, WAL segments, and cmeta.
I expect the first two items to be the largest, so we should work on addressing them. Let's decouple the UI-based estimate from the maintenance manager (MM) path so that our reporting can be more accurate while still allowing the MM to make good decisions.
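To give a rough feel for the block size rounding item above, here is a minimal sketch of how rounding each block up to a filesystem block boundary turns into unreported space. It is not Kudu code: the function names are hypothetical, and the 4096-byte filesystem block size is an assumption chosen for illustration.

```python
from typing import Iterable


def round_up_to_block(size_bytes: int, fs_block_size: int = 4096) -> int:
    """Round size_bytes up to the next multiple of fs_block_size.

    fs_block_size defaults to 4096, which is an assumed value for
    illustration, not something read from a real filesystem.
    """
    if size_bytes < 0 or fs_block_size <= 0:
        raise ValueError("size must be >= 0 and block size must be > 0")
    # Ceiling division using integers only, then scale back up.
    return -(-size_bytes // fs_block_size) * fs_block_size


def rounding_overhead(block_sizes: Iterable[int], fs_block_size: int = 4096) -> int:
    """Total bytes lost to rounding across all the given block sizes."""
    return sum(round_up_to_block(s, fs_block_size) - s for s in block_sizes)


if __name__ == "__main__":
    # Hypothetical block sizes in bytes.
    sizes = [1, 4096, 4097]
    print(rounding_overhead(sizes))
```

In this sketch a block that is already aligned costs nothing, while a block one byte over a boundary wastes almost a full filesystem block, so many small blocks could add up to a noticeable gap between the reported and actual on-disk size.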
Issue Links
- is blocked by: KUDU-2001 Metric on_disk_size does not include UNDO deltas (Resolved)
- is depended upon by: KUDU-1067 Add metrics for tablet row count, size on disk (Resolved)
- is related to: KUDU-1830 Reduce Kudu WAL log disk usage (Resolved)
- relates to: KUDU-624 Apparent data leak in log block manager on ITBLL cluster (Resolved)