[KUDU-2014] Explore additional approaches to improve LBM startup time - ASF JIRA

Details

Type: Improvement
Status: In Progress
Priority: Major
Resolution: Unresolved
Affects Version/s: 1.4.0
Fix Version/s: None
Component/s: fs
Labels:
- data-scalability
- roadmap-candidate

Description

The fix for ~~KUDU-1549~~ added support for deleting full log block manager containers with no live blocks, and for compacting container metadata to omit CREATE/DELETE record pairs. Both of these will help reduce the amount of metadata that must be read at startup. However, there's more we can do to help; this JIRA captures some additional ideas worth exploring (if/when LBM startup once again becomes intolerable):

In this gerrit, Todd made the case that container metadata processing is seek-dominant:

looking at a data/ dir on a cluster that has been around for quite some time, most of the metadata files seem to be around 400KB. Assuming 100MB/sec sequential throughput and 10ms seek, it definitely seems like the startup time would be seek-dominated (10 or 20ms seek depending whether various internal metadata pages are hot in cache, plus only 4ms of sequential read time).
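The arithmetic behind that estimate can be checked with a short sketch. The figures (400KB file, 100MB/sec throughput, 10ms seek) come from the quote above; the function name and the use of decimal units are assumptions made for illustration only.

```python
def metadata_read_cost_ms(file_size_bytes, throughput_bytes_per_s, seek_ms, seeks):
    """Return (total seek time, sequential read time) in ms for one metadata file.

    Hypothetical helper for illustration; not part of Kudu.
    """
    sequential_ms = file_size_bytes * 1000.0 / throughput_bytes_per_s
    seek_total_ms = float(seek_ms * seeks)
    return seek_total_ms, sequential_ms


# Figures from the quote, assuming decimal units (1KB = 1000 bytes).
FILE_SIZE = 400 * 1000            # ~400KB metadata file
THROUGHPUT = 100 * 1000 * 1000    # 100MB/sec sequential throughput
SEEK_MS = 10                      # 10ms per seek

for seeks in (1, 2):  # one seek if internal metadata pages are hot, two if not
    seek_cost, read_cost = metadata_read_cost_ms(FILE_SIZE, THROUGHPUT, SEEK_MS, seeks)
    share = seek_cost / (seek_cost + read_cost)
    print(f"{seeks} seek(s): seek={seek_cost:.0f}ms read={read_cost:.0f}ms "
          f"seek share={share:.0%}")
```

Under these assumptions, seeking accounts for roughly 70 to 85 percent of the time spent on each metadata file, which is why the ideas below focus on reading fewer files rather than fewer bytes.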

We theorized several ways to reduce seeking, all focused on reducing the number of discrete container metadata files read at startup:

- Raise the container max data file size. This won't help on older versions of el6 with ext4, but will help everywhere else. It makes sense for the max data file size to be a function of the disk size anyway. And it's a pretty cheap way to extract more scalability.
- Reuse container data file holes, explicitly to avoid creating so many containers. Perhaps with a round of "defragmentation" to simplify reuse, or perhaps not. As a side effect, metadata file compaction now becomes more important (and costly).
- Eschew one metadata file per data file altogether and maintain just one metadata file. Deleting "dead" containers would no longer be an improvement for metadata startup cost. Metadata compaction would be a lot more expensive. Block records themselves would be larger, because each record now needs to point to a particular data file, though this can be mitigated in various ways. A variant of this would be to do away with the 1-1 relationship between metadata and data files and make it more like m-n.
- Reduce the number of extents in container metadata files via judicious preallocation.
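As a rough illustration of the first idea, the sketch below estimates how the number of container metadata files, and therefore the seek-bound part of startup, shrinks as the max data file size grows. All sizes are hypothetical, and the model is deliberately crude: it assumes every container is full, one metadata file per container, one 10ms seek per file, and serial reads from a single disk.

```python
def estimate_startup_seek_cost(data_bytes, max_container_bytes,
                               seek_ms=10.0, seeks_per_file=1):
    """Return (container count, seek-bound startup time in seconds).

    Hypothetical model for illustration; it does not reflect Kudu's actual
    container placement or its parallel startup across data directories.
    """
    # Ceiling division: a partially filled container still has a metadata file.
    containers = -(-data_bytes // max_container_bytes)
    seek_time_s = containers * seeks_per_file * seek_ms / 1000.0
    return containers, seek_time_s


GIB = 1024 ** 3
DATA_ON_DISK = 4096 * GIB  # hypothetical: 4 TiB of block data on one disk

for max_size_gib in (10, 40, 160):  # illustrative sizes, not Kudu defaults
    count, seconds = estimate_startup_seek_cost(DATA_ON_DISK, max_size_gib * GIB)
    print(f"max data file size {max_size_gib:>3} GiB -> "
          f"{count} metadata files, ~{seconds:.2f}s of seeking")
```

In this model the seek cost falls in proportion to the container size, which is the sense in which raising the limit is a cheap way to gain scalability. It says nothing about the cost of compacting the larger metadata files that result.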

See the gerrit linked above for more details.

Issue Links

Blocked

KUDU-2977 Sharding block map of LogBlockManager

Resolved

is duplicated by

KUDU-2654 log block manager very slow

Resolved

is related to

KUDU-1549 LBM should start up faster

Resolved

relates to

KUDU-2636 LBM should delete full and dead containers at runtime

Resolved

People

Assignee: Ashwani Raina

Reporter: Adar Dembo

Votes: 0

Watchers: 7

Dates

Created: 15/May/17 23:19

Updated: 29/Sep/22 12:31