[KUDU-3127] Have replica placement and rebalancing consider non-uniform hardware - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: CLI, master
Labels:
- roadmap-candidate

Description

We've seen multiple deployments suffer from the fact of life that data centers don't always have uniform hardware. Often times, racks are comprised of whatever hardware we can salvage from other projects. As such, Kudu's assumption that all tablet servers should be treated equally (sans location awareness) can be a bad one.

There are a few pieces to making this better:

Having Kudu determine the relative capacities of each tablet servers (either automatically, or as input by an operator)
Updating the replica placement policy to account for capacity across tablet servers
Bonus: have Kudu account for the current size used on each tablet server

Some things that might be worth considering:

It seems reasonable to assume that each data directory is independent of one another, so we should be able to determine with relative ease the total capacity of a server by aggregating the total capacities of its data directories. This doesn't account for colocated WAL directories, but that might be a fine limitation, since we expect WAL usage to vary wildly as ingest workloads vary. The capacity could be heartbeated to masters periodically, or fetched via RPC by rebalancer tooling.
Updating the placement policy seems trickier, since there are a lot of nice properties with using the PO2C algorithm (e.g. eventual fixing of skew), and with assuming that all tablets have equal weight (e.g. it's harder to fall into the trap of moving a replica, only to move it back). Some variant of PO2C, but based on available space instead of replica count might be worth considering for initial placement and for defining balance.

Attachments

Activity

People

Assignee:: Unassigned

Reporter:: Andrew Wong

Votes:: 0 Vote for this issue

Watchers:: 2 Start watching this issue

Dates

Created:: 21/May/20 01:42

Updated:: 28/May/20 14:22