[KUDU-3147] Balance tablets based on range hash buckets - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 1.12.0
Fix Version/s: 1.17.0
Component/s: master, perf
Labels:

Description

When a user defines a schema that uses range + hash partitioning, it is often the case that the tablets in the latest range (based on time or any semi-sequential data) are the only tablets that receive writes. Even when it is not the latest range, it is common for a single range to receive a burst of writes when backloading data.

This pattern is so common that the default Kudu balancing scheme should consider placing and rebalancing the tablets for the hash buckets within each range on as many tablet servers as possible, in order to support the maximum write throughput. In that case, if the cluster is perfectly balanced, `min(#buckets, #total-cluster-tservers)` tservers will handle the writes. Today, even in a perfectly balanced cluster, it is possible for all the hash buckets of a range to be on a single tserver.
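To illustrate the arithmetic above, here is a minimal Python sketch. It is not Kudu code: the function names, the `ts-N` server names, and the round-robin placement are all assumptions made for illustration, and replication is ignored (one replica per tablet). It contrasts a cluster that is balanced by tablet count but has one range pinned to a single tserver with a placement that spreads a range's hash buckets.

```python
from collections import Counter

def max_write_tservers(num_buckets, num_tservers):
    """Upper bound on tservers that can absorb writes to a single range."""
    return min(num_buckets, num_tservers)

def spread_range(num_buckets, tservers):
    """Hypothetical placement: assign one range's hash buckets round-robin."""
    return {bucket: tservers[bucket % len(tservers)]
            for bucket in range(num_buckets)}

tservers = ["ts-0", "ts-1", "ts-2", "ts-3"]  # illustrative names

# Balanced by tablet count, yet skewed by range:
# all 4 hash buckets of range r live on tserver r.
skewed = {(r, b): tservers[r] for r in range(4) for b in range(4)}
per_server = Counter(skewed.values())
hot_range_servers = {ts for (r, _), ts in skewed.items() if r == 3}

# Range-aware placement of the same 4 buckets.
spread = spread_range(4, tservers)

print(dict(per_server))            # every tserver holds 4 tablets
print(len(hot_range_servers))      # tservers serving the latest range
print(len(set(spread.values())))   # tservers serving it when spread
print(max_write_tservers(4, len(tservers)))
```

In the skewed layout every tserver holds the same number of tablets, so a count-based balancer sees nothing to fix, yet writes to the latest range all land on one server.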

Issue Links

is related to

KUDU-2823 Place tablet replicas based on dimension (Resolved)

KUDU-2974 Make the rebalancer honor dimension-based replica placement (Resolved)

relates to

KUDU-3061 Balance tablet leaders across TServers (Open)

People

Assignee: Ravi Bhanot

Reporter: Grant Henke

Votes: 2

Watchers: 4

Dates

Created: 09/Jun/20 03:20

Updated: 08/Apr/22 21:16

Resolved: 08/Apr/22 21:16