[KUDU-3124] A safer way to handle {Create,Delete}Tablet requests - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 1.2.0, 1.3.0, 1.3.1, 1.4.0, 1.5.0, 1.6.0, 1.7.0, 1.8.0, 1.7.1, 1.9.0, 1.10.0, 1.10.1, 1.11.0, 1.12.0, 1.11.1, 1.13.0
Fix Version/s: None
Component/s: master, tserver
Labels:
None

Description

As of now, catalog manager (a part of kudu-master) sends CreateTabletRequest and DeleteTabletRequest RPCs
as soon as they are realized by CatalogManager::ProcessPendingAssignments()
when processing the list of deferred DDL operations, and at this level there
isn't any restrictions on how many of those might be in flight or sent to
a particular tablet server (NOTE: there is --max_create_tablets_per_ts flag,
but it works on a higher level and only during initial creation of a table).

The CreateTablet requests are sent asynchronously, and if the tablet isn't
created within --tablet_creation_timeout_ms milliseconds, catalog manager
replaces all the tablet replicas, generating a new tablet UUID and sending
corresponding CreateTabletRequest RPCs to a potentially different set of tablet
servers. Corresponding DeleteTabletRequest RPCs (to remove the replicas of the
stalled-during-creation tablet) are sent separately in an asynchronous way
as well.

There are at least two issues with this approach:

The --max_create_tablets_per_ts threshold limits the number of concurrent requests hitting one tablet server only during the initial creation of a table. However, nothing limits how many requests to create a table replica might hit a tablet server when adding partitions to an existing table as a result of ALTER TABLE request.
DeleteTabletRequest RPCs sometimes might not get into the RPC queues of
corresponding tablet servers, and catalog manager stops retrying sending those
after --unresponsive_ts_rpc_timeout_ms interval. This might spiral into a situation when requests to create replacement tablet replicas are passing through and executed by tablet servers, but corresponding requests to delete tablet replica cannot get through because of queue overflows, with catalog manager eventually giving up retrying the latter ones. Eventually, tablet servers end up with huge number of tablet replicas created, and they crash running out of memory. The crashed tablet servers cannot start after that because they eventually run out of memory trying to bootstrap the huge number of tablet replicas (running out of memory again). See https://gerrit.cloudera.org/#/c/15912/ for the reproduction scenario and KUDU-2453 for corresponding issue reported some time ago.

Ideally, the catalog manager should put all the generated {Create,Delete}TabletRequests into per-tserver queues and make sure at most N requests are being processed by a tablet server at a time (the N parameter is a replacement for --max_create_tablets_per_ts in this safer approach).

Attachments

Issue Links

is related to

KUDU-2453 kudu should stop creating tablet infinitely

Reopened

Activity

People

Assignee:: Unassigned

Reporter:: Alexey Serbin

Votes:: 0 Vote for this issue

Watchers:: 2 Start watching this issue

Dates

Created:: 18/May/20 21:56

Updated:: 30/Mar/22 04:59