[KUDU-1353] Master does not rebuild replica information soft state on failover - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Sub-task
Status: Resolved
Priority: Critical
Resolution: Fixed
Affects Version/s: 0.7.0
Fix Version/s: 0.10.0
Component/s: master
Labels:
None

Target Version/s:

1.0.0

Description

Under certain circumstances, a client's AlterTable request may get "stuck" in a quorum of masters, causing the client to time out. The flaky test dashboard usually shows a couple instances of this failure in master_failover-itest::TestRenameTableSync.

How does this happen? Let's assume the following:

Three masters
Table with one tablet, replicated three times.

It starts with a complicated master leader election in which master 1 is elected for term 1, master 2 for term 2, then master 1 steps down.
This means the three tservers have had a chance to register with both master 1 and master 2.
Now, the tablet's leader replica is elected, causing its next tablet report to include a "dirty" entry to master 2.
Master 2 is killed, master 1 is reelected for term 3.
The client issues AlterTable to master 1, but master 1 has no idea who the tablet leader is, so the AlterTablet RPC it issues fails.
At this point, there are now outstanding master->leader AlterTablet RPCs.
The tablet leader begins heartbeating to master 1. Its report is incremental and excludes registration information. Master 1 does not ask the tserver to register, because it already got that information in step 1 during the weird master leader election. The report includes the "tablet report" section but lists no tablets, so the master doesn't ask it to send a full report. No AlterTablet RPC is sent.
At this point, the alter table won't make forward progress until the tablet leader's role changes and an AlterTablet RPC is sent, which may not happen for a long time, or at all (in an integration test).

Some possible solutions:

When a tserver decides to heartbeat to a new leader master, it should always send a full tablet report.
When the master crafts an AlterTablet RPC and tries to find the leader, it currently uses "soft state" to do so, state that's only updated in the event of a tablet role change or full tablet report. Instead, we could use "hard state" which, even though the master may be newly elected, should already include up-to-date consensus configuration information.
Change tservers to always heartbeat to all masters. Doing so means that following a master leader election, the new master has up-to-date "soft state" and can find the leader when issuing an AlterTablet RPC.
Add a list of tservers to the master's "hard state", so that TSDescriptors (needed when finding the leader tablet) can be instantiated without the need for full tablet reports.

At the moment I'm not sure which approach makes the most sense, or if there's a better one out there.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending

first.txt
26/Feb/16 21:25
163 kB
Adar Dembo
second.txt
26/Feb/16 21:25
140 kB
Adar Dembo

Issue Links

is related to

KUDU-759 Sketchy logic in CatalogManager::BuildLocationsForTablet()

Resolved

Activity

People

Assignee:: Adar Dembo

Reporter:: Adar Dembo

Votes:: 0 Vote for this issue

Watchers:: 4 Start watching this issue

Dates

Created:: 26/Feb/16 20:57

Updated:: 14/Jul/16 16:01

Resolved:: 10/Jun/16 01:52