Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- Private Beta
- None
- None
Description
alter_table-test occasionally fails with a timeout because of what looks like the following race:
- The TS starts up and reports its tablet. We get to the part of HandleReportedTablet where we check the schema version, and the schema is up to date. However, the thread handling the tablet report hasn't yet reached the point of adding the reporting replica to the tablet's replica list.
- A client comes in with an Alter Table request. The request bumps the schema version and then iterates over all the tablets.
- The tablet doesn't yet have a location for the replica being reported, so no request is sent to that replica.
- We then finish handling the tablet report and add the replica to the list for this tablet.
After this, the alter table gets "stuck": the replica is now in the list but was never asked to apply the new schema version. It could be unstuck by restarting the TS (to trigger a new report), or probably by issuing another alter command.
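
The interleaving above can be reproduced deterministically with a toy model. This is only a sketch of the race as described here, not Kudu's actual code: the `Master` and `TabletServer` classes and all of their methods are hypothetical names, and report handling is split into two explicit steps so that the alter can be placed between them.

```python
from dataclasses import dataclass, field


@dataclass
class TabletServer:
    """Toy tablet server that only tracks the schema version it has applied."""
    name: str
    schema_version: int = 0


@dataclass
class Master:
    """Toy master: one tablet, a target schema version, and a replica list."""
    schema_version: int = 0
    replicas: list = field(default_factory=list)

    def report_needs_alter(self, ts):
        # First half of report handling: compare schema versions.
        return ts.schema_version < self.schema_version

    def report_add_replica(self, ts):
        # Second half of report handling: record the replica location.
        if ts not in self.replicas:
            self.replicas.append(ts)

    def send_alter(self, ts):
        # Stand-in for an alter request sent to one replica.
        ts.schema_version = self.schema_version

    def alter_table(self):
        # Bump the version, then notify only the replicas known right now.
        self.schema_version += 1
        for ts in self.replicas:
            self.send_alter(ts)

    def handle_report(self, ts):
        # Full, uninterrupted report handling.
        if self.report_needs_alter(ts):
            self.send_alter(ts)
        self.report_add_replica(ts)


def run_race():
    """Replay the four steps from the description in their racy order."""
    master, ts = Master(), TabletServer("ts-1")
    needs_alter = master.report_needs_alter(ts)  # schema looks up to date
    master.alter_table()                         # no replica known, nothing sent
    if needs_alter:
        master.send_alter(ts)                    # not reached: the check was stale
    master.report_add_replica(ts)                # replica added too late
    return master, ts


if __name__ == "__main__":
    master, ts = run_race()
    print("stuck:", ts.schema_version < master.schema_version)
    master.handle_report(ts)                     # a fresh report, as after a TS restart
    print("stuck after new report:", ts.schema_version < master.schema_version)
```

In this model the replica ends up in the list while still on the old schema version, and nothing prompts a retry until a new report arrives or another alter is issued, which matches the two workarounds mentioned above.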
Issue Links
- relates to KUDU-755 Master UI should report tablet locations even if they are not running (Resolved)