Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- 1.3.1
- None
Description
If a master has just started up and receives a heartbeat from a tablet server that triggers an action on another tablet server (for example, tombstoning an evicted replica, or adding a new replica to an under-replicated tablet config), the action can be lost. If the target tablet server (for example, the replica that the master believes is the leader) has not yet registered with the master since the master restarted, the action fails to be sent and is not retried.
This is because:
- There is a logic error in the catalog manager task management code that assumes all tablet servers have registered with the master at the time a task is started; and
- These kinds of tasks are edge-triggered (based on a response to a tablet report) instead of level-triggered (based on periodic state polling) on the master side.
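The interaction of the two causes above can be illustrated with a small toy model. This is only a sketch and not Kudu's catalog manager code: `ToyMaster`, its methods, and the `ts-1`/`ts-2` names are hypothetical, and the "level-triggered" variant simply assumes a periodic reconciliation pass that retries actions still owed.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMaster:
    """Hypothetical stand-in for the master; not Kudu's actual classes."""
    registered: set = field(default_factory=set)    # tablet servers registered since the master restarted
    pending: dict = field(default_factory=dict)     # target tablet server -> action still owed
    delivered: list = field(default_factory=list)   # (target, action) pairs actually sent

    def _send(self, target, action):
        """Try to send an action; fails if the target has not registered yet."""
        if target not in self.registered:
            return False
        self.delivered.append((target, action))
        return True

    def on_report_edge_triggered(self, target, action):
        """Act once, when the report arrives. A failed send is forgotten."""
        self._send(target, action)

    def on_report_level_triggered(self, target, action):
        """Record the desired action; delivery is left to reconcile()."""
        self.pending[target] = action

    def reconcile(self):
        """Periodic pass: retry every action that is still owed."""
        for target, action in list(self.pending.items()):
            if self._send(target, action):
                del self.pending[target]


def run(level_triggered):
    master = ToyMaster()
    # The master has just restarted; only ts-1 has registered so far.
    master.registered.add("ts-1")
    handler = (master.on_report_level_triggered if level_triggered
               else master.on_report_edge_triggered)
    # A heartbeat from ts-1 reveals that ts-2 holds an evicted replica.
    handler("ts-2", "tombstone replica")
    master.reconcile()               # ts-2 is still unknown: nothing can be sent
    master.registered.add("ts-2")    # ts-2 registers later
    master.reconcile()               # only helps if the action was remembered
    return master.delivered
```

In this model, `run(False)` delivers nothing because the single attempt happened before `ts-2` registered, while `run(True)` eventually delivers the tombstone request once `ts-2` is known.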
Issue Links
- is related to KUDU-1997 Catalog manager tasks are edge-triggered (Resolved)