[KUDU-1369] client does not fail over snapshot scans when querying lagging replicas - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 0.7.0
Fix Version/s: 1.2.0
Component/s: client
Labels: None
Target Version/s: 1.2.0

Description

If the client specifies SCAN_AT_SNAPSHOT and then tries to read from a replica, the replica may not have recent enough data to service the scan, or it may have operations that are "stuck" (started but not yet committed) because the leader recently crashed. In this case, the replica responds with 'Timed out: could not wait for desired snapshot timestamp to be consistent: Timed out waiting for all transactions with ts < P: 1457574158715836 usec, L: 0 to commit'. However, it's possible (likely, even) that another replica does have this operation committed. The client doesn't handle this error at the moment and instead propagates it to the caller, even when it could otherwise fail over to another replica.
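
To illustrate the behavior being asked for, here is a minimal, self-contained sketch of retrying a snapshot scan on the next replica when one replica times out waiting for the snapshot timestamp. It is a toy model, not the Kudu client: `Replica`, `safe_ts`, `SnapshotTimeoutError`, and `scan_with_failover` are hypothetical names invented for this example, and the "safe timestamp" comparison is an assumed stand-in for the server-side wait described above.

```python
from dataclasses import dataclass
from typing import List


class SnapshotTimeoutError(Exception):
    """Hypothetical stand-in for the 'Timed out: could not wait for desired
    snapshot timestamp to be consistent' error returned by a replica."""


@dataclass
class Replica:
    """Toy replica (not a Kudu class)."""
    name: str
    safe_ts: int      # assumed: highest timestamp below which all ops are committed
    rows: List[str]

    def scan_at_snapshot(self, snapshot_ts: int) -> List[str]:
        # A lagging replica, or one with stuck uncommitted operations,
        # cannot serve a snapshot newer than what it has fully committed.
        if self.safe_ts < snapshot_ts:
            raise SnapshotTimeoutError(
                f"{self.name}: timed out waiting for all transactions "
                f"with ts < {snapshot_ts} to commit"
            )
        return list(self.rows)


def scan_with_failover(replicas: List[Replica], snapshot_ts: int) -> List[str]:
    """Try each replica in turn; surface the error only if all of them fail."""
    if not replicas:
        raise ValueError("no replicas to scan")
    last_error = None
    for replica in replicas:
        try:
            return replica.scan_at_snapshot(snapshot_ts)
        except SnapshotTimeoutError as err:
            # Remember the failure and fall through to the next replica
            # instead of propagating it to the caller immediately.
            last_error = err
    raise last_error
```

With a lagging replica listed first and a caught-up replica second, `scan_with_failover` returns the rows from the second replica. The timeout reaches the caller only when every replica fails, which is the reverse of the current behavior where the first failure is propagated.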

People

Assignee: David Alves

Reporter: Todd Lipcon

Votes: 0

Watchers: 1

Dates

Created: 10/Mar/16 01:46

Updated: 19/Dec/16 05:39

Resolved: 19/Dec/16 05:39