[KUDU-572] Better timeout handling for Kudu clients, especially for Master requests - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: M4.5
Fix Version/s: None
Component/s: client, master
Labels: None
Target Version/s: M5

Description

"Suppose the admin operation timeout is 10 seconds, and that is the maximum timeout we use. How do we then handle the case where:
1) the server we're talking to is in an I/O pause (this is exactly what master_failover-itest tests), and
2) we want to retry and find a new master in the meantime, yet still keep a timeout on the whole operation?
Another idea I had for this is to use two timeouts:
1) a default RPC timeout (the minimum timeout, after which we retry the RPC), and
2) an overall timeout.
Right now, default_admin_operation_timeout is effectively (1) and select_master_timeout is (2).
This way we keep default_timeout as usual, but gain another timeout that we can use to detect slow nodes.
I think on the TS we rely for this on the quorum reporting the new leader to the master, and I think we're changing that too.
Any thoughts on this?"
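
The two-timeout idea in the comment above can be illustrated with a minimal sketch. This is not Kudu client code: `call_with_deadline`, `send_rpc`, `RpcTimeout`, and the parameter names are hypothetical, chosen only to show a short per-RPC timeout (1) nested inside an overall deadline (2), so that a paused server costs one short attempt rather than the whole budget.

```python
import time


class RpcTimeout(Exception):
    """Hypothetical error: a single RPC attempt exceeded its timeout."""


def call_with_deadline(candidates, send_rpc, rpc_timeout, overall_timeout,
                       clock=time.monotonic):
    """Try each candidate server in turn until one answers or time runs out.

    rpc_timeout     -- timeout for a single attempt, idea (1) in the comment
    overall_timeout -- budget for the whole operation, idea (2)
    send_rpc        -- hypothetical callable (server, timeout) -> response
                       that raises RpcTimeout when the server does not answer
    """
    if not candidates:
        raise ValueError("need at least one candidate server")
    deadline = clock() + overall_timeout
    attempts = 0
    while True:
        for server in candidates:
            remaining = deadline - clock()
            if remaining <= 0:
                raise TimeoutError(
                    f"no server answered within {overall_timeout}s "
                    f"after {attempts} attempts")
            attempts += 1
            try:
                # Never wait longer than what is left of the overall budget.
                return send_rpc(server, min(rpc_timeout, remaining))
            except RpcTimeout:
                # Server may be in an I/O pause; move on to the next one.
                continue
```

With, say, an overall timeout of 10 seconds and an assumed per-RPC timeout of 2 seconds, a client talking to a paused master would give up on it after 2 seconds and still have 8 seconds left to find the new master.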

See: http://gerrit.sjc.cloudera.com:8080/?l=1399#/c/5483/20/src/kudu/client/meta_cache.cc for discussion

People

Assignee: Adar Dembo
Reporter: Alex Feinberg
Votes: 0
Watchers: 2

Dates

Created: 11/Dec/14 03:10
Updated: 31/Mar/15 19:23
Resolved: 31/Mar/15 19:23