[IMPALA-10590] Ensure admissiond stays in sync with coordinators - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: Impala 4.0.0
Fix Version/s: Impala 4.0.0
Component/s: Backend
Labels: None

Description

Currently, it's possible for the admission service to have an incorrect view of which resources are in use in the cluster if there are RPC failures. For example, if the ReleaseQuery RPC fails, the coordinator will retry a few times and then give up. In this case, the query has completed but the admission service doesn't know, so it will not allow other queries to be scheduled with those resources.

We can solve this by adding a periodic heartbeat RPC from coordinators to the admission service. Each heartbeat will include the query IDs of all queries currently running at that coordinator, and the admission service can then clean up resources allocated to any queries that are not in the list, on the assumption that those queries must have completed already.
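
The reconciliation step described above can be illustrated with a minimal Python sketch. This is not Impala's implementation (the backend is C++); `AdmissionState`, `handle_heartbeat`, and the per-query memory figure are hypothetical names and simplifications introduced only for illustration. The sketch assumes the admission service tracks admitted queries per coordinator and that a heartbeat carries the full set of query IDs still running at that coordinator.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set


@dataclass
class AdmissionState:
    """Hypothetical stand-in for the admission service's bookkeeping."""

    # coordinator id -> {query id -> memory reserved for that query, in bytes}
    admitted: Dict[str, Dict[str, int]] = field(default_factory=dict)

    def admit(self, coordinator_id: str, query_id: str, mem_bytes: int) -> None:
        # Record the resources reserved for a newly admitted query.
        self.admitted.setdefault(coordinator_id, {})[query_id] = mem_bytes

    def release(self, coordinator_id: str, query_id: str) -> None:
        # Normal path: the coordinator's release call arrived successfully.
        self.admitted.get(coordinator_id, {}).pop(query_id, None)

    def reserved_bytes(self) -> int:
        # Total memory the service believes is in use across all coordinators.
        return sum(sum(queries.values()) for queries in self.admitted.values())

    def handle_heartbeat(
        self, coordinator_id: str, running_query_ids: Iterable[str]
    ) -> Set[str]:
        # Any query recorded for this coordinator but absent from the heartbeat
        # is assumed to have completed, so its resources are released.
        running = set(running_query_ids)
        queries = self.admitted.get(coordinator_id, {})
        stale = set(queries) - running
        for query_id in stale:
            del queries[query_id]
        return stale


# Usage: q1 finished, but its release call never reached the service.
state = AdmissionState()
state.admit("coord-1", "q1", 100)
state.admit("coord-1", "q2", 200)
cleaned = state.handle_heartbeat("coord-1", ["q2"])  # q1 is missing from the list
```

The sketch ignores ordering concerns that a real service would likely need to handle, such as a heartbeat that was sent before a query was admitted but is processed afterwards.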

People

Assignee: Thomas Tauber-Marshall

Reporter: Thomas Tauber-Marshall

Votes: 0

Watchers: 2

Dates

Created: 17/Mar/21 20:53

Updated: 25/Mar/21 01:38

Resolved: 25/Mar/21 01:38