[IGNITE-6965] affinityCall() with key mapping may not be successful with AlwaysFailoverSpi when node left - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 2.3
Fix Version/s: None
Component/s: cache, compute
Labels: None

Description

When calling affinityCall(cacheName, key, callable), there is a race between the affinity node leaving the grid (and then stopping) and AlwaysFailoverSpi reaching its maximum number of failover attempts.

Consider the following sequence (more likely when grid2.order >> grid1.order, i.e. grid2 joined the topology much later than grid1):

1. grid1.affinityCall(cacheName, key, callable)
2. grid1: the key is mapped to the primary partition on grid2.
3. grid2.stop()
4. grid1 receives NODE_LEFT and updates discoCache.
5. grid1: execution of the callable fails with 'Failed to send job request because remote node left grid (if fail-over is enabled, will attempt fail-over to another node'.
6. grid1: AlwaysFailoverSpi reaches its maximum number of failover attempts.
7. grid1: affinityCall fails with 'Job failover failed because number of maximum failover attempts for affinity call is exceeded'.
8. grid2 receives the verified node-left message and then stops.

CacheAffinityCallSelfTest, with the attached patch applied, reproduces the problem.
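
The race in the sequence above can be illustrated without a cluster. The following is a toy model in plain Java, not Ignite code and not the attached patch. It assumes that the key keeps mapping to the departed grid2 until some later remapping happens, which is an interpretation of the steps rather than something the report states. The class name, the `sendsBeforeRemap` parameter, and the limit of 5 are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the failover loop in the steps above. This is not Ignite code. */
final class AffinityFailoverRaceModel {
    /** Possible results of one simulated affinity call. */
    enum Outcome { COMPLETED, MAX_ATTEMPTS_EXCEEDED }

    /**
     * Simulates grid1 sending a job for a key whose primary node (grid2) has left.
     *
     * @param maxFailoverAttempts Failovers allowed before giving up (stand-in for the SPI limit).
     * @param sendsBeforeRemap    Hypothetical number of sends that still see the stale mapping.
     * @param log                 Collects a human-readable trace of the simulated events.
     */
    static Outcome affinityCall(int maxFailoverAttempts, int sendsBeforeRemap, List<String> log) {
        int failovers = 0;

        for (int send = 0; ; send++) {
            // Assumption: the key stays mapped to the departed grid2 until remapping happens.
            boolean staleMapping = send < sendsBeforeRemap;

            if (!staleMapping) {
                log.add("send " + send + ": key remapped to grid1, job executed");
                return Outcome.COMPLETED;
            }

            log.add("send " + send + ": failed, grid2 left the grid");

            // Every failed send consumes one failover attempt; once none remain, the call fails.
            if (failovers == maxFailoverAttempts) {
                log.add("maximum failover attempts exceeded");
                return Outcome.MAX_ATTEMPTS_EXCEEDED;
            }

            failovers++;
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();

        // Race lost: the mapping never changes while attempts are being consumed.
        Outcome lost = affinityCall(5, Integer.MAX_VALUE, log);
        log.forEach(System.out::println);
        System.out.println("race lost: " + lost);

        // Race won: the key is remapped after two failed sends.
        System.out.println("race won: " + affinityCall(5, 2, new ArrayList<>()));
    }
}
```

In this model the call succeeds only when remapping happens within the allowed number of failovers. If all retries are consumed while the mapping is still stale, the call fails even though another node could have executed the job, which corresponds to steps 5 to 7.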

Attachments

IGNITE_6965_affinityCall_with_key_mapping_AlwaysFailoverSpi_node_left.patch
20/Nov/17 21:26
6 kB
Alexandr Kuramshin

People

Assignee: Unassigned
Reporter: Alexandr Kuramshin
Votes: 0
Watchers: 3

Dates

Created: 20/Nov/17 20:13
Updated: 14/Nov/19 09:20