[ASTERIXDB-3498] Misalignment between operators assigned locations - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: COMP - Compiler
Labels:
- triaged

Description

Datasets are always created on the sorted cluster locations (of the node ids). For example, if there are two nodes with ids 'a' and 'b' each having 3 partitions, then one possible cluster locations could be as follows (depending on how the nodes join the cluster):
cluster locations = ['b', 'b', 'b', 'a', 'a', 'a']
When a dataset is created, its locations will be the sorted cluster locations which are used by data-scan operators:
['a', 'a', 'a', 'b', 'b', 'b']
This situation will cause failures for queries with operators getting assigned the cluster locations. This can happen for operators that are not connected to data-scan operators, and therefore they will use the cluster locations instead of inheriting the data-scan operators sorted locations. This misalignment results in a situation where the first activity of an operator runs in one partition like the build activity of hash-join, and the second probe activity runs in a completely different partition when those two activities should have run in the same partition.

Attachments

Activity

People

Assignee:: Ali Alsuliman

Reporter:: Ali Alsuliman

Votes:: 0 Vote for this issue

Watchers:: 2 Start watching this issue

Dates

Created:: 05/Sep/24 05:45

Updated:: 14/Sep/24 00:26