  1. Apache AsterixDB
  2. ASTERIXDB-3498

Misalignment between operators assigned locations

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • COMP - Compiler

    Description

      Datasets are always created on the sorted cluster locations (sorted by node id). For example, if there are two nodes with ids 'a' and 'b', each having 3 partitions, then one possible ordering of the cluster locations is as follows (depending on the order in which the nodes join the cluster):
       cluster locations = ['b', 'b', 'b', 'a', 'a', 'a']
      When a dataset is created, its locations are the sorted cluster locations, which are the locations used by data-scan operators:
       ['a', 'a', 'a', 'b', 'b', 'b']
      This causes failures for queries in which some operators are assigned the unsorted cluster locations. This can happen for operators that are not connected to data-scan operators: they use the cluster locations instead of inheriting the data-scan operators' sorted locations. As a result of this misalignment, the first activity of an operator (for example, the build activity of a hash join) runs in one partition, and the second activity (the probe) runs in a completely different partition, even though both activities should run in the same partition.
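
      The mismatch described above can be reproduced with a small, self-contained sketch. This is only an illustration of the two orderings from the example, not AsterixDB code: the names cluster_locations, dataset_locations, and misaligned_partitions are hypothetical and do not correspond to any compiler API.

```python
# Illustrative sketch only; none of these names are AsterixDB APIs.

# Cluster locations in node join order, as in the example above.
cluster_locations = ["b", "b", "b", "a", "a", "a"]

# Dataset locations are the cluster locations sorted by node id.
dataset_locations = sorted(cluster_locations)


def misaligned_partitions(first, second):
    """Return the partition indexes whose node differs between the two lists."""
    return [i for i, (x, y) in enumerate(zip(first, second)) if x != y]


# Partitions where an operator using the cluster locations would run on a
# different node than an operator using the sorted dataset locations.
mismatches = misaligned_partitions(cluster_locations, dataset_locations)
print(dataset_locations)
print(mismatches)
```

      With the join order from the example, every partition index maps to a different node in the two lists. Under these assumptions, any operator that is given the unsorted list cannot line up partition by partition with an operator that is given the sorted one.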

          People

            alsuliman Ali Alsuliman