[IGNITE-14588] Calcite integration. Wrong processing of nested aggregates - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: None
Component/s: sql
Labels:
- calcite

Description

The wrong plan is created when nested aggregates are used.
For example, this query:
SELECT avg(salary) FROM (SELECT avg(salary) as salary FROM employer UNION ALL SELECT salary FROM employer)
Generates such a plan:

IgniteReduceHashAggregate(group=[{}], AVG(SALARY)=[AVG($0)])
  IgniteExchange(distribution=[single])
    IgniteMapHashAggregate(group=[{}], AVG(SALARY)=[AVG($0)])
      IgniteUnionAll(all=[true])
        IgniteSingleHashAggregate(group=[{}], SALARY=[AVG($0)])
          IgniteIndexScan(table=[[PUBLIC, EMPLOYER]], index=[_key_PK], requiredColumns=[{3}])
        IgniteIndexScan(table=[[PUBLIC, EMPLOYER]], index=[_key_PK], requiredColumns=[{3}])

With this plan, in subquery data is aggregated locally on nodes and can produce the wrong results.
For example:

    @Test
    public void aggregateNested() throws Exception {
        String cacheName = "employer";

        IgniteCache<Integer, Employer> employer = client.getOrCreateCache(new CacheConfiguration<Integer, Employer>()
            .setName(cacheName)
            .setSqlSchema("PUBLIC")
            .setIndexedTypes(Integer.class, Employer.class)
            .setBackups(2)
        );

        awaitPartitionMapExchange(true, true, null);

        List<Integer> keysNode0 = primaryKeys(grid(0).cache(cacheName), 2);
        List<Integer> keysNode1 = primaryKeys(grid(1).cache(cacheName), 1);

        employer.putAll(ImmutableMap.of(
            keysNode0.get(0), new Employer("Igor", 1d),
            keysNode0.get(1), new Employer("Roman", 2d) ,
            keysNode1.get(0), new Employer("Nikolay", 3d)
        ));

        QueryEngine engine = Commons.lookupComponent(grid(1).context(), QueryEngine.class);

        List<FieldsQueryCursor<List<?>>> qry = engine.query(null, "PUBLIC",
            "SELECT avg(salary) FROM " +
                "(SELECT avg(salary) as salary FROM employer UNION ALL SELECT salary FROM employer)");

        assertEquals(1, qry.size());

        List<List<?>> rows = qry.get(0).getAll();
        assertEquals(1, rows.size());
        assertEquals(2d, F.first(F.first(rows)));
    }

With this reproducer we should get 2 as a result (avg(1, 2, 3) = 2, avg(2, 1, 2, 3) = 2), but actual result is 2.1 (avg(1, 2) = 1.5, avg (3) = 3, avg(1.5, 3, 1, 2, 3) = 2.1).

Root cause: default passThroughDistribution is not suitable for "reduce aggregate" and "single aggregate" nodes.

Attachments

Issue Links

is part of

IGNITE-12248 Apache Calcite based query execution engine

Open

links to

GitHub Pull Request #9039

Activity

People

Assignee:: Aleksey Plekhanov

Reporter:: Aleksey Plekhanov

Votes:: 0 Vote for this issue

Watchers:: 2 Start watching this issue

Dates

Created:: 19/Apr/21 07:58

Updated:: 04/May/21 06:40

Resolved:: 04/May/21 06:40

Time Tracking

Estimated:

Not Specified

Remaining:

Logged:

40m