Description
Summary
When using INNER join and a join condition that is always true, the RelToSqlConverter may generate an invalid SQL.
Reproducer (Kotlin; TPCH dataset)
val frameworkConfig = ... val relationalBuilder = RelBuilder.create(frameworkConfig) val rootRelationalNode = relationalBuilder .scan("tpch", "customer") .aggregate( relationalBuilder.groupKey( relationalBuilder.field("nation_name") ), relationalBuilder.count().`as`("cnt1") ) .project(relationalBuilder.field("nation_name"), relationalBuilder.field("cnt1")) .`as`("cust") .scan("tpch", "lineitem") .aggregate( relationalBuilder.groupKey(), relationalBuilder.count().`as`("cnt2") ) .project(relationalBuilder.field("cnt2")) .`as`("lineitem") .join(JoinRelType.INNER) .scan("tpch", "part") .join( JoinRelType.LEFT, relationalBuilder.call( SqlStdOperatorTable.EQUALS, relationalBuilder.field(2, "cust", "nation_name"), relationalBuilder.field(2, "part", "p_brand") ) ) .project( relationalBuilder.field("cust", "nation_name"), relationalBuilder.alias( relationalBuilder.call( SqlStdOperatorTable.MINUS, relationalBuilder.field("cnt1"), relationalBuilder.field("cnt2") ), "f1") ) .build() println( RelToSqlConverter(SqlDialect.DatabaseProduct.POSTGRESQL.dialect) .visitRoot(rootRelationalNode) .asStatement() .toSqlString(SqlDialect.DatabaseProduct.POSTGRESQL.dialect) .sql )
Result SQL
SELECT "t"."c_name", "t"."cnt1" - "t0"."cnt2" AS "f1" FROM (SELECT "c_name", COUNT(*) AS "cnt1" FROM "tpch"."customer" GROUP BY "c_name") AS "t", (SELECT COUNT(*) AS "cnt2" FROM "tpch"."lineitem") AS "t0" LEFT JOIN "tpch"."part" ON "t"."c_name" = "part"."p_brand"
which produces error (on Postgres):
[42P01] ERROR: invalid reference to FROM-clause entry for table "t" Hint: There is an entry for table "t", but it cannot be referenced from this part of the query. Position: 265
Expected SQL
SELECT "t"."c_name", "t"."cnt1" - "t0"."cnt2" AS "f1" FROM (SELECT "c_name", COUNT(*) AS "cnt1" FROM "tpch"."customer" GROUP BY "c_name") AS "t" CROSS JOIN (SELECT COUNT(*) AS "cnt2" FROM "tpch"."lineitem") AS "t0" LEFT JOIN "tpch"."part" ON "t"."c_name" = "part"."p_brand"
Suggestion for a fix
The CROSS JOIN syntax was introduced in SQL 92 and seems to be widely supported by DB engines, therefore I suggest to change the default in SqlDialect#emulateJoinTypeForCrossJoin from JoinType.COMMA to JoinType.CROSS.
I have checked all supported DB products from SqlDialect.DatabaseProduct and they all seem to support the CROSS JOIN syntax, except these, which I have not been able to verify: Infobright, Luciddb, Paraccel, Netezza. For sake of backward compatibility, I suggest to override the emulateJoinTypeForCrossJoin method in these dialects to behave as before, i.e. JoinType.COMMA.
Attachments
Attachments
Issue Links
- is related to
-
CALCITE-2856 Emulating COMMA JOIN as CROSS JOIN for SparkSqlDialect
- Closed