Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 10.6.1.0
    • Fix Version/s: 10.6.1.0
    • Component/s: SQL
    • Labels:
      None
    • Environment:
      any
    • Issue & fix info:
      Release Note Needed

      Description

      Implement the CROSS JOIN syntax

      SELECT * from t1 CROSS JOIN t2;

      as an alternative syntax to

      SELECT * FROM t1, t2;

      This should be pretty straight forward and ease the migration of SQL code to Derby.

      1. cross_v2.diff
        13 kB
        Knut Anders Hatlen
      2. cross_v3.diff
        21 kB
        Knut Anders Hatlen
      3. cross_v3.stat
        0.3 kB
        Knut Anders Hatlen
      4. cross.diff
        5 kB
        Knut Anders Hatlen
      5. releaseNote.html
        4 kB
        Knut Anders Hatlen

        Issue Links

          Activity

          Hide
          Knut Anders Hatlen added a comment -

          For the record, CROSS JOIN is feature F401-04 in SQL:2003. http://wiki.apache.org/db-derby/SQLvsDerbyFeatures

          Show
          Knut Anders Hatlen added a comment - For the record, CROSS JOIN is feature F401-04 in SQL:2003. http://wiki.apache.org/db-derby/SQLvsDerbyFeatures
          Hide
          Knut Anders Hatlen added a comment -

          Here's an untested patch that adds support for CROSS JOIN in the parser, and that seems to be enough to get it working.

          ij> create table t1(x int);
          0 rows inserted/updated/deleted
          ij> insert into t1 values 1,2;
          2 rows inserted/updated/deleted
          ij> create table t2(x char(1));
          0 rows inserted/updated/deleted
          ij> insert into t2 values 'a','b';
          2 rows inserted/updated/deleted
          ij> select * from t1 cross join t2;
          X |X
          ----------------
          1 |a
          1 |b
          2 |a
          2 |b

          4 rows selected

          Show
          Knut Anders Hatlen added a comment - Here's an untested patch that adds support for CROSS JOIN in the parser, and that seems to be enough to get it working. ij> create table t1(x int); 0 rows inserted/updated/deleted ij> insert into t1 values 1,2; 2 rows inserted/updated/deleted ij> create table t2(x char(1)); 0 rows inserted/updated/deleted ij> insert into t2 values 'a','b'; 2 rows inserted/updated/deleted ij> select * from t1 cross join t2; X |X ---------------- 1 |a 1 |b 2 |a 2 |b 4 rows selected
          Hide
          Dag H. Wanvik added a comment -

          Very cool! Thanks for this finding, Knut! Hope the tests prove that we can add this feature to our list!

          Nit:

          • sqlgrammar.jj
            The argument:
            "onClause, // should be null"

          I think I'd prefer to do it the other way around, i.e.
          "null, // onClause not used for cross join"

          • messages.xml:
            :
            <name>42Y14</name>
            <text>A join specification is not allowed with the ' {0}

            ' clause.</text>

          Why not say "Cross join" rather than "a join"? Do you foresee that the error message will be used for other kind of joins later?

          Show
          Dag H. Wanvik added a comment - Very cool! Thanks for this finding, Knut! Hope the tests prove that we can add this feature to our list! Nit: sqlgrammar.jj The argument: "onClause, // should be null" I think I'd prefer to do it the other way around, i.e. "null, // onClause not used for cross join" messages.xml: : <name>42Y14</name> <text>A join specification is not allowed with the ' {0} ' clause.</text> Why not say "Cross join" rather than "a join"? Do you foresee that the error message will be used for other kind of joins later?
          Hide
          Knut Anders Hatlen added a comment -

          Thanks Dag. I'll make the suggested change to sqlgrammar.jj. We have already checked that onClause and usingClause are null at that point, so it should be safe both ways.

          As to the parameterized message, I think join specifications (ON/USING) are not allowed for natural joins either, so the message may be reused if/when we implement natural join.

          Show
          Knut Anders Hatlen added a comment - Thanks Dag. I'll make the suggested change to sqlgrammar.jj. We have already checked that onClause and usingClause are null at that point, so it should be safe both ways. As to the parameterized message, I think join specifications (ON/USING) are not allowed for natural joins either, so the message may be reused if/when we implement natural join.
          Hide
          Knut Anders Hatlen added a comment -

          For the new message (42Y14) I basically followed the pattern for 42Y11, which is the message for missing join specification used by inner join and left/right outer join. One thing to notice with the existing message, is that without the patch, the join specification is required in the grammar, so skipping the ON clause in one of the currently supported join types will never cause an exception with the specialized error message (42Y11) to be thrown. Instead, a cryptic EOF error will be raised by the parser if the ON clause is missing. Now that the patch has made the join specification optional in the grammar, one unintended side effect is that the EOF error is not raised by the parser, and the clearer specialized message is shown instead.

          So currently on trunk, we have this:

          ij> select * from t1 join t2;
          ERROR 42X01: Syntax error: Encountered "<EOF>" at line 1, column 24.
          Issue the 'help' command for general information on IJ command syntax.
          Any unrecognized commands are treated as potential SQL commands and executed directly.
          Consult your DBMS server reference documentation for details of the SQL syntax supported by your server.

          With the patch, we get this:

          ij> select * from t1 join t2;
          ERROR 42Y11: A join specification is required with the 'INNER JOIN' clause.

          This will certainly cause failures in the regression tests because the canons expect the EOF message for negative join tests, but it looks like an improvement to me.

          Show
          Knut Anders Hatlen added a comment - For the new message (42Y14) I basically followed the pattern for 42Y11, which is the message for missing join specification used by inner join and left/right outer join. One thing to notice with the existing message, is that without the patch, the join specification is required in the grammar, so skipping the ON clause in one of the currently supported join types will never cause an exception with the specialized error message (42Y11) to be thrown. Instead, a cryptic EOF error will be raised by the parser if the ON clause is missing. Now that the patch has made the join specification optional in the grammar, one unintended side effect is that the EOF error is not raised by the parser, and the clearer specialized message is shown instead. So currently on trunk, we have this: ij> select * from t1 join t2; ERROR 42X01: Syntax error: Encountered "<EOF>" at line 1, column 24. Issue the 'help' command for general information on IJ command syntax. Any unrecognized commands are treated as potential SQL commands and executed directly. Consult your DBMS server reference documentation for details of the SQL syntax supported by your server. With the patch, we get this: ij> select * from t1 join t2; ERROR 42Y11: A join specification is required with the 'INNER JOIN' clause. This will certainly cause failures in the regression tests because the canons expect the EOF message for negative join tests, but it looks like an improvement to me.
          Hide
          Dag H. Wanvik added a comment -

          > This will certainly cause failures in the regression tests because
          > the canons expect the EOF message for negative join tests, but it
          > looks like an improvement to me.

          +1

          A newbie might wonder why the message says INNER when that keyword was
          not used, perhaps (your example). But that info is lost in the parser, so it's ok.

          Show
          Dag H. Wanvik added a comment - > This will certainly cause failures in the regression tests because > the canons expect the EOF message for negative join tests, but it > looks like an improvement to me. +1 A newbie might wonder why the message says INNER when that keyword was not used, perhaps (your example). But that info is lost in the parser, so it's ok.
          Hide
          Bernt M. Johnsen added a comment -

          Well, it is an INNER JOIN, although the INNER keyword is optional. Any user fluent enough in SQL to write an inner join without the optional keyword and with the following ON or USING clause should know that.

          Show
          Bernt M. Johnsen added a comment - Well, it is an INNER JOIN, although the INNER keyword is optional. Any user fluent enough in SQL to write an inner join without the optional keyword and with the following ON or USING clause should know that.
          Hide
          Knut Anders Hatlen added a comment -

          The patch has two possible compatibility issues that warrant a release note:

          1) It will make CROSS a reserved keyword, so existing (non-portable) code that uses CROSS as an identifier would need to quote it.

          2) Making the join specification optional changes the SQLState for some syntax errors.

          I've created a separate issue for (2), DERBY-4369, so that we get two separate release notes.

          Show
          Knut Anders Hatlen added a comment - The patch has two possible compatibility issues that warrant a release note: 1) It will make CROSS a reserved keyword, so existing (non-portable) code that uses CROSS as an identifier would need to quote it. 2) Making the join specification optional changes the SQLState for some syntax errors. I've created a separate issue for (2), DERBY-4369 , so that we get two separate release notes.
          Hide
          Knut Anders Hatlen added a comment -

          Attaching cross_v2.diff with updates to make it apply after DERBY-4369. It also adds a test cases to JoinTest. I'm assuming that the underlying join machinery is tested by the existing tests, so the new test cases focus on checking that the syntax is accepted. They do check that the results are correct, but they don't do all sorts of combinations with different data types, nullability, indexes and so on.

          One problem that I found, was that this query is not accepted (it's commented out in the test for now):

          select * from t1 a inner join t2 b cross join t2 c on 1=1

          This results in an SQLSyntaxErrorException: A join specification is not allowed with the 'CROSS JOIN' clause.

          It works if parentheses are added, like this:

          select * from t1 a inner join (t2 b cross join t2 c) on 1=1

          However, I do believe that it should be accepted without parentheses too. The problem is that the patch adds CROSS JOIN in the parser's qualifiedJoin rule, where an ON clause right after any kind of join will be consumed. The parser therefore incorrectly believes that the ON clause is part of the cross join and raises an error. Splitting out a separate rule for cross join (which is also how it is defined in the SQL standard) will prevent the ON clause from being seen when parsing the the cross join, and it will be seen as part of the inner join instead. I will post an updated patch with this issue fixed.

          Show
          Knut Anders Hatlen added a comment - Attaching cross_v2.diff with updates to make it apply after DERBY-4369 . It also adds a test cases to JoinTest. I'm assuming that the underlying join machinery is tested by the existing tests, so the new test cases focus on checking that the syntax is accepted. They do check that the results are correct, but they don't do all sorts of combinations with different data types, nullability, indexes and so on. One problem that I found, was that this query is not accepted (it's commented out in the test for now): select * from t1 a inner join t2 b cross join t2 c on 1=1 This results in an SQLSyntaxErrorException: A join specification is not allowed with the 'CROSS JOIN' clause. It works if parentheses are added, like this: select * from t1 a inner join (t2 b cross join t2 c) on 1=1 However, I do believe that it should be accepted without parentheses too. The problem is that the patch adds CROSS JOIN in the parser's qualifiedJoin rule, where an ON clause right after any kind of join will be consumed. The parser therefore incorrectly believes that the ON clause is part of the cross join and raises an error. Splitting out a separate rule for cross join (which is also how it is defined in the SQL standard) will prevent the ON clause from being seen when parsing the the cross join, and it will be seen as part of the inner join instead. I will post an updated patch with this issue fixed.
          Hide
          Knut Anders Hatlen added a comment -

          Another problem with the current patch:

          In the standard, the syntax for cross (and natural) join is defined with a <table reference> on the left side of the join operator and a <table factor> on the right side. This asymmetry is to get the correct join order without parentheses. The other joins are defined with a <table reference> on both sides (their join order will be unambiguous thanks to the ON/USING clauses). Since the current patch wired CROSS into the syntax rule for the other joins, it has <table reference> on both sides too.

          This leads to the following query

          select * from t1 cross join t2 right join t3 on x2=x3

          being parsed as if it said

          select * from t1 cross join (t2 right join t3 on x2=x3)

          whereas it should have been parsed as

          select * from (t1 cross join t2) right join t3 on x2=x3

          This makes the query return wrong results. Given the tables defined below:

          create table t1(x1 int);
          create table t2(x2 int);
          create table t3(x3 int);

          insert into t1 values (1);
          insert into t2 values (2);
          insert into t3 values (3);

          We see that the two supposedly equivalent queries return different results with the patch:

          ij> select * from t1 cross join t2 right join t3 on x2=x3;
          X1 |X2 |X3
          -----------------------------------
          1 |NULL |3

          1 row selected
          ij> select * from (t1 cross join t2) right join t3 on x2=x3;
          X1 |X2 |X3
          -----------------------------------
          NULL |NULL |3

          1 row selected

          I'll add this as a test case in the final patch.

          Show
          Knut Anders Hatlen added a comment - Another problem with the current patch: In the standard, the syntax for cross (and natural) join is defined with a <table reference> on the left side of the join operator and a <table factor> on the right side. This asymmetry is to get the correct join order without parentheses. The other joins are defined with a <table reference> on both sides (their join order will be unambiguous thanks to the ON/USING clauses). Since the current patch wired CROSS into the syntax rule for the other joins, it has <table reference> on both sides too. This leads to the following query select * from t1 cross join t2 right join t3 on x2=x3 being parsed as if it said select * from t1 cross join (t2 right join t3 on x2=x3) whereas it should have been parsed as select * from (t1 cross join t2) right join t3 on x2=x3 This makes the query return wrong results. Given the tables defined below: create table t1(x1 int); create table t2(x2 int); create table t3(x3 int); insert into t1 values (1); insert into t2 values (2); insert into t3 values (3); We see that the two supposedly equivalent queries return different results with the patch: ij> select * from t1 cross join t2 right join t3 on x2=x3; X1 |X2 |X3 ----------------------------------- 1 |NULL |3 1 row selected ij> select * from (t1 cross join t2) right join t3 on x2=x3; X1 |X2 |X3 ----------------------------------- NULL |NULL |3 1 row selected I'll add this as a test case in the final patch.
          Hide
          Knut Anders Hatlen added a comment -

          Here's an updated patch (v3) which addresses the issues mentioned in the previous comments.

          Changes from the previous revision of the patch:

          • The right side of a cross join is now a <table factor> and not a <table reference>. The current grammar doesn't separate between <table reference> and <table factor>, so the patch splits out some of the code from the current <table reference> rule into a new <table factor> rule. This split also made some duplicated code go away, so the patch actually removes more code from the parser than it adds.
          • Added test cases for the queries the previous patch didn't handle correctly.
          • Removed a test case from lang/db2Compatibility.sql which verified that a simple cross join failed with syntax error.

          All the regression tests ran cleanly. The patch is ready for review.

          Show
          Knut Anders Hatlen added a comment - Here's an updated patch (v3) which addresses the issues mentioned in the previous comments. Changes from the previous revision of the patch: The right side of a cross join is now a <table factor> and not a <table reference>. The current grammar doesn't separate between <table reference> and <table factor>, so the patch splits out some of the code from the current <table reference> rule into a new <table factor> rule. This split also made some duplicated code go away, so the patch actually removes more code from the parser than it adds. Added test cases for the queries the previous patch didn't handle correctly. Removed a test case from lang/db2Compatibility.sql which verified that a simple cross join failed with syntax error. All the regression tests ran cleanly. The patch is ready for review.
          Hide
          Bryan Pendleton added a comment -

          Seems like a nice feature addition, with a clean implementation, and lots of good new added tests.

          I couldn't think of any additional tests to add; thanks for taking the time to put such a
          thorough set of tests together!

          +1 to commit from me.

          Show
          Bryan Pendleton added a comment - Seems like a nice feature addition, with a clean implementation, and lots of good new added tests. I couldn't think of any additional tests to add; thanks for taking the time to put such a thorough set of tests together! +1 to commit from me.
          Hide
          Knut Anders Hatlen added a comment -

          Thanks for looking at the patch, Bryan!

          I've committed it to trunk with revision 813293, and I will post a suggestion for a release note shortly.

          Show
          Knut Anders Hatlen added a comment - Thanks for looking at the patch, Bryan! I've committed it to trunk with revision 813293, and I will post a suggestion for a release note shortly.
          Hide
          Knut Anders Hatlen added a comment -

          Attaching a release note about the new reserved keyword.

          Show
          Knut Anders Hatlen added a comment - Attaching a release note about the new reserved keyword.

            People

            • Assignee:
              Knut Anders Hatlen
              Reporter:
              Bernt M. Johnsen
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development