Details

    • Type: Bug Bug
    • Status: Open
    • Priority: Minor Minor
    • Resolution: Unresolved
    • Affects Version/s: 10.1.3.3, 10.2.2.0, 10.3.3.1, 10.4.2.1, 10.5.3.1, 10.6.1.0
    • Fix Version/s: None
    • Component/s: SQL
    • Urgency:
      Normal
    • Issue & fix info:
      Repro attached

      Description

      The following query shows the error using the Derby demo toursDB:
      SELECT FLIGHT_ID , SEGMENT_NUMBER , ARRIVE_TIME
      FROM (
      (SELECT HW.FLIGHT_ID , HW.SEGMENT_NUMBER , HW.ARRIVE_TIME FROM FLIGHTS HW)
      UNION
      (SELECT SW.FLIGHT_ID , SW.SEGMENT_NUMBER , SW.ARRIVE_TIME FROM FLIGHTS SW)
      ) SRC;
      ERROR 42X01: Syntax error: Encountered "UNION" at line 4, column 12.

      The following query works:
      SELECT FLIGHT_ID , SEGMENT_NUMBER , ARRIVE_TIME
      FROM (
      SELECT HW.FLIGHT_ID , HW.SEGMENT_NUMBER , HW.ARRIVE_TIME FROM FLIGHTS HW
      UNION
      SELECT SW.FLIGHT_ID , SW.SEGMENT_NUMBER , SW.ARRIVE_TIME FROM FLIGHTS SW
      ) SRC;

        Activity

        Hide
        Bryan Pendleton added a comment -

        Interestingly, the parentheses are OK if placed on the right side
        of the union sub-select, but not if placed on the left. That is, this query works:

        SELECT FLIGHT_ID , SEGMENT_NUMBER , ARRIVE_TIME FROM (
        SELECT HW.FLIGHT_ID ,HW.SEGMENT_NUMBER , HW.ARRIVE_TIME FROM FLIGHTS HW
        union
        (SELECT SW.FLIGHT_ID ,SW.SEGMENT_NUMBER , SW.ARRIVE_TIME FROM FLIGHTS SW)
        ) SRC;

        I wonder if the issue is related to this comment in sqlgrammar.jj:

        /*

        • <A NAME="queryExpression">queryExpression</A>
          *
        • We have to be carefull to get the associativity correct. According to the SQL
          spec
        • <non-join query expression> ::=
        • <non-join query term>
        • <query expression body> UNION [ ALL ] <query term>
        • <query expression body> EXCEPT [ ALL ] <query term>
        • Meaning that
        • t1 UNION ALL t2 UNION t3
        • is equivalent to
        • (t1 UNION ALL t2) UNION t3
        • However recursive descent parsers want recursion to be on the right, so this
          kind of associativity is unnatural
        • for our parser. The queryExpression method must know whether it is being call
          ed as the right hand side of a
        • set operator to produce a query tree with the correct associativity.
          */

        Here's the relevant snip from the SQL 92 BNF grammar.

        7.10 <query expression>

        Function

        Specify a table.

        Format

        <query expression> ::=
        <non-join query expression>

        <joined table>

        <non-join query expression> ::=
        <non-join query term>

        <query expression> UNION [ ALL ] [ <corresponding spec> ] <query term>
        <query expression> EXCEPT [ ALL ] [ <corresponding spec> ] <query term>

        <query term> ::=
        <non-join query term>

        <joined table>

        <non-join query term> ::=
        <non-join query primary>

        <query term> INTERSECT [ ALL ] [ <corresponding spec> ] <query primary>

        <query primary> ::=
        <non-join query primary>

        <joined table>

        <non-join query primary> ::=
        <simple table>

        <left paren> <non-join query expression> <right paren>

        <simple table> ::=
        <query specification>

        <table value constructor>
        <explicit table>

        That's as far as I went with this, as my skills with diagnosing
        generated parsers kind of hit a roadblock.

        Show
        Bryan Pendleton added a comment - Interestingly, the parentheses are OK if placed on the right side of the union sub-select, but not if placed on the left. That is, this query works: SELECT FLIGHT_ID , SEGMENT_NUMBER , ARRIVE_TIME FROM ( SELECT HW.FLIGHT_ID ,HW.SEGMENT_NUMBER , HW.ARRIVE_TIME FROM FLIGHTS HW union (SELECT SW.FLIGHT_ID ,SW.SEGMENT_NUMBER , SW.ARRIVE_TIME FROM FLIGHTS SW) ) SRC; I wonder if the issue is related to this comment in sqlgrammar.jj: /* <A NAME="queryExpression">queryExpression</A> * We have to be carefull to get the associativity correct. According to the SQL spec <non-join query expression> ::= <non-join query term> <query expression body> UNION [ ALL ] <query term> <query expression body> EXCEPT [ ALL ] <query term> Meaning that t1 UNION ALL t2 UNION t3 is equivalent to (t1 UNION ALL t2) UNION t3 However recursive descent parsers want recursion to be on the right, so this kind of associativity is unnatural for our parser. The queryExpression method must know whether it is being call ed as the right hand side of a set operator to produce a query tree with the correct associativity. */ Here's the relevant snip from the SQL 92 BNF grammar. 7.10 <query expression> Function Specify a table. Format <query expression> ::= <non-join query expression> <joined table> <non-join query expression> ::= <non-join query term> <query expression> UNION [ ALL ] [ <corresponding spec> ] <query term> <query expression> EXCEPT [ ALL ] [ <corresponding spec> ] <query term> <query term> ::= <non-join query term> <joined table> <non-join query term> ::= <non-join query primary> <query term> INTERSECT [ ALL ] [ <corresponding spec> ] <query primary> <query primary> ::= <non-join query primary> <joined table> <non-join query primary> ::= <simple table> <left paren> <non-join query expression> <right paren> <simple table> ::= <query specification> <table value constructor> <explicit table> That's as far as I went with this, as my skills with diagnosing generated parsers kind of hit a roadblock.
        Hide
        Rick Hillegas added a comment -

        Triaged for 10.5.2: noted that repro is available.

        Show
        Rick Hillegas added a comment - Triaged for 10.5.2: noted that repro is available.
        Hide
        Mamta A. Satoor added a comment -

        I tried the following simple script on trunk and got the error reported in this jira.
        connect 'jdbc:derby:c:/dellater/db;create=true';
        create table t1 (c1 int);
        create table t2 (c1 int);
        select c1 from ((select c1 from t1) union (select c1 from t2)) t1t2;

        BTW, the script above also fails on 10.1 so it seems like Derby never allowed this SQL-standard allowed syntax.

        I am just collecting some basic info on this and thought would share them.

        Show
        Mamta A. Satoor added a comment - I tried the following simple script on trunk and got the error reported in this jira. connect 'jdbc:derby:c:/dellater/db;create=true'; create table t1 (c1 int); create table t2 (c1 int); select c1 from ((select c1 from t1) union (select c1 from t2)) t1t2; BTW, the script above also fails on 10.1 so it seems like Derby never allowed this SQL-standard allowed syntax. I am just collecting some basic info on this and thought would share them.
        Hide
        Dag H. Wanvik added a comment -

        Just to make sure it is allowed, I did the derivation as an exercise

        Quote from std (2003):

        <query expression body> ::=
        <query term>

        <query expression body> UNION [ ALL DISTINCT ]
        [ <corresponding spec> ] <query term>
        <query expression body> EXCEPT [ ALL DISTINCT ]
        [ <corresponding spec> ] <query term>

        <query term> ::=
        <query primary>

        <query term> INTERSECT [ ALL DISTINCT ]
        [ <corresponding spec> ] <query primary>

        <query primary> ::=
        <simple table>

        <left paren> <query expression body> <right paren>

        <simple table> ::=
        <query specification>

        <table value constructor>
        <explicit table>

        <query specification> ::=
        SELECT [ <set quantifier> ] <select list> <table expression>

        Example:
        SELECT c1 FROM ((SELECT c1 FROM t1) UNION (SELECT c1 FROM t2)) t1t2;

        Derivation:

        <query expression body>
        <query expression body> UNION ...
        <query term> UNION ...
        <query primary> UNION ...
        <left paren> <query expression body> <right paren> UNION ...
        ( <query expression body> ) UNION ...
        ( <query term> ) UNION ...
        ( <query primary> ) UNION ...
        ( <simple table> ) UNION ...
        ( <query specification> ) UNION ...
        ( SELECT c1 FROM t1 ) UNION ...

        Similarly for the right hand side of the union.

        Show
        Dag H. Wanvik added a comment - Just to make sure it is allowed, I did the derivation as an exercise Quote from std (2003): <query expression body> ::= <query term> <query expression body> UNION [ ALL DISTINCT ] [ <corresponding spec> ] <query term> <query expression body> EXCEPT [ ALL DISTINCT ] [ <corresponding spec> ] <query term> <query term> ::= <query primary> <query term> INTERSECT [ ALL DISTINCT ] [ <corresponding spec> ] <query primary> <query primary> ::= <simple table> <left paren> <query expression body> <right paren> <simple table> ::= <query specification> <table value constructor> <explicit table> <query specification> ::= SELECT [ <set quantifier> ] <select list> <table expression> Example: SELECT c1 FROM ((SELECT c1 FROM t1) UNION (SELECT c1 FROM t2)) t1t2; Derivation: <query expression body> <query expression body> UNION ... <query term> UNION ... <query primary> UNION ... <left paren> <query expression body> <right paren> UNION ... ( <query expression body> ) UNION ... ( <query term> ) UNION ... ( <query primary> ) UNION ... ( <simple table> ) UNION ... ( <query specification> ) UNION ... ( SELECT c1 FROM t1 ) UNION ... Similarly for the right hand side of the union.
        Hide
        Knut Anders Hatlen added a comment -

        Also interesting is that INTERSECT does not have the same associativity as UNION and EXCEPT. I wonder if we have implemented that correctly?

        Show
        Knut Anders Hatlen added a comment - Also interesting is that INTERSECT does not have the same associativity as UNION and EXCEPT. I wonder if we have implemented that correctly?
        Hide
        Mamta A. Satoor added a comment -

        Dag, thanks for providing how the query in question is SQL-supported by walking through different derivatives. I had gone through the spec myself yesterday to make sure that what this jira wants is indeed SQL-compatible,

        Knut, it is interesting that INTERSECT does not work same as EXCEPT associativity wise. I thought the work to support both of them in Derby went in as one big checkin and so I assumed that parser behavior would be the same for both of them.

        Show
        Mamta A. Satoor added a comment - Dag, thanks for providing how the query in question is SQL-supported by walking through different derivatives. I had gone through the spec myself yesterday to make sure that what this jira wants is indeed SQL-compatible, Knut, it is interesting that INTERSECT does not work same as EXCEPT associativity wise. I thought the work to support both of them in Derby went in as one big checkin and so I assumed that parser behavior would be the same for both of them.
        Hide
        Mamta A. Satoor added a comment -

        Reread your comment Knut. And I think what you were saying is that SQL spec has different associativity for INTERSECT compared to EXCEPT and UNION and you are wondering if Derby's implementation is enforcing that difference in associativity.

        Show
        Mamta A. Satoor added a comment - Reread your comment Knut. And I think what you were saying is that SQL spec has different associativity for INTERSECT compared to EXCEPT and UNION and you are wondering if Derby's implementation is enforcing that difference in associativity.
        Hide
        Knut Anders Hatlen added a comment -

        Mamta: Yes, that's what I was trying to say. But looking at the grammar in the standard again, I think the associativity is the same (they are all left-associative), it's the operator precedence that's different. INTERSECT has higher precedence than UNION and EXCEPT. And it looks like Derby's grammar does have a separate rule for INTERSECT, so I guess it's probably fine.

        Show
        Knut Anders Hatlen added a comment - Mamta: Yes, that's what I was trying to say. But looking at the grammar in the standard again, I think the associativity is the same (they are all left-associative), it's the operator precedence that's different. INTERSECT has higher precedence than UNION and EXCEPT. And it looks like Derby's grammar does have a separate rule for INTERSECT, so I guess it's probably fine.
        Hide
        Mamta A. Satoor added a comment -

        I am looking at this more, but the root of the problem might be the following piece of code in sqlgrammar.jj in FromTable tableFactor() throws StandardException method

        // There is a grammar ambiguity with nested parentheses here.
        // A series of left parentheses could introduce either a table
        // reference or a derived table. For example:
        //
        // (((select c from t) a inner join (select d from s) b ))
        //
        // and:
        //
        // (((select c from t) a)))
        //
        // To distinguish these two cases, we consider anything that starts
        // with a single parenthesis and either SELECT or VALUES to be
        // a derived table, and anything else to be a table reference.
        // Note that we can't use the subqueryFollows() lookahead method,
        // because it skips over all leading left parentheses to decide
        // whether a subquery follows.
        LOOKAHEAD(

        { getToken(1).kind == LEFT_PAREN && ( getToken(2).kind == SELECT || getToken(2).kind == VALUES ) }

        )

        I will debug more but it looks like we are not differentiating correctly the first select inside the parenthese to mean <left paren> <query expression body> <right paren>
        select c1 from ((select t1.c1 from t1) union (select t2.c1 from t2)) t1t2;

        Show
        Mamta A. Satoor added a comment - I am looking at this more, but the root of the problem might be the following piece of code in sqlgrammar.jj in FromTable tableFactor() throws StandardException method // There is a grammar ambiguity with nested parentheses here. // A series of left parentheses could introduce either a table // reference or a derived table. For example: // // (((select c from t) a inner join (select d from s) b )) // // and: // // (((select c from t) a))) // // To distinguish these two cases, we consider anything that starts // with a single parenthesis and either SELECT or VALUES to be // a derived table, and anything else to be a table reference. // Note that we can't use the subqueryFollows() lookahead method, // because it skips over all leading left parentheses to decide // whether a subquery follows. LOOKAHEAD( { getToken(1).kind == LEFT_PAREN && ( getToken(2).kind == SELECT || getToken(2).kind == VALUES ) } ) I will debug more but it looks like we are not differentiating correctly the first select inside the parenthese to mean <left paren> <query expression body> <right paren> select c1 from ((select t1.c1 from t1) union (select t2.c1 from t2)) t1t2;
        Hide
        Dag H. Wanvik added a comment - - edited

        Could this is a heuristic made necessary by the LL nature of the
        parser (recursive descent)? I think the parser can't really know
        whether it's seeing seeing a "( <table reference ... " or a "( <query
        expression body.." without looking past the matching ")" +
        correlation and seeing a join operator or not.

        • If it does see a join operator there, it should descend into <table
          reference>)
        • If is does not see a join operator there, it should descend into
          <query expression body>, i.e. a subquery, which it failed to do here.

        I suspect this long back-ahead + paren matching may be beyond JavaCC..

        So, the authors of this code found a workable heuristic, which fails in
        this case Maybe there is a way to finesse it, though..

        Show
        Dag H. Wanvik added a comment - - edited Could this is a heuristic made necessary by the LL nature of the parser (recursive descent)? I think the parser can't really know whether it's seeing seeing a "( <table reference ... " or a "( <query expression body.." without looking past the matching ")" + correlation and seeing a join operator or not. If it does see a join operator there, it should descend into <table reference>) If is does not see a join operator there, it should descend into <query expression body>, i.e. a subquery, which it failed to do here. I suspect this long back-ahead + paren matching may be beyond JavaCC.. So, the authors of this code found a workable heuristic, which fails in this case Maybe there is a way to finesse it, though..
        Hide
        Mamta A. Satoor added a comment -

        Dag, your obesvation about the nature of the parser sound correct.

        I am spending some time going through the grammar rules of SQL-spec versus rules implemented by Derby to see where the discrepency happens. I hope to have that comparison flow chart by today. Would appreciate if you and others can look at my interpretation of Derby's implementation once I post it. Thanks

        Show
        Mamta A. Satoor added a comment - Dag, your obesvation about the nature of the parser sound correct. I am spending some time going through the grammar rules of SQL-spec versus rules implemented by Derby to see where the discrepency happens. I hope to have that comparison flow chart by today. Would appreciate if you and others can look at my interpretation of Derby's implementation once I post it. Thanks
        Hide
        Mamta A. Satoor added a comment -

        I have included selective part of the grammar from Derby and SQL standard to try to show how the implementation is different than the spec and probably the cause behind the behavior. The example query is as below
        select c1 from ((select c1 from t1) union (select c1 from t2)) t1t2;

        I will work with the following part of the query above since the handling of first 2 tokens "select c1" seems same for Derby and SQL standard
        from ((select c1 from t1) union (select c1 from t2)) t1t2;

        Derby implementation
        1)Start with
        from ((select c1 from t1) union (select c1 from t2)) t1t2;
        2)Following rule consumes token from and the remaining string is handled by <table reference>
        <from clause> ::= FROM <table reference>
        Before <table reference> takes over, the string looks as follows
        ((select c1 from t1) union (select c1 from t2)) t1t2;
        3)Following rules will be used next
        <table reference> ::= <table factor>
        <table factor> ::= <left paren> <table reference> <right paren>
        Before going into the rule about <table reference> and after <left paren> consumption, our string will look as follows
        (select c1 from t1) union (select c1 from t2)) t1t2;
        The processing done by <table reference> <right paren> is covered in 4)
        4)Following rules will be used next
        <table reference> ::= <table factor>
        <table factor> ::= <derived table>
        <derived table> ::= <left paren> <table subquery> <right paren>
        Before going into the rule about <table subquery> and after <left paren> consumption, our string will look as follows
        select c1 from t1) union (select c1 from t2)) t1t2;
        5)Next comes rules decided by <table subquery> <right paren>
        <table subquery> ::= <queryExpression>
        <queryExpression> ::= SELECT <column list> FROM <table reference> [UNION]******
        ******Very important to note that optional UNION token is checked by <queryExpression> and I think this is what is wrong. We either need some other place also to look for UNION or this [UNION] check should be moved out from here somewhere up. UNION or this [UNION] check should be moved out from here somewhere up. When <queryExpression> gets to optional UNION token check, our string will look as follows
        ) union (select c1 from t2)) t1t2;
        5)Since the next token is not UNION, rather it is ), we get out of <queryExpression>, <table sunqery> and come to the <right paren> part of <derived table> as shown below
        <derived table> ::= <left paren> <table subquery> <right paren>
        After the <right paren> processing, our string looks as follows
        union (select c1 from t2)) t1t2;
        Notice now that the next token is union and we are going up the stack of rules, out from <derived table>, <table factor>, <table reference> and onto <table factor> again
        <table factor> ::= <left paren> <table reference> <right paren>
        Here, we are looking for <right paren> but what we have in the string is UNION and this is when the parser throws exception for unexpected UNION as per it's rules

        SQL spec
        For the same query above, this is how SQL spec rules kick in
        1)Start with
        from ((select c1 from t1) union (select c1 from t2)) t1t2;
        <from clause> ::= <FROM> <table reference>
        After the <FROM> consumption, our string will look as follows
        ((select c1 from t1) union (select c1 from t2)) t1t2;
        2)Following 4 rules kick in
        <table reference> ::= <table factor>
        <table factor> ::= <derived table>
        <derived table> ::= <table subquery>
        <table subquery> ::= <left paren> <query expression> <right paren>
        At the end of <left paren> consumption, our original sql will now look like following
        (select c1 from t1) union (select c1 from t2)) t1t2;
        3)First the <query expression body> of the rule below will kick in
        <query expression> ::= <query expression body> UNION <query term>
        <query expression body> ::= <query term>
        <query term> ::= <left paren> <query expression body> <right paren>
        After <left paren> from <query term>, now we have
        select c1 from t1) union (select c1 from t2)) t1t2;
        4)Once the <query term> below is finished, we will have
        union (select c1 from t2)) t1t2;
        5)Now we move up the stack of the rules fired so far and when we get back to <query expression> ::= <query expression body> UNION <query term>, we are done with <query expression body> and we will consume UNION from original sql as follows
        (select c1 from t2)) t1t2;
        So, as we see from the SQL spec rules, it has proper rules in place to handle parentheses around the left term of UNION which Derby rules do not support at this point.

        Show
        Mamta A. Satoor added a comment - I have included selective part of the grammar from Derby and SQL standard to try to show how the implementation is different than the spec and probably the cause behind the behavior. The example query is as below select c1 from ((select c1 from t1) union (select c1 from t2)) t1t2; I will work with the following part of the query above since the handling of first 2 tokens "select c1" seems same for Derby and SQL standard from ((select c1 from t1) union (select c1 from t2)) t1t2; Derby implementation 1)Start with from ((select c1 from t1) union (select c1 from t2)) t1t2; 2)Following rule consumes token from and the remaining string is handled by <table reference> <from clause> ::= FROM <table reference> Before <table reference> takes over, the string looks as follows ((select c1 from t1) union (select c1 from t2)) t1t2; 3)Following rules will be used next <table reference> ::= <table factor> <table factor> ::= <left paren> <table reference> <right paren> Before going into the rule about <table reference> and after <left paren> consumption, our string will look as follows (select c1 from t1) union (select c1 from t2)) t1t2; The processing done by <table reference> <right paren> is covered in 4) 4)Following rules will be used next <table reference> ::= <table factor> <table factor> ::= <derived table> <derived table> ::= <left paren> <table subquery> <right paren> Before going into the rule about <table subquery> and after <left paren> consumption, our string will look as follows select c1 from t1) union (select c1 from t2)) t1t2; 5)Next comes rules decided by <table subquery> <right paren> <table subquery> ::= <queryExpression> <queryExpression> ::= SELECT <column list> FROM <table reference> [UNION] ****** ******Very important to note that optional UNION token is checked by <queryExpression> and I think this is what is wrong. We either need some other place also to look for UNION or this [UNION] check should be moved out from here somewhere up. UNION or this [UNION] check should be moved out from here somewhere up. When <queryExpression> gets to optional UNION token check, our string will look as follows ) union (select c1 from t2)) t1t2; 5)Since the next token is not UNION, rather it is ), we get out of <queryExpression>, <table sunqery> and come to the <right paren> part of <derived table> as shown below <derived table> ::= <left paren> <table subquery> <right paren> After the <right paren> processing, our string looks as follows union (select c1 from t2)) t1t2; Notice now that the next token is union and we are going up the stack of rules, out from <derived table>, <table factor>, <table reference> and onto <table factor> again <table factor> ::= <left paren> <table reference> <right paren> Here, we are looking for <right paren> but what we have in the string is UNION and this is when the parser throws exception for unexpected UNION as per it's rules SQL spec For the same query above, this is how SQL spec rules kick in 1)Start with from ((select c1 from t1) union (select c1 from t2)) t1t2; <from clause> ::= <FROM> <table reference> After the <FROM> consumption, our string will look as follows ((select c1 from t1) union (select c1 from t2)) t1t2; 2)Following 4 rules kick in <table reference> ::= <table factor> <table factor> ::= <derived table> <derived table> ::= <table subquery> <table subquery> ::= <left paren> <query expression> <right paren> At the end of <left paren> consumption, our original sql will now look like following (select c1 from t1) union (select c1 from t2)) t1t2; 3)First the <query expression body> of the rule below will kick in <query expression> ::= <query expression body> UNION <query term> <query expression body> ::= <query term> <query term> ::= <left paren> <query expression body> <right paren> After <left paren> from <query term>, now we have select c1 from t1) union (select c1 from t2)) t1t2; 4)Once the <query term> below is finished, we will have union (select c1 from t2)) t1t2; 5)Now we move up the stack of the rules fired so far and when we get back to <query expression> ::= <query expression body> UNION <query term>, we are done with <query expression body> and we will consume UNION from original sql as follows (select c1 from t2)) t1t2; So, as we see from the SQL spec rules, it has proper rules in place to handle parentheses around the left term of UNION which Derby rules do not support at this point.
        Hide
        Mamta A. Satoor added a comment -

        I am planning to pursue a possible fix by pushing down consumption <left paren> and <right paren> from <derived table> rule. Right now, the rule is
        <derived table> ::= <left paren. <table subquery> <right paren>
        And within <table subquery> grammar chain, we check for UNION clause which is little too early because <right paren> has not been consumed yet.

        I am working on it right now. Hopefully it will lead to a solution. If anyone sees anything wrong with this approach, please let me know.

        Show
        Mamta A. Satoor added a comment - I am planning to pursue a possible fix by pushing down consumption <left paren> and <right paren> from <derived table> rule. Right now, the rule is <derived table> ::= <left paren. <table subquery> <right paren> And within <table subquery> grammar chain, we check for UNION clause which is little too early because <right paren> has not been consumed yet. I am working on it right now. Hopefully it will lead to a solution. If anyone sees anything wrong with this approach, please let me know.
        Hide
        Mamta A. Satoor added a comment -

        Just wanted to post results of parser behavior when I push the consumption of ( further down in the derived table rules for the query below.
        select c1 from ((select c1 from t1) union (select c1 from t2)) t1t2;

        For the query above, Derby currently chooses one of the following 2 rules for the string after the first "from" clause
        1) <table reference> = <left paren> <table reference> <right paren>
        2) <table reference> = <derived table> [AS] <correlation name>
        The decision is based on the fact if there is a "select" clause following ( or not. If there is (Select combination, then Derby used the 2nd rule. Based on this, since right after the first "from" clause, we just have ( and not (select, we choose rule 1). Next, the tokens are (select and we choose rule 2). The <derived table> leads to consuming (select c1 from t1) and next we look for [AS] <correlation name> but what we have is UNION and hence parser error. I changed the rule for <derived table> such that if there is a UNION clause after the ), then we should continue handling UNION grammar. This was a progress because now <derived table> can consume (select c1 from t1) union (select c1 from t2). But after this, we look for [AS] <correlation name> again but what we get is the last ) for the query above. This ) would have been consumed by rule 1 which was applied first but we are never going to get to that point because we are still working on the <derived table> grammar and that grammar does not see [AS] <correlation name> and because of that parser throws an exception for missing correlation name even with my changes. I think what is happeneing is we are incorrectly identifying the queries around UNION clause to be not derived tables. Changing that might fix the problem but at this point, I am not sure how involved that is.

        Also, there is following piece of code in the parser for tableFactor
        // There is a grammar ambiguity with nested parentheses here.
        // A series of left parentheses could introduce either a table
        // reference or a derived table. For example:
        //
        // (((select c from t) a inner join (select d from s) b ))
        //
        // and:
        //
        // (((select c from t) a)))
        //
        // To distinguish these two cases, we consider anything that starts
        // with a single parenthesis and either SELECT or VALUES to be
        // a derived table, and anything else to be a table reference.
        // Note that we can't use the subqueryFollows() lookahead method,
        // because it skips over all leading left parentheses to decide
        // whether a subquery follows.
        LOOKAHEAD(

        { getToken(1).kind == LEFT_PAREN && ( getToken(2).kind == SELECT || getToken(2).kind == VALUES ) }

        )
        derivedTable = derivedTable() [ <AS> ] correlationName = identifier(Limits.MAX_IDENTIFIER_LENGTH, true)
        [ <LEFT_PAREN> derivedRCL = derivedColumnList() <RIGHT_PAREN> ]
        [ optionalTableClauses = optionalTableProperties() ]
        I tried writing a query using inner join to see how this code path is used but couldn't quite come up with a query satisfying the comment
        // (((select c from t) a inner join (select d from s) b ))
        It will be interesting to see what query are we trying to solve by this special code as per the comments.

        Show
        Mamta A. Satoor added a comment - Just wanted to post results of parser behavior when I push the consumption of ( further down in the derived table rules for the query below. select c1 from ((select c1 from t1) union (select c1 from t2)) t1t2; For the query above, Derby currently chooses one of the following 2 rules for the string after the first "from" clause 1) <table reference> = <left paren> <table reference> <right paren> 2) <table reference> = <derived table> [AS] <correlation name> The decision is based on the fact if there is a "select" clause following ( or not. If there is (Select combination, then Derby used the 2nd rule. Based on this, since right after the first "from" clause, we just have ( and not (select, we choose rule 1). Next, the tokens are (select and we choose rule 2). The <derived table> leads to consuming (select c1 from t1) and next we look for [AS] <correlation name> but what we have is UNION and hence parser error. I changed the rule for <derived table> such that if there is a UNION clause after the ), then we should continue handling UNION grammar. This was a progress because now <derived table> can consume (select c1 from t1) union (select c1 from t2). But after this, we look for [AS] <correlation name> again but what we get is the last ) for the query above. This ) would have been consumed by rule 1 which was applied first but we are never going to get to that point because we are still working on the <derived table> grammar and that grammar does not see [AS] <correlation name> and because of that parser throws an exception for missing correlation name even with my changes. I think what is happeneing is we are incorrectly identifying the queries around UNION clause to be not derived tables. Changing that might fix the problem but at this point, I am not sure how involved that is. Also, there is following piece of code in the parser for tableFactor // There is a grammar ambiguity with nested parentheses here. // A series of left parentheses could introduce either a table // reference or a derived table. For example: // // (((select c from t) a inner join (select d from s) b )) // // and: // // (((select c from t) a))) // // To distinguish these two cases, we consider anything that starts // with a single parenthesis and either SELECT or VALUES to be // a derived table, and anything else to be a table reference. // Note that we can't use the subqueryFollows() lookahead method, // because it skips over all leading left parentheses to decide // whether a subquery follows. LOOKAHEAD( { getToken(1).kind == LEFT_PAREN && ( getToken(2).kind == SELECT || getToken(2).kind == VALUES ) } ) derivedTable = derivedTable() [ <AS> ] correlationName = identifier(Limits.MAX_IDENTIFIER_LENGTH, true) [ <LEFT_PAREN> derivedRCL = derivedColumnList() <RIGHT_PAREN> ] [ optionalTableClauses = optionalTableProperties() ] I tried writing a query using inner join to see how this code path is used but couldn't quite come up with a query satisfying the comment // (((select c from t) a inner join (select d from s) b )) It will be interesting to see what query are we trying to solve by this special code as per the comments.
        Hide
        Mamta A. Satoor added a comment -

        Attached patch DERBY2374_UnionWithParentheseOnLeftRS_diff_patch1.txt is NOT for commit. I am not planning on working on this jira at this point but wanted to go ahead and post the patch for what I had tried(as mentioned in the previous comment. I was trying to postpone the ( consumption later on in the grammar of <derived table> but that has not fixed the problem. We may need to change the grammar even before we decide to use <dervied table>. At least that is what comparison of Derby grammar with SQL specification grammar seem to indicate.)

        Show
        Mamta A. Satoor added a comment - Attached patch DERBY2374_UnionWithParentheseOnLeftRS_diff_patch1.txt is NOT for commit. I am not planning on working on this jira at this point but wanted to go ahead and post the patch for what I had tried(as mentioned in the previous comment. I was trying to postpone the ( consumption later on in the grammar of <derived table> but that has not fixed the problem. We may need to change the grammar even before we decide to use <dervied table>. At least that is what comparison of Derby grammar with SQL specification grammar seem to indicate.)
        Hide
        Bryan Pendleton added a comment -

        Thanks for doing the research Mamta, and for posting the information. It was very interesting to read.

        I think we should (incrementally) work to make our grammar match the SQL standard grammar.

        Show
        Bryan Pendleton added a comment - Thanks for doing the research Mamta, and for posting the information. It was very interesting to read. I think we should (incrementally) work to make our grammar match the SQL standard grammar.
        Hide
        Lukas Eder added a comment - - edited

        Thanks for all the detailed analyses. This would be a nice to have fix. In the mean-time, I'll try to create workarounds for that.
        Please note that it also affects version 10.7.1.1

        Show
        Lukas Eder added a comment - - edited Thanks for all the detailed analyses. This would be a nice to have fix. In the mean-time, I'll try to create workarounds for that. Please note that it also affects version 10.7.1.1

          People

          • Assignee:
            Unassigned
            Reporter:
            Kenneth Gee
          • Votes:
            2 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:

              Development