Derby
  1. Derby
  2. DERBY-4079

Add support for SQL:2008 <result offset clause> and <fetch first clause> to limit result set cardinality

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 10.5.1.1
    • Component/s: SQL
    • Labels:
      None

      Description

      SQL 2008 has added new syntax to support a direct way to limit the
      returned set of rows in a result set. This allows an application to
      retrieve only some rows of an otherwise larger result set, similar to
      the popular LIMIT clauses use in some databases.

      Up till now, in Derby (and SQL) we have had to use the ROW_NUMBER()
      function in a nested subquery to achieve the effect of the <fetch
      first clause>, cf. DERBY-2998, a method which is rather more indirect
      and still not efficient (DERBY-3505), and primarily intended for OLAP
      functionality, perhaps.

      There has been no direct way to achieve the effect of the <result
      offset clause> via SQL.

      Syntax (cf. SQL 2008, section 7.13):
      <result offset clause> ::= OFFSET <n>

      {ROW | ROWS}
      <fetch first clause> ::= FETCH {FIRST | NEXT} [<n>] {ROW | ROWS}

      ONLY

      where <n> is an integer. The two clauses syntactically follow the ORDER BY
      clause in the grammar.

      Note that both ORDER BY and the new clauses above are allowed also in
      subqueries in the new version of the SQL standard (section 7.13). I
      only propose to include this at the top level in DERBY for now. (ORDER
      BY is presently also not allowed in subqueries in Derby since SQL
      didn't allow for this until SQL 2008 either).

      1. rrefsqljoffsetfetch.html
        4 kB
        Kim Haase
      2. ref.zip
        5 kB
        Dag H. Wanvik
      3. derby-4079-fixRtStatsTest.stat
        0.1 kB
        Dag H. Wanvik
      4. derby-4079-fixRtStatsTest.diff
        1.0 kB
        Dag H. Wanvik
      5. derby-4079-docs-2.zip
        6 kB
        Dag H. Wanvik
      6. derby-4079-docs-2.stat
        0.1 kB
        Dag H. Wanvik
      7. DERBY-4079-docs-2.diff
        0.6 kB
        Kim Haase
      8. derby-4079-docs-2.diff
        6 kB
        Dag H. Wanvik
      9. derby-4079-docs-1.zip
        6 kB
        Dag H. Wanvik
      10. derby-4079-docs-1.stat
        0.1 kB
        Dag H. Wanvik
      11. derby-4079-docs-1.diff
        6 kB
        Dag H. Wanvik
      12. derby-4079-3.stat
        1 kB
        Dag H. Wanvik
      13. derby-4079-3.diff
        64 kB
        Dag H. Wanvik
      14. derby-4079-2.stat
        1 kB
        Dag H. Wanvik
      15. derby-4079-2.diff
        80 kB
        Dag H. Wanvik
      16. derby-4079-1.stat
        1 kB
        Dag H. Wanvik
      17. derby-4079-1.diff
        60 kB
        Dag H. Wanvik

        Issue Links

          Activity

          Dag H. Wanvik created issue -
          Dag H. Wanvik made changes -
          Field Original Value New Value
          Link This issue is related to DERBY-2998 [ DERBY-2998 ]
          Dag H. Wanvik made changes -
          Link This issue is related to DERBY-581 [ DERBY-581 ]
          Dag H. Wanvik made changes -
          Assignee Dag H. Wanvik [ dagw ]
          Hide
          Dag H. Wanvik added a comment -

          Uploading ref.zip, which contains HTML version of possible docs for this feature for refman.

          Show
          Dag H. Wanvik added a comment - Uploading ref.zip, which contains HTML version of possible docs for this feature for refman.
          Dag H. Wanvik made changes -
          Attachment ref.zip [ 12401282 ]
          Dag H. Wanvik made changes -
          Description SQL 2008 has added new syntax to support a direct way to limit the
          returned set of rows in a result set. This allows an application to
          retrieve only some rows of an otherwise larger result set, similar to
          the popular LIMIT clauses use in some databases.

          Up till now, in Derby (and SQL) we have had to use the ROW_NUMBER()
          function in a nested subquery to achieve the effect of the <fetch
          first clause>, cf. DERBY-2998, a method which is rather more indirect
          and still not efficient (DERBY-3505), and primarily intended for OLAP
          functionality, perhaps.

          There has been no direct way to achieve the effect of the <result
          offset clause> via SQL.

          Syntax (cf. SQL 2008, section 7.13):
                 <result offset clause> ::= OFFSET <n> {ROW | ROWS}
                 <fetch first clause> ::= FETCH {FIRST | NEXT} [<n>] {ROW | ROWS} ONLY

          where <n> is an integer. It syntactically follows the ORDER BY
          clause.

          Note that both ORDER BY and the new clauses above are allowed also in
          subqueries in the new version of the SQL standard (section 7.13). I
          only propose to include this at the top level in DERBY for now. (ORDER
          BY is presently also not allowed in subqueries in Derby since SQL
          didn't allow for this until SQL 2008 either).


          SQL 2008 has added new syntax to support a direct way to limit the
          returned set of rows in a result set. This allows an application to
          retrieve only some rows of an otherwise larger result set, similar to
          the popular LIMIT clauses use in some databases.

          Up till now, in Derby (and SQL) we have had to use the ROW_NUMBER()
          function in a nested subquery to achieve the effect of the <fetch
          first clause>, cf. DERBY-2998, a method which is rather more indirect
          and still not efficient (DERBY-3505), and primarily intended for OLAP
          functionality, perhaps.

          There has been no direct way to achieve the effect of the <result
          offset clause> via SQL.

          Syntax (cf. SQL 2008, section 7.13):
                 <result offset clause> ::= OFFSET <n> {ROW | ROWS}
                 <fetch first clause> ::= FETCH {FIRST | NEXT} [<n>] {ROW | ROWS} ONLY

          where <n> is an integer. The two clauses syntactically follow the ORDER BY
          clause in the grammar.

          Note that both ORDER BY and the new clauses above are allowed also in
          subqueries in the new version of the SQL standard (section 7.13). I
          only propose to include this at the top level in DERBY for now. (ORDER
          BY is presently also not allowed in subqueries in Derby since SQL
          didn't allow for this until SQL 2008 either).


          Hide
          Knut Anders Hatlen added a comment -

          I don't have a copy of SQL 2008 yet, so I can't comment on the syntax, but the docs look good to me. I did however struggle with understanding this sentence:

          > If the underlying result set is not materialized (depends on query), the use of this clause can give efficiency benefits.

          What I don't quite understand, is the implication that we won't see efficiency benefits if the underlying result set is materialized, but I'm probably just misunderstanding what you are comparing (or perhaps misunderstanding how you're planning to implement it).

          If you are comparing the time it takes to execute SELECT * FROM LARGE_TABLE and SELECT * FROM LARGE_TABLE FETCH FIRST 10 ROWS ONLY and fetch the full result, I understand that it would be faster with the FETCH clause than without if the underlying result set is not materialized, because you just stop after fetching the first 10 rows whereas the other query goes through the entire table. If the underlying result set is materialized, I understand that it's probably not going to speed it up significantly in embedded mode, since the same number of rows must be processed in both queries. In a client/server environment it reduces the number of rows to send over the wire, so there we should see an efficiency benefit even if the result set is materialized, I believe.

          If you are comparing the time it takes to read out the columns you're interested in (just quit calling rs.next() once you've fetched N rows), I don't see that it should make much difference at all whether or not OFFSET/FETCH is used (assuming that the implementation will basically do the same thing, just on a slightly lower level). Again, in a client/server environment you may see higher performance with OFFSET/FETCH because the skipped rows don't need to go over the wire, and because you limit the number of rows that will be pre-fetched. But I don't see how this benefit will be affected by whether or not the underlying result set is materialized.

          Perhaps it suffices to say "The use of this clause can give efficiency benefits for some queries" and not qualify it further?

          Show
          Knut Anders Hatlen added a comment - I don't have a copy of SQL 2008 yet, so I can't comment on the syntax, but the docs look good to me. I did however struggle with understanding this sentence: > If the underlying result set is not materialized (depends on query), the use of this clause can give efficiency benefits. What I don't quite understand, is the implication that we won't see efficiency benefits if the underlying result set is materialized, but I'm probably just misunderstanding what you are comparing (or perhaps misunderstanding how you're planning to implement it). If you are comparing the time it takes to execute SELECT * FROM LARGE_TABLE and SELECT * FROM LARGE_TABLE FETCH FIRST 10 ROWS ONLY and fetch the full result, I understand that it would be faster with the FETCH clause than without if the underlying result set is not materialized, because you just stop after fetching the first 10 rows whereas the other query goes through the entire table. If the underlying result set is materialized, I understand that it's probably not going to speed it up significantly in embedded mode, since the same number of rows must be processed in both queries. In a client/server environment it reduces the number of rows to send over the wire, so there we should see an efficiency benefit even if the result set is materialized, I believe. If you are comparing the time it takes to read out the columns you're interested in (just quit calling rs.next() once you've fetched N rows), I don't see that it should make much difference at all whether or not OFFSET/FETCH is used (assuming that the implementation will basically do the same thing, just on a slightly lower level). Again, in a client/server environment you may see higher performance with OFFSET/FETCH because the skipped rows don't need to go over the wire, and because you limit the number of rows that will be pre-fetched. But I don't see how this benefit will be affected by whether or not the underlying result set is materialized. Perhaps it suffices to say "The use of this clause can give efficiency benefits for some queries" and not qualify it further?
          Hide
          Dag H. Wanvik added a comment -

          Thanks for looking at this, Knut. You are right, there as many aspects
          of the efficiency calculations here, so it is probably better to avoid
          getting into details here (some stuff could go into the tuning manual
          perhaps). As for the materialization point; i was thinking of the case
          where the app fetches the entire result set. In the client/server
          case, the benefits from avoiding shipping all rows over the wire could
          in deed be important, too.

          I will post a preliminary patch in day day or two; I have one that
          seems to work in my sandbox, but I need to clean it up a little first.

          Show
          Dag H. Wanvik added a comment - Thanks for looking at this, Knut. You are right, there as many aspects of the efficiency calculations here, so it is probably better to avoid getting into details here (some stuff could go into the tuning manual perhaps). As for the materialization point; i was thinking of the case where the app fetches the entire result set. In the client/server case, the benefits from avoiding shipping all rows over the wire could in deed be important, too. I will post a preliminary patch in day day or two; I have one that seems to work in my sandbox, but I need to clean it up a little first.
          Hide
          Dag H. Wanvik added a comment -

          As noted in DERBY-581, new optimization opportunities present themselves to the
          compiler when information on the result set cardinality is know at compilation time.
          This may give further efficiency benefits for some queries.

          Show
          Dag H. Wanvik added a comment - As noted in DERBY-581 , new optimization opportunities present themselves to the compiler when information on the result set cardinality is know at compilation time. This may give further efficiency benefits for some queries.
          Hide
          Kim Haase added a comment -

          The draft docs look very good. A few questions and minor language corrections –

          SELECT statement (rrefsqlj41360.html):

          Should you add a mention of the two new clauses to the first sentence? It still mentions only ORDER BY and FOR UPDATE.

          Also, I am wondering if the syntax should put square brackets around

          WITH

          {RR|RS|CS|UR}

          since this part is also optional? (There should in theory be a separate issue filed for this, but it might never get fixed ...)

          In the sentence about the two new clauses, there should be no "s" after the second one. That is, it should say "The result offset clause and the fetch first clause can be used ..."

          The sentence "The SELECT statement supports the FOR FETCH ONLY clause." is confusing to me because the syntax in the rrefsqljoffsetfetch.html topic doesn't include FOR. Should it?

          rrefsqljoffsetfetch.html:

          Title should be "The result offset and fetch first clauses".

          In the 3rd sentence, change "usually combined also with" to "usually in combination with".

          I don't think you need quite so much vertical space around the clauses in the Syntax section.

          Change

          "The use of ROW or ROWS and FIRST or NEXT are equivalent."

          to

          "The use of ROW or ROWS and the use of FIRST or NEXT are equivalent."

          I expect you'll be adding some examples to either this topic or the SELECT statement topic or both?

          If you decide to add anything on this subject to the tuning guide topic on materialization, you could presumably reference it here.

          Show
          Kim Haase added a comment - The draft docs look very good. A few questions and minor language corrections – SELECT statement (rrefsqlj41360.html): Should you add a mention of the two new clauses to the first sentence? It still mentions only ORDER BY and FOR UPDATE. Also, I am wondering if the syntax should put square brackets around WITH {RR|RS|CS|UR} since this part is also optional? (There should in theory be a separate issue filed for this, but it might never get fixed ...) In the sentence about the two new clauses, there should be no "s" after the second one. That is, it should say "The result offset clause and the fetch first clause can be used ..." The sentence "The SELECT statement supports the FOR FETCH ONLY clause." is confusing to me because the syntax in the rrefsqljoffsetfetch.html topic doesn't include FOR. Should it? rrefsqljoffsetfetch.html: Title should be "The result offset and fetch first clauses". In the 3rd sentence, change "usually combined also with" to "usually in combination with". I don't think you need quite so much vertical space around the clauses in the Syntax section. Change "The use of ROW or ROWS and FIRST or NEXT are equivalent." to "The use of ROW or ROWS and the use of FIRST or NEXT are equivalent." I expect you'll be adding some examples to either this topic or the SELECT statement topic or both? If you decide to add anything on this subject to the tuning guide topic on materialization, you could presumably reference it here.
          Hide
          Dag H. Wanvik added a comment -

          Uploading a patch for this functionality (only for comments at this
          point). It does not yet contain a new JUnit test for the feature,
          I'll add that in the next spin.

          I have tested the patch ad hoc, and it seems for work for what I have
          thrown at it so far.

          Since this feature cuts across a lot of underlying result sets and
          modes (read-only, scrollable, updatable, subqueries etc), it is hard
          to write an exhaustive test. To test whether the presence of an
          additional result set on top of other result sets breaks anything, I
          have done the following experiment: I modified the parser so as to
          always stick in an OFFSET 0 ROWS clause for a top level SELECT. This
          should have no impact on the result set returned. With that
          modification, I ran the regressions. I uncovered some small issues
          with updatable result sets in this way. Presently, the only tests
          which fail are canon based tests which compare the execution plan with
          a canon (a diff is to be expected here). For a couple of those I
          compared the plans manually to verify that the only difference was the
          presence of a wrapper result set.

          Patch details:

          M java/engine/org/apache/derby/impl/sql/compile/sqlgrammar.jj

          Add the new syntax and stick offset/fetch first values into CursorNode.

          M java/engine/org/apache/derby/impl/sql/compile/NodeFactoryImpl.java
          M java/engine/org/apache/derby/impl/sql/compile/C_NodeNames.java
          M java/engine/org/apache/derby/iapi/sql/compile/C_NodeTypes.java

          Accommodate the new node type RowCountNode.

          A java/engine/org/apache/derby/impl/sql/compile/RowCountNode.java

          This node is inserted at optimize time, see CursorNode,
          DMLStatementNode changes. At generation time it will insert a
          RowCountResultSet over the top SELECTs result set, but underneath any
          ScrollInsensitiveResultSet.

          M java/engine/org/apache/derby/impl/sql/compile/ResultColumn.java

          Adds copying of table name and schema name to the cloneMe method. his
          was needed to handle updatable result sets which expect to see the
          table name in the top result columns. Is this change safe?

          M java/engine/org/apache/derby/impl/sql/compile/CursorNode.java
          M java/engine/org/apache/derby/impl/sql/compile/DMLStatementNode.java

          Bind offset/fetch first values (type check, range check) and add a
          RowCountNode to the abstract syntax tree at optimize time. I found
          that inserting this node prior to optimizing the underlying query did
          not work, notably GROUP BY failed.

          M java/engine/org/apache/derby/impl/sql/compile/ValueNode.java

          Added a method to check if a value is an integer or a bigint, needed
          at bind time of the row counts.

          A java/engine/org/apache/derby/impl/sql/execute/RowCountResultSet.java

          This new result set implements the filtering required by this
          functionality. It mostly forwards to it child result set otherwise.

          A java/engine/org/apache/derby/impl/sql/execute/rts/RealRowCountStatistics.java
          M java/engine/org/apache/derby/impl/sql/execute/RealResultSetStatisticsFactory.java

          Adds run time statistics handling for the new result set.

          M java/engine/org/apache/derby/impl/sql/execute/NoPutResultSetImpl.java

          Relaxed a "final" to be able to override clearCurrentRow in
          RowCountResultSet. RowCountResultSet#clearCurrentRow actually calls
          clearCurrentRow of its source result set, since
          RowCountResultSet#getCurrentRow also asks its source result set for
          the current row. Forwarding these to the child, may not be entirely
          kosher, but I had to do it this way to make it work with the updatable
          cursors. Suggestions of better ways to handle this are welcome.

          M java/engine/org/apache/derby/impl/sql/execute/ScrollInsensitiveResultSet.java

          updateRow will look at the result set for a top ProjectRestrictNode
          and use it for projection. I added code here to get at the
          ProjectRestrictNode from underneath a RowCountResultSet.

          M java/engine/org/apache/derby/impl/sql/execute/GenericResultSetFactory.java
          M java/engine/org/apache/derby/iapi/sql/execute/ResultSetFactory.java

          Accommodate the new result set type.

          M java/engine/org/apache/derby/iapi/types/TypeId.java

          Made BIGINT_ID public, needed by the new method in ValueNode (checkIsInteger).

          M java/engine/org/apache/derby/loc/messages.xml
          M java/shared/org/apache/derby/shared/common/reference/SQLState.java

          New error messages/execution plan i18n strings and English text for
          them.

          M tools/jar/DBMSnodes.properties

          Show
          Dag H. Wanvik added a comment - Uploading a patch for this functionality (only for comments at this point). It does not yet contain a new JUnit test for the feature, I'll add that in the next spin. I have tested the patch ad hoc, and it seems for work for what I have thrown at it so far. Since this feature cuts across a lot of underlying result sets and modes (read-only, scrollable, updatable, subqueries etc), it is hard to write an exhaustive test. To test whether the presence of an additional result set on top of other result sets breaks anything, I have done the following experiment: I modified the parser so as to always stick in an OFFSET 0 ROWS clause for a top level SELECT. This should have no impact on the result set returned. With that modification, I ran the regressions. I uncovered some small issues with updatable result sets in this way. Presently, the only tests which fail are canon based tests which compare the execution plan with a canon (a diff is to be expected here). For a couple of those I compared the plans manually to verify that the only difference was the presence of a wrapper result set. Patch details: M java/engine/org/apache/derby/impl/sql/compile/sqlgrammar.jj Add the new syntax and stick offset/fetch first values into CursorNode. M java/engine/org/apache/derby/impl/sql/compile/NodeFactoryImpl.java M java/engine/org/apache/derby/impl/sql/compile/C_NodeNames.java M java/engine/org/apache/derby/iapi/sql/compile/C_NodeTypes.java Accommodate the new node type RowCountNode. A java/engine/org/apache/derby/impl/sql/compile/RowCountNode.java This node is inserted at optimize time, see CursorNode, DMLStatementNode changes. At generation time it will insert a RowCountResultSet over the top SELECTs result set, but underneath any ScrollInsensitiveResultSet. M java/engine/org/apache/derby/impl/sql/compile/ResultColumn.java Adds copying of table name and schema name to the cloneMe method. his was needed to handle updatable result sets which expect to see the table name in the top result columns. Is this change safe? M java/engine/org/apache/derby/impl/sql/compile/CursorNode.java M java/engine/org/apache/derby/impl/sql/compile/DMLStatementNode.java Bind offset/fetch first values (type check, range check) and add a RowCountNode to the abstract syntax tree at optimize time. I found that inserting this node prior to optimizing the underlying query did not work, notably GROUP BY failed. M java/engine/org/apache/derby/impl/sql/compile/ValueNode.java Added a method to check if a value is an integer or a bigint, needed at bind time of the row counts. A java/engine/org/apache/derby/impl/sql/execute/RowCountResultSet.java This new result set implements the filtering required by this functionality. It mostly forwards to it child result set otherwise. A java/engine/org/apache/derby/impl/sql/execute/rts/RealRowCountStatistics.java M java/engine/org/apache/derby/impl/sql/execute/RealResultSetStatisticsFactory.java Adds run time statistics handling for the new result set. M java/engine/org/apache/derby/impl/sql/execute/NoPutResultSetImpl.java Relaxed a "final" to be able to override clearCurrentRow in RowCountResultSet. RowCountResultSet#clearCurrentRow actually calls clearCurrentRow of its source result set, since RowCountResultSet#getCurrentRow also asks its source result set for the current row. Forwarding these to the child, may not be entirely kosher, but I had to do it this way to make it work with the updatable cursors. Suggestions of better ways to handle this are welcome. M java/engine/org/apache/derby/impl/sql/execute/ScrollInsensitiveResultSet.java updateRow will look at the result set for a top ProjectRestrictNode and use it for projection. I added code here to get at the ProjectRestrictNode from underneath a RowCountResultSet. M java/engine/org/apache/derby/impl/sql/execute/GenericResultSetFactory.java M java/engine/org/apache/derby/iapi/sql/execute/ResultSetFactory.java Accommodate the new result set type. M java/engine/org/apache/derby/iapi/types/TypeId.java Made BIGINT_ID public, needed by the new method in ValueNode (checkIsInteger). M java/engine/org/apache/derby/loc/messages.xml M java/shared/org/apache/derby/shared/common/reference/SQLState.java New error messages/execution plan i18n strings and English text for them. M tools/jar/DBMSnodes.properties
          Dag H. Wanvik made changes -
          Attachment derby-4079-1.diff [ 12401349 ]
          Attachment derby-4079-1.stat [ 12401350 ]
          Hide
          Dag H. Wanvik added a comment -

          Thanks for looking at the docs, Kim!

          To answer your question, the FOR FETCH ONLY in an other FETCH entirely, so that's why it is not mentioned.

          Uploading a patch, derby-4079-docs-1 and an accompanying zip file with the htmls, which fixes
          your comments, hopefully.

          Show
          Dag H. Wanvik added a comment - Thanks for looking at the docs, Kim! To answer your question, the FOR FETCH ONLY in an other FETCH entirely, so that's why it is not mentioned. Uploading a patch, derby-4079-docs-1 and an accompanying zip file with the htmls, which fixes your comments, hopefully.
          Dag H. Wanvik made changes -
          Attachment derby-4079-docs-1.zip [ 12401355 ]
          Attachment derby-4079-docs-1.diff [ 12401353 ]
          Attachment derby-4079-docs-1.stat [ 12401354 ]
          Dag H. Wanvik made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          Dag H. Wanvik made changes -
          Fix Version/s 10.5.0.0 [ 12313010 ]
          Derby Info [Patch Available]
          Hide
          Dag H. Wanvik added a comment -

          Marking "Existing Application Impact" since new reserved keywords are added by the patch.

          Show
          Dag H. Wanvik added a comment - Marking "Existing Application Impact" since new reserved keywords are added by the patch.
          Dag H. Wanvik made changes -
          Derby Info [Patch Available] [Patch Available, Existing Application Impact]
          Dag H. Wanvik made changes -
          Attachment derby-4079-docs-1.diff [ 12401353 ]
          Dag H. Wanvik made changes -
          Attachment derby-4079-docs-1.zip [ 12401355 ]
          Dag H. Wanvik made changes -
          Attachment derby-4079-docs-1.stat [ 12401354 ]
          Dag H. Wanvik made changes -
          Attachment derby-4079-docs-1.diff [ 12401358 ]
          Attachment derby-4079-docs-1.zip [ 12401360 ]
          Attachment derby-4079-docs-1.stat [ 12401359 ]
          Hide
          Kim Haase added a comment -

          Thanks, Dag, for the doc updates. The SELECT statement topic fixes look great.

          rrefsqljoffsetfetch.html looks great too, though it could use a couple of changes still:

          I think I messed up on the first sentence after the syntax statement. It now appears to say that the use of ROW or ROWS is equivalent to the use of FIRST or NEXT ... not true at all. Perhaps the following (taking a tip from the sentence about FOR FETCH ONLY and FOR READ ONLY in the SELECT statement topic):

          ROW is synonymous with ROWS, and FIRST is synonymous with NEXT.

          Also, the second sentence in the topic has an unitalicized "result offset clause" – all other occurrences are italicized. (You could also remove the italics from all but the first occurrence of each clause – your choice.)

          The examples are helpful. Is there a reason why only the last one has a semicolon?

          In the refderby.ditamap file, the topic title should be the same as the one in the topic.

          Thanks for the explanation of FOR FETCH ONLY. I now see it is explained in the FOR UPDATE topic.

          Show
          Kim Haase added a comment - Thanks, Dag, for the doc updates. The SELECT statement topic fixes look great. rrefsqljoffsetfetch.html looks great too, though it could use a couple of changes still: I think I messed up on the first sentence after the syntax statement. It now appears to say that the use of ROW or ROWS is equivalent to the use of FIRST or NEXT ... not true at all. Perhaps the following (taking a tip from the sentence about FOR FETCH ONLY and FOR READ ONLY in the SELECT statement topic): ROW is synonymous with ROWS, and FIRST is synonymous with NEXT. Also, the second sentence in the topic has an unitalicized "result offset clause" – all other occurrences are italicized. (You could also remove the italics from all but the first occurrence of each clause – your choice.) The examples are helpful. Is there a reason why only the last one has a semicolon? In the refderby.ditamap file, the topic title should be the same as the one in the topic. Thanks for the explanation of FOR FETCH ONLY. I now see it is explained in the FOR UPDATE topic.
          Hide
          Rick Hillegas added a comment -

          Thanks for the experimental patch, Dag. The overall approach looks good to me. I have a couple small comments:

          ResultColumn

          The new methods have generic (possibly cloned?) javadoc.

          CursorNode

          It seems to me that since the new offset and fetchFirst fields are known to be NumericConstantNodes, then they should be declared as the most refined type possible, rather than as generic ValueNodes. I think this will make the code easier to understand.

          DMLStatementNode

          Thanks for the explanation of why you are wrapping the inner result set at that particular point in the code. I find that these explanations are easier to understand if they include a sample problem query which would break if the code were placed somewhere else.

          sqlgrammar.jj

          Do the new FETCH and OFFSET keywords need to be reserved? I realize that the standard may suggest that they should be, but there are plenty of examples of Derby downgrading reserved keywords to non-reserved keywords in order to avoid breaking legacy applications. Same question for ROW, which appears to have been moved from non-reserved to reserved status.

          Similar comment to that for CursorNode: if the new clauses are just NumericConstantNodes, then it makes sense to me to declare them as that refined type.

          I wonder if the offsetClause() and fetchFirstClause() productions could use numericLiteral() rather than literal(). With a little more work, you might be able to get the parser to guarantee that the literal is a non-negative integer and you might be able to skip this check at bind() time.

          GenericResultSetFactory

          I wonder if all of the duplicated debug code could be pulled up into a superclass constructor. Something like

          if (SanityManager.DEBUG && SanityManager.DEBUG_ON("DumpResultSetGeneration"))

          { SanityManager.GET_DEBUG_STREAM().print("Generated " + getClass().getName() + "\n"); }
          Show
          Rick Hillegas added a comment - Thanks for the experimental patch, Dag. The overall approach looks good to me. I have a couple small comments: ResultColumn The new methods have generic (possibly cloned?) javadoc. CursorNode It seems to me that since the new offset and fetchFirst fields are known to be NumericConstantNodes, then they should be declared as the most refined type possible, rather than as generic ValueNodes. I think this will make the code easier to understand. DMLStatementNode Thanks for the explanation of why you are wrapping the inner result set at that particular point in the code. I find that these explanations are easier to understand if they include a sample problem query which would break if the code were placed somewhere else. sqlgrammar.jj Do the new FETCH and OFFSET keywords need to be reserved? I realize that the standard may suggest that they should be, but there are plenty of examples of Derby downgrading reserved keywords to non-reserved keywords in order to avoid breaking legacy applications. Same question for ROW, which appears to have been moved from non-reserved to reserved status. Similar comment to that for CursorNode: if the new clauses are just NumericConstantNodes, then it makes sense to me to declare them as that refined type. I wonder if the offsetClause() and fetchFirstClause() productions could use numericLiteral() rather than literal(). With a little more work, you might be able to get the parser to guarantee that the literal is a non-negative integer and you might be able to skip this check at bind() time. GenericResultSetFactory I wonder if all of the duplicated debug code could be pulled up into a superclass constructor. Something like if (SanityManager.DEBUG && SanityManager.DEBUG_ON("DumpResultSetGeneration")) { SanityManager.GET_DEBUG_STREAM().print("Generated " + getClass().getName() + "\n"); }
          Hide
          Dag H. Wanvik added a comment -

          Thanks for looking at this patch, Rick!

          Uploading a new version (#2) which contains a new JUnit test
          (OffsetFetchNextTest) and that also addresses your comments, except
          for the one one ResultSetFactory (more below). I moved the checking of
          row count literal type to the parser as you suggested, but kept the
          range checking in the bind phase (cleaner to me), and also now use the
          more specific value node type.

          I added examples as you suggested in the comment in DMLStatementNode.

          I wasn't able to use a superclass for the SanityManager tests in the
          GenericResultSetFactory (there is no superclass now, and even it there
          were, the classname wold not be helpful (the factory not the result
          set), the code you suggested would need to go into superclasses of the
          result sets themselves, which means I would need to spread this code
          around in the result set (super) classes. I just added that code to be
          able to easily trace what result sets arise at execution time, let me
          know if you think I should remove it (it could be part of another
          patch really..).

          Rerunning regressions.

          Show
          Dag H. Wanvik added a comment - Thanks for looking at this patch, Rick! Uploading a new version (#2) which contains a new JUnit test (OffsetFetchNextTest) and that also addresses your comments, except for the one one ResultSetFactory (more below). I moved the checking of row count literal type to the parser as you suggested, but kept the range checking in the bind phase (cleaner to me), and also now use the more specific value node type. I added examples as you suggested in the comment in DMLStatementNode. I wasn't able to use a superclass for the SanityManager tests in the GenericResultSetFactory (there is no superclass now, and even it there were, the classname wold not be helpful (the factory not the result set), the code you suggested would need to go into superclasses of the result sets themselves, which means I would need to spread this code around in the result set (super) classes. I just added that code to be able to easily trace what result sets arise at execution time, let me know if you think I should remove it (it could be part of another patch really..). Rerunning regressions.
          Dag H. Wanvik made changes -
          Attachment derby-4079-2.stat [ 12401487 ]
          Attachment derby-4079-2.diff [ 12401486 ]
          Hide
          Dag H. Wanvik added a comment -

          Thanks for reviewing the docs again, Kim!

          Uploading a new version of the docs patch (docs-2.*) to handle your comments.
          No, the semicolon was spurious.

          Show
          Dag H. Wanvik added a comment - Thanks for reviewing the docs again, Kim! Uploading a new version of the docs patch (docs-2.*) to handle your comments. No, the semicolon was spurious.
          Dag H. Wanvik made changes -
          Attachment derby-4079-docs-2.diff [ 12401489 ]
          Attachment derby-4079-docs-2.stat [ 12401490 ]
          Attachment derby-4079-docs-2.zip [ 12401491 ]
          Hide
          Kim Haase added a comment -

          The new docs look great, Dag. +1 to commit.

          Show
          Kim Haase added a comment - The new docs look great, Dag. +1 to commit.
          Hide
          Rick Hillegas added a comment -

          Thanks for the second patch, Dag. Sorry about being so elliptical in my earlier comments: Concerning the Sanity printouts, I meant that the printouts could be consolidated in a constructor for the superclass of the constructed result sets themselves, e.g., in a constructor for BasicNoPutResultSetImpl. Thanks.

          Show
          Rick Hillegas added a comment - Thanks for the second patch, Dag. Sorry about being so elliptical in my earlier comments: Concerning the Sanity printouts, I meant that the printouts could be consolidated in a constructor for the superclass of the constructed result sets themselves, e.g., in a constructor for BasicNoPutResultSetImpl. Thanks.
          Hide
          Dag H. Wanvik added a comment -

          Rick, as for the reserved keywords I have been trying to get rid of those;
          moving ROW back to unreserved was painless. Moving OFFSET to unreserved was harder, however,
          since in its location there can be a choice conflict for the parser (with table correlation names).
          I am trying to make a lookahead to circumvent this problem, which if, successful, would allow
          OFFSET to be an identifier in all cases except ones like this:

          SELECT * from t OFFSET

          This would work though
          SELECT * from t AS OFFSET

          Here it is not clear if OFFSET is a correlation name of the start of a result offset clause.
          So, even with this special handling, there could possibly be apps that would break. It might be easier to just
          let OFFSET be a reserved keyword henceforth, since this would be cleaner (grammar, docs)?

          Show
          Dag H. Wanvik added a comment - Rick, as for the reserved keywords I have been trying to get rid of those; moving ROW back to unreserved was painless. Moving OFFSET to unreserved was harder, however, since in its location there can be a choice conflict for the parser (with table correlation names). I am trying to make a lookahead to circumvent this problem, which if, successful, would allow OFFSET to be an identifier in all cases except ones like this: SELECT * from t OFFSET This would work though SELECT * from t AS OFFSET Here it is not clear if OFFSET is a correlation name of the start of a result offset clause. So, even with this special handling, there could possibly be apps that would break. It might be easier to just let OFFSET be a reserved keyword henceforth, since this would be cleaner (grammar, docs)?
          Hide
          Rick Hillegas added a comment -

          Thanks for looking into the keywords issue, Dag. I suspect that OFFSET is a popular column name so adding it as a reserved keyword would create a fairly big compatibility issue.

          Show
          Rick Hillegas added a comment - Thanks for looking into the keywords issue, Dag. I suspect that OFFSET is a popular column name so adding it as a reserved keyword would create a fairly big compatibility issue.
          Hide
          Dag H. Wanvik added a comment - - edited

          Thanks, Rick! Uploading a new version (#3) of the code patch, which
          makes OFFSET be a non-reserved keyword (+ test for this).
          It can be used as an identifier without exceptions, due to
          a new look-ahead disambiguation of the shift/reduce conflict which arose.
          I also removed the debug code from GenericResultSetFactory; I am not sure it
          is very valuable. Fixed some small items, including Javadoc bugs as well.

          Show
          Dag H. Wanvik added a comment - - edited Thanks, Rick! Uploading a new version (#3) of the code patch, which makes OFFSET be a non-reserved keyword (+ test for this). It can be used as an identifier without exceptions, due to a new look-ahead disambiguation of the shift/reduce conflict which arose. I also removed the debug code from GenericResultSetFactory; I am not sure it is very valuable. Fixed some small items, including Javadoc bugs as well.
          Dag H. Wanvik made changes -
          Attachment derby-4079-3.diff [ 12401567 ]
          Attachment derby-4079-3.stat [ 12401568 ]
          Hide
          Dag H. Wanvik added a comment -

          Unless there are more comments on this patch, I intend to commit it in
          a few days.

          Changes to existing codepaths are few, so any errors are expected to
          mostly concern use of the new feature. So, risk should be limited, I
          hope.

          Show
          Dag H. Wanvik added a comment - Unless there are more comments on this patch, I intend to commit it in a few days. Changes to existing codepaths are few, so any errors are expected to mostly concern use of the new feature. So, risk should be limited, I hope.
          Hide
          Dag H. Wanvik added a comment -

          Committed derby-4079-3 as svn 754558.

          Show
          Dag H. Wanvik added a comment - Committed derby-4079-3 as svn 754558.
          Hide
          Dag H. Wanvik added a comment -

          Committed derby-4079-docs-2 as svn 754559.

          Show
          Dag H. Wanvik added a comment - Committed derby-4079-docs-2 as svn 754559.
          Dag H. Wanvik made changes -
          Derby Info [Patch Available, Existing Application Impact]
          Dag H. Wanvik made changes -
          Status In Progress [ 3 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Hide
          Dag H. Wanvik added a comment -

          Added the new test to lang/_Suite.java as svn 754583.

          Show
          Dag H. Wanvik added a comment - Added the new test to lang/_Suite.java as svn 754583.
          Hide
          Dag H. Wanvik added a comment -

          The test case testRuntimeStatistics in OffsetFetchNextTest parameterized newline strings
          with System.getProperty("line.separator"); this is wrong because the string returned
          from syscs_util.syscs_get_runtimestatistics() contains just "\n" also on Windows.
          Cf issue DERBY-4096.

          Fixed in patch derby-4079-fixRtStatsTest; uploaded . The test now runs on Windows also.

          Show
          Dag H. Wanvik added a comment - The test case testRuntimeStatistics in OffsetFetchNextTest parameterized newline strings with System.getProperty("line.separator"); this is wrong because the string returned from syscs_util.syscs_get_runtimestatistics() contains just "\n" also on Windows. Cf issue DERBY-4096 . Fixed in patch derby-4079-fixRtStatsTest; uploaded . The test now runs on Windows also.
          Dag H. Wanvik made changes -
          Attachment derby-4079-fixRtStatsTest.diff [ 12402336 ]
          Attachment derby-4079-fixRtStatsTest.stat [ 12402337 ]
          Dag H. Wanvik made changes -
          Link This issue is related to DERBY-4096 [ DERBY-4096 ]
          Hide
          Dag H. Wanvik added a comment -

          Committed derby-4079-fixRtStatsTest as svn 755058.

          Show
          Dag H. Wanvik added a comment - Committed derby-4079-fixRtStatsTest as svn 755058.
          Hide
          Dag H. Wanvik added a comment -

          Backported derby-4079-fixRtStatsTest to 10.5 branch as svn 755070.

          Show
          Dag H. Wanvik added a comment - Backported derby-4079-fixRtStatsTest to 10.5 branch as svn 755070.
          Hide
          Kim Haase added a comment -

          I'm sorry, I should have caught this. I find that when I process one of the new files that are part of the doc patch, I get the following DITA syntax error:

          [pipeline] [Error] rrefsqljoffsetfetch.dita:81:10: The content of element type "refsyn" must match "(ph|codeph|synph|filepath|msgph|userinput|systemoutput|b|u|i|tt|sup|sub|uicontrol|menucascade|term|xref|cite|q|boolean|state|keyword|option|parmname|apiname|cmdname|msgnum|varname|wintitle|tm|p|lq|note|dl|parml|ul|ol|sl|pre|codeblock|msgblock|screen|lines|fig|syntaxdiagram|imagemap|image|object|table|simpletable|title|draft-comment|required-cleanup|fn|indextermref|indexterm)".

          Processing the file gives a BUILD SUCCESSFUL message at the end,, and the generated output looks fine, so it's easy to miss this. The error shows up at the start of processing.

          It turns out that the closing "</refsyn>" tag is placed after the example. It needs to come before it.

          I'll attach a suggested patch.

          Show
          Kim Haase added a comment - I'm sorry, I should have caught this. I find that when I process one of the new files that are part of the doc patch, I get the following DITA syntax error: [pipeline] [Error] rrefsqljoffsetfetch.dita:81:10: The content of element type "refsyn" must match "(ph|codeph|synph|filepath|msgph|userinput|systemoutput|b|u|i|tt|sup|sub|uicontrol|menucascade|term|xref|cite|q|boolean|state|keyword|option|parmname|apiname|cmdname|msgnum|varname|wintitle|tm|p|lq|note|dl|parml|ul|ol|sl|pre|codeblock|msgblock|screen|lines|fig|syntaxdiagram|imagemap|image|object|table|simpletable|title|draft-comment|required-cleanup|fn|indextermref|indexterm)". Processing the file gives a BUILD SUCCESSFUL message at the end,, and the generated output looks fine, so it's easy to miss this. The error shows up at the start of processing. It turns out that the closing "</refsyn>" tag is placed after the example. It needs to come before it. I'll attach a suggested patch.
          Kim Haase made changes -
          Resolution Fixed [ 1 ]
          Status Resolved [ 5 ] Reopened [ 4 ]
          Hide
          Kim Haase added a comment -

          Attaching patch to fix the DITA syntax error in the new reference topic: DERBY-4079-docs-2.diff. Also attaching HTML file, though it looks the same as the old one. Dag, please commit if you see fit.

          Show
          Kim Haase added a comment - Attaching patch to fix the DITA syntax error in the new reference topic: DERBY-4079 -docs-2.diff. Also attaching HTML file, though it looks the same as the old one. Dag, please commit if you see fit.
          Kim Haase made changes -
          Attachment DERBY-4079-docs-2.diff [ 12402671 ]
          Attachment rrefsqljoffsetfetch.html [ 12402672 ]
          Kim Haase made changes -
          Derby Info [Patch Available]
          Hide
          Dag H. Wanvik added a comment -

          Thanks for the fix, Kim! Committed as svn 756695, resolving.

          Show
          Dag H. Wanvik added a comment - Thanks for the fix, Kim! Committed as svn 756695, resolving.
          Dag H. Wanvik made changes -
          Status Reopened [ 4 ] Resolved [ 5 ]
          Derby Info [Patch Available]
          Resolution Fixed [ 1 ]
          Hide
          Myrna van Lunteren added a comment -

          that last patch still needs to get backported to 10.5 docs branch, right?

          Show
          Myrna van Lunteren added a comment - that last patch still needs to get backported to 10.5 docs branch, right?
          Hide
          Dag H. Wanvik added a comment -

          Yes, I'm in the process of doing it now, Myrna! Thanks for making sure

          Show
          Dag H. Wanvik added a comment - Yes, I'm in the process of doing it now, Myrna! Thanks for making sure
          Hide
          Dag H. Wanvik added a comment -

          Backported to 10.5 branch as svn 756713.

          Show
          Dag H. Wanvik added a comment - Backported to 10.5 branch as svn 756713.
          Myrna van Lunteren made changes -
          Fix Version/s 10.5.1.1 [ 12313771 ]
          Fix Version/s 10.5.0.0 [ 12313010 ]
          Dag H. Wanvik made changes -
          Issue Type New Feature [ 2 ] Improvement [ 4 ]
          Dag H. Wanvik made changes -
          Link This issue relates to DERBY-4208 [ DERBY-4208 ]
          Bryan Pendleton made changes -
          Link This issue relates to DERBY-4562 [ DERBY-4562 ]
          Dag H. Wanvik made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Gavin made changes -
          Workflow jira [ 12453800 ] Default workflow, editable Closed status [ 12799624 ]

            People

            • Assignee:
              Dag H. Wanvik
              Reporter:
              Dag H. Wanvik
            • Votes:
              1 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development