Derby
  1. Derby
  2. DERBY-4518

Document ORDER BY and OFFSET/FETCH in subqueries

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 10.6.1.0
    • Component/s: Documentation
    • Labels:
      None

      Description

      DERBY-4397 and DERBY-4398 extends the usability of ORDER BY and OFFSET/FETCH to subqueries and view. This should be documented.

      1. DERBY-4518.diff
        21 kB
        Kim Haase
      2. DERBY-4518.stat
        0.4 kB
        Kim Haase
      3. DERBY-4518.zip
        30 kB
        Kim Haase
      4. DERBY-4518-2.diff
        23 kB
        Kim Haase
      5. DERBY-4518-2.zip
        30 kB
        Kim Haase
      6. DERBY-4518-3.diff
        23 kB
        Kim Haase
      7. DERBY-4518-3.zip
        30 kB
        Kim Haase

        Issue Links

          Activity

          Hide
          Dag H. Wanvik added a comment -

          The specification attachment to DERBY-4397 contains updated syntax descriptions for the Derby documentation
          and notes on where the docs need updating.

          Show
          Dag H. Wanvik added a comment - The specification attachment to DERBY-4397 contains updated syntax descriptions for the Derby documentation and notes on where the docs need updating.
          Hide
          Kim Haase added a comment -

          The spec is very clear! I do have a couple of questions about the Documentation section.

          The "SELECT statement" topic already includes the ORDER BY clause. Are other changes needed? I would guess that the "Requirements for updatable cursors and updatable ResultSets" section would not change (that is, ORDER BY is still not allowed).

          What changes are needed for the "SelectExpression" topic? Can the FROM clause now be followed by ORDER BY, result offset, and fetch first clauses?

          The "INSERT statement" topic refers to both a "VALUES list" and a "multiple-row VALUES expression". I think that after some discussion a few years ago it was agreed to use the term "VALUES expression" generally rather than "VALUES clause" or "VALUES list". Would it be okay to combine these into "a single-row or multiple-row values expression"?

          Show
          Kim Haase added a comment - The spec is very clear! I do have a couple of questions about the Documentation section. The "SELECT statement" topic already includes the ORDER BY clause. Are other changes needed? I would guess that the "Requirements for updatable cursors and updatable ResultSets" section would not change (that is, ORDER BY is still not allowed). What changes are needed for the "SelectExpression" topic? Can the FROM clause now be followed by ORDER BY, result offset, and fetch first clauses? The "INSERT statement" topic refers to both a "VALUES list" and a "multiple-row VALUES expression". I think that after some discussion a few years ago it was agreed to use the term "VALUES expression" generally rather than "VALUES clause" or "VALUES list". Would it be okay to combine these into "a single-row or multiple-row values expression"?
          Hide
          Kim Haase added a comment -

          A couple of other items –

          The Documentation section doesn't mention CREATE VIEW, but the "Derby syntax changes" section does, so I'm modifying that topic too. Hope that's okay.

          For the "ORDER BY clause" topic, I am guessing that the definition at the beginning needs some expansion? Currently it says, "The ORDER BY clause is an optional element of a SELECT statement." Would it be correct to say the following?

          "The ORDER BY clause is an optional element of a SELECT statement or a query expression."

          Show
          Kim Haase added a comment - A couple of other items – The Documentation section doesn't mention CREATE VIEW, but the "Derby syntax changes" section does, so I'm modifying that topic too. Hope that's okay. For the "ORDER BY clause" topic, I am guessing that the definition at the beginning needs some expansion? Currently it says, "The ORDER BY clause is an optional element of a SELECT statement." Would it be correct to say the following? "The ORDER BY clause is an optional element of a SELECT statement or a query expression."
          Hide
          Dag H. Wanvik added a comment -

          Thanks for picking up this one, Kim! Answers to your questions:

          I think the SELECT statement section is OK as it stands. As you say,
          the requirements for an updatable cursor do not change.

          SelectExpression topic: Yes, I suggest you put it there too for good
          measure, and in the VALUES expression topic as well. The documentation
          structure here is not good in that it deviates significantly from the
          language, probably to make explanations "more intuitive", so it's hard
          to make this non-redundant and also correct..

          "Would it be okay to combine these into "a single-row or multiple-row
          values expression"? I think so, yes.

          "The Documentation section doesn't mention CREATE VIEW, but the "Derby
          syntax changes" section does", so I'm modifying that topic too."

          That's an omission, thanks for catching that.

          ORDER BY clause topic: I suggest this wording (or some such, feel
          free to improve!):

          "The ORDER BY clause is an optional element of a SELECT statement (link), the
          SELECT expression (link), the VALUES expression (link) and in table and scalar subqueries (links). It can
          also be used with an INSERT statement (link) and the CREATE VIEW statement (link).

          An ORDER BY clause allows you to specify the order in which rows
          appear in the result set. In subqueries, the ORDER BY clause is
          meaningless unless accompanied by one or both of the result offset and
          fetch first clauses (link) or in conjunction with ROW_NUMBER (link)
          since there is no guarantee the order is retained in the outer result
          set. It is allowed to combine ORDER BY on the outer query with ORDER
          by in subqueries."

          Show
          Dag H. Wanvik added a comment - Thanks for picking up this one, Kim! Answers to your questions: I think the SELECT statement section is OK as it stands. As you say, the requirements for an updatable cursor do not change. SelectExpression topic: Yes, I suggest you put it there too for good measure, and in the VALUES expression topic as well. The documentation structure here is not good in that it deviates significantly from the language, probably to make explanations "more intuitive", so it's hard to make this non-redundant and also correct.. "Would it be okay to combine these into "a single-row or multiple-row values expression"? I think so, yes. "The Documentation section doesn't mention CREATE VIEW, but the "Derby syntax changes" section does", so I'm modifying that topic too." That's an omission, thanks for catching that. ORDER BY clause topic: I suggest this wording (or some such, feel free to improve!): "The ORDER BY clause is an optional element of a SELECT statement (link), the SELECT expression (link), the VALUES expression (link) and in table and scalar subqueries (links). It can also be used with an INSERT statement (link) and the CREATE VIEW statement (link). An ORDER BY clause allows you to specify the order in which rows appear in the result set. In subqueries, the ORDER BY clause is meaningless unless accompanied by one or both of the result offset and fetch first clauses (link) or in conjunction with ROW_NUMBER (link) since there is no guarantee the order is retained in the outer result set. It is allowed to combine ORDER BY on the outer query with ORDER by in subqueries."
          Hide
          Kim Haase added a comment -

          Thanks very much, Dag, I'm starting on these.

          After some experimenting I am finding that I get a syntax error when I use the offset clause in a VALUES expression, although the order by and fetch first clauses work fine. Is this expected?

          ij> values (3,21),(1,22),(2,23);
          1 |2
          -----------------------
          3 |21
          1 |22
          2 |23

          3 rows selected
          ij> values (3,21),(1,22),(2,23) order by 1;
          1 |2
          -----------------------
          1 |22
          2 |23
          3 |21

          3 rows selected
          ij> values (3,21),(1,22),(2,23) order by 2;
          1 |2
          -----------------------
          3 |21
          1 |22
          2 |23

          3 rows selected
          ij> values (3,21),(1,22),(2,23) fetch first 2 rows only;
          1 |2
          -----------------------
          3 |21
          1 |22

          2 rows selected
          ij> values (3,21),(1,22),(2,23) order by 1 fetch first 2 rows only;
          1 |2
          -----------------------
          1 |22
          2 |23

          2 rows selected
          ij> values (3,21),(1,22),(2,23) order by 1 offset 2;
          ERROR 42X01: Syntax error: Encountered "<EOF>" at line 1, column 47.
          Issue the 'help' command for general information on IJ command syntax.
          Any unrecognized commands are treated as potential SQL commands and executed directly.
          Consult your DBMS server reference documentation for details of the SQL syntax supported by your server.

          Show
          Kim Haase added a comment - Thanks very much, Dag, I'm starting on these. After some experimenting I am finding that I get a syntax error when I use the offset clause in a VALUES expression, although the order by and fetch first clauses work fine. Is this expected? ij> values (3,21),(1,22),(2,23); 1 |2 ----------------------- 3 |21 1 |22 2 |23 3 rows selected ij> values (3,21),(1,22),(2,23) order by 1; 1 |2 ----------------------- 1 |22 2 |23 3 |21 3 rows selected ij> values (3,21),(1,22),(2,23) order by 2; 1 |2 ----------------------- 3 |21 1 |22 2 |23 3 rows selected ij> values (3,21),(1,22),(2,23) fetch first 2 rows only; 1 |2 ----------------------- 3 |21 1 |22 2 rows selected ij> values (3,21),(1,22),(2,23) order by 1 fetch first 2 rows only; 1 |2 ----------------------- 1 |22 2 |23 2 rows selected ij> values (3,21),(1,22),(2,23) order by 1 offset 2; ERROR 42X01: Syntax error: Encountered "<EOF>" at line 1, column 47. Issue the 'help' command for general information on IJ command syntax. Any unrecognized commands are treated as potential SQL commands and executed directly. Consult your DBMS server reference documentation for details of the SQL syntax supported by your server.
          Hide
          Kim Haase added a comment -

          BTW, I am using a version of derbyrun.jar that I built today from the code trunk.

          Show
          Kim Haase added a comment - BTW, I am using a version of derbyrun.jar that I built today from the code trunk.
          Hide
          Dag H. Wanvik added a comment -

          The last statement's syntax is wrong: it should be:
          values (3,21),(1,22),(2,23) order by 1 offset 2 rows;

          Thats is, the "rows" keyword is needed.

          Show
          Dag H. Wanvik added a comment - The last statement's syntax is wrong: it should be: values (3,21),(1,22),(2,23) order by 1 offset 2 rows; Thats is, the "rows" keyword is needed.
          Hide
          Kim Haase added a comment -

          Thanks for straightening me out on the offset syntax, Dag.

          Attaching DERBY-4518.diff, DERBY-4518.stat, and DERBY-4518.zip, a preliminary patch, with changes to the following 11 topics:

          Reference Manual:

          CREATE VIEW statement: rrefsqlj15446.dita
          INSERT statement: rrefsqlj40774.dita
          ORDER BY clause: rrefsqlj13658.dita
          SelectExpression: rrefselectexpression.dita
          VALUES expression: rrefsqlj11277.dita
          Query: rrefsqlj21571.dita
          ScalarSubquery: rrefscalarsubquery.dita
          TableSubquery: rreftablesubquery.dita

          I took advantage of the opportunity for some formatting cleanup in some of the topics. I added examples for some. I believe some topics need more information, particularly SelectExpression, since it describes what all the clauses do. I'd like to know if what I've done so far is on the right track, though. Thanks!

          Tuning Guide:

          View flattening: ctuntransform22576.dita
          Flattening a subquery into a normal join: ctuntransform36368.dita
          Flattening a subquery into an EXISTS join: ctuntransform25868.dita

          Show
          Kim Haase added a comment - Thanks for straightening me out on the offset syntax, Dag. Attaching DERBY-4518 .diff, DERBY-4518 .stat, and DERBY-4518 .zip, a preliminary patch, with changes to the following 11 topics: Reference Manual: CREATE VIEW statement: rrefsqlj15446.dita INSERT statement: rrefsqlj40774.dita ORDER BY clause: rrefsqlj13658.dita SelectExpression: rrefselectexpression.dita VALUES expression: rrefsqlj11277.dita Query: rrefsqlj21571.dita ScalarSubquery: rrefscalarsubquery.dita TableSubquery: rreftablesubquery.dita I took advantage of the opportunity for some formatting cleanup in some of the topics. I added examples for some. I believe some topics need more information, particularly SelectExpression, since it describes what all the clauses do. I'd like to know if what I've done so far is on the right track, though. Thanks! Tuning Guide: View flattening: ctuntransform22576.dita Flattening a subquery into a normal join: ctuntransform36368.dita Flattening a subquery into an EXISTS join: ctuntransform25868.dita
          Hide
          Dag H. Wanvik added a comment -

          Thanks for the patch, Kim! It all looks good to me.

          I have one comment, which was not introducd by this patch but it relevant all the same:

          rrefsqlj11277.html

          "A VALUES expression that is used in an INSERT statement cannot use an
          ORDER BY, result offset, or fetch first clause."

          While correct, this statement isn't very helpful, I'd like it to say
          something like this:

          "A VALUES expression that is used in an INSERT statement cannot use an
          ORDER BY, result offset, or fetch first clause."

          "However, if the VALUES expression does not contain the DEFAULT keyword the
          VALUES clause can be put in a subquery and ordered, e.g. like this:

          INSERT INTO t SELECT * FROM (VALUES 'a','c','b') t ORDER BY 1;
          "

          Ditto i rrefsqlj40774.html (insert statement)

          Show
          Dag H. Wanvik added a comment - Thanks for the patch, Kim! It all looks good to me. I have one comment, which was not introducd by this patch but it relevant all the same: rrefsqlj11277.html "A VALUES expression that is used in an INSERT statement cannot use an ORDER BY, result offset, or fetch first clause." While correct, this statement isn't very helpful, I'd like it to say something like this: "A VALUES expression that is used in an INSERT statement cannot use an ORDER BY, result offset, or fetch first clause." "However, if the VALUES expression does not contain the DEFAULT keyword the VALUES clause can be put in a subquery and ordered, e.g. like this: INSERT INTO t SELECT * FROM (VALUES 'a','c','b') t ORDER BY 1; " Ditto i rrefsqlj40774.html (insert statement)
          Hide
          Dag H. Wanvik added a comment -

          One more thing:
          In this example:

          – Create an AIRPORTS table and insert into it
          – some of the fields from the CITIES table, with the airport
          – codes sorted alphabetically
          CREATE TABLE AIRPORTS (
          AIRPORT VARCHAR(3),
          CITY VARCHAR(24) NOT NULL,
          COUNTRY VARCHAR(26) NOT NULL
          );

          INSERT INTO AIRPORTS
          SELECT AIRPORT, CITY_NAME, COUNTRY FROM CITIES
          ORDER BY AIRPORT;

          it would probably be prudent to include an auto-generated key (cf. DERBY-4), since having rows physically sorted doesn't really help much after the insert unless it's somehow related to some other property that Derby knows about: one would still need to specify ordering when selecting to be assured of the correct order, at least for portability. I think it some verbiage on this is in order as well, e.g. "When you want insertion to happen with a specific ordering, e.g. in conjunction with auto-generated keys, it can be useful to specify an ORDER BY on the result set to be inserted."

          Show
          Dag H. Wanvik added a comment - One more thing: In this example: – Create an AIRPORTS table and insert into it – some of the fields from the CITIES table, with the airport – codes sorted alphabetically CREATE TABLE AIRPORTS ( AIRPORT VARCHAR(3), CITY VARCHAR(24) NOT NULL, COUNTRY VARCHAR(26) NOT NULL ); INSERT INTO AIRPORTS SELECT AIRPORT, CITY_NAME, COUNTRY FROM CITIES ORDER BY AIRPORT; it would probably be prudent to include an auto-generated key (cf. DERBY-4 ), since having rows physically sorted doesn't really help much after the insert unless it's somehow related to some other property that Derby knows about: one would still need to specify ordering when selecting to be assured of the correct order, at least for portability. I think it some verbiage on this is in order as well, e.g. "When you want insertion to happen with a specific ordering, e.g. in conjunction with auto-generated keys, it can be useful to specify an ORDER BY on the result set to be inserted."
          Hide
          Kim Haase added a comment -

          Thanks, Dag! I am incorporating your suggestions.

          I do wonder if I should say more in rrefselectexpression.html. I can add descriptions of each of the added clauses, but I am not sure where to put them in the list that follows "Derby processes a SelectExpression in the following order:".

          Show
          Kim Haase added a comment - Thanks, Dag! I am incorporating your suggestions. I do wonder if I should say more in rrefselectexpression.html. I can add descriptions of each of the added clauses, but I am not sure where to put them in the list that follows "Derby processes a SelectExpression in the following order:".
          Hide
          Dag H. Wanvik added a comment -

          The right order would be:

          Derby processes a SelectExpression in the following order:

          • FROM clause
          • WHERE clause
          • GROUP BY (or implicit GROUP BY)
          • HAVING clause
          • ORDER BY clause
          • result offset clause
          • fetch first clause
          • SELECT clause
          Show
          Dag H. Wanvik added a comment - The right order would be: Derby processes a SelectExpression in the following order: FROM clause WHERE clause GROUP BY (or implicit GROUP BY) HAVING clause ORDER BY clause result offset clause fetch first clause SELECT clause
          Hide
          Kim Haase added a comment -

          Thanks, Dag! I'm attaching DERBY-4518-2.diff and DERBY-4518-2.zip. I hope they address all your comments – please let me know if further changes are needed.

          Show
          Kim Haase added a comment - Thanks, Dag! I'm attaching DERBY-4518 -2.diff and DERBY-4518 -2.zip. I hope they address all your comments – please let me know if further changes are needed.
          Hide
          Dag H. Wanvik added a comment -

          Thanks so much for the quick action of this and the latest rev of this patch, Kim! It looks great now, just one small thing:

          In the INSERT topic, I think I'd want the sentence starting "When you want insertion to happen with a specific ordering.." moved up before the paragraph starting "If the Query is a VALUES expression..",
          since the former is really a use case/rationale for using ORDER BY in the first place with INSERT, whereas the other paragraph describes a corner case of its usage.

          Show
          Dag H. Wanvik added a comment - Thanks so much for the quick action of this and the latest rev of this patch, Kim! It looks great now, just one small thing: In the INSERT topic, I think I'd want the sentence starting "When you want insertion to happen with a specific ordering.." moved up before the paragraph starting "If the Query is a VALUES expression..", since the former is really a use case/rationale for using ORDER BY in the first place with INSERT, whereas the other paragraph describes a corner case of its usage.
          Hide
          Kim Haase added a comment -

          Thanks, Dag! I've incorporated that change into DERBY-4518-3.diff and DERBY-4518-3.zip. I will commit this patch.

          Show
          Kim Haase added a comment - Thanks, Dag! I've incorporated that change into DERBY-4518 -3.diff and DERBY-4518 -3.zip. I will commit this patch.
          Hide
          Kim Haase added a comment -

          Committed patch DERBY-4518-3.diff to documentation trunk at revision 904155.

          Show
          Kim Haase added a comment - Committed patch DERBY-4518 -3.diff to documentation trunk at revision 904155.
          Hide
          Tiago R. Espinha added a comment -

          Sorry to be revisiting this issue but I was doing buddy-testing for this feature and I was a bit stumped with the documentation. I couldn't find an example under "The result offset and fetch first clauses" that demonstrates the use of OFFSET within sub-queries. There is an example under "Duplicates in UNION, INTERSECT, and EXCEPT ALL results" but at least to me, that wouldn't be the first logical place I'd visit to find examples of OFFSET usage on sub-queries.

          In my opinion this is something small and not worth a respin but maybe we can refine it for further releases?

          Show
          Tiago R. Espinha added a comment - Sorry to be revisiting this issue but I was doing buddy-testing for this feature and I was a bit stumped with the documentation. I couldn't find an example under "The result offset and fetch first clauses" that demonstrates the use of OFFSET within sub-queries. There is an example under "Duplicates in UNION, INTERSECT, and EXCEPT ALL results" but at least to me, that wouldn't be the first logical place I'd visit to find examples of OFFSET usage on sub-queries. In my opinion this is something small and not worth a respin but maybe we can refine it for further releases?
          Hide
          Dag H. Wanvik added a comment -

          I agree this can be improved, Tiago! Thanks for spotting this.

          Show
          Dag H. Wanvik added a comment - I agree this can be improved, Tiago! Thanks for spotting this.
          Hide
          Tiago R. Espinha added a comment -

          I seem to have run into another problem with my buddy testing. On DERBY-4398 the following is mentioned:

          "F860 dynamic <fetch first row count> in <fetch first clause> (already present in 10.5) "

          How does this dynamic fetching work? I've tried:

          select * from t1 order by f1 desc fetch first count only;

          and

          select f1, count as cnt from t1 fetch first cnt only;

          But I always got an error. Am I misunderstanding that feature?

          Show
          Tiago R. Espinha added a comment - I seem to have run into another problem with my buddy testing. On DERBY-4398 the following is mentioned: "F860 dynamic <fetch first row count> in <fetch first clause> (already present in 10.5) " How does this dynamic fetching work? I've tried: select * from t1 order by f1 desc fetch first count only; and select f1, count as cnt from t1 fetch first cnt only; But I always got an error. Am I misunderstanding that feature?
          Hide
          Dag H. Wanvik added a comment -

          Yes, dynamic here means "?" markers in prepared statements in stead of integer literals, e.g. "offset ? rows fetch first ? rows only".

          Show
          Dag H. Wanvik added a comment - Yes, dynamic here means "?" markers in prepared statements in stead of integer literals, e.g. "offset ? rows fetch first ? rows only".
          Hide
          Myrna van Lunteren added a comment -

          This functionality went in into 10.6; I wonder if it wouldn't be cleaner to close this again and perhaps any further adjustments should be headed under a separate bug?

          Show
          Myrna van Lunteren added a comment - This functionality went in into 10.6; I wonder if it wouldn't be cleaner to close this again and perhaps any further adjustments should be headed under a separate bug?
          Hide
          Dag H. Wanvik added a comment -

          Closing this as suggested, opening a new JIRA for the improvement, cf. link.

          Show
          Dag H. Wanvik added a comment - Closing this as suggested, opening a new JIRA for the improvement, cf. link.

            People

            • Assignee:
              Kim Haase
              Reporter:
              Dag H. Wanvik
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development