Derby
  1. Derby
  2. DERBY-4

"order by" is not supported for "insert ... select"

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: SQL
    • Labels:
      None

      Description

      When filling a table with "insert ... select ...", "order by" cannot be specified.

      There is no method to copy a table sorted into another table (except using export/import). This would be useful to optimize performance for big tables, or to create identity values that are ascending (related to another column).

      Example:

      create table temp1 (
      s varchar(10));

      insert into temp1 values 'x','a','c','b','a';

      create table temp2 (
      i integer not null
      generated always as identity
      primary key,
      s varchar(10));

      insert into temp2 (s)
      select s from temp1 order by s;

      --> Error: "order by" is not allowed.

      – trying to use "group by" instead of "oder by":

      insert into temp2 (s)
      select s from temp1 group by s;
      select * from temp2;

      --> "group by" did not sort the table.

      1. derby-4_dhw.stat
        1 kB
        Dag H. Wanvik
      2. derby-4_dhw.diff
        38 kB
        Dag H. Wanvik
      3. samples.ij
        2 kB
        Bryan Pendleton
      4. insertOrderBy_v3.diff
        9 kB
        Bryan Pendleton
      5. samples.ij
        1 kB
        Bryan Pendleton
      6. insertOrderBy_v2.diff
        6 kB
        Bryan Pendleton
      7. insertOrderBy.diff
        4 kB
        Bryan Pendleton

        Issue Links

          Activity

          Hide
          Jeff Lichtman added a comment -

          I see what you want to do here: you are trying to number the rows in a table according to an ordering.

          Standard SQL doesn't allow the "order by" clause on an insert-select. Normally, this makes sense because a table (AKA relation) is an unordered set of rows (AKA tuples). Normally, the order that the rows are inserted would have no consequence that one should rely on. For example, if you select the rows from the table you just inserted them into, there is no guarantee that they would come back in the same order.

          In this case, though, you want to rely on a side-effect of inserting (the generation of indentity values) as the insert happens. This is a valid thing to want to do - however, it makes me nervous to make a broad extension for such a specific purpose.

          If the Virtual Table Interface (VTI) still existed (it was in Cloudscape 5.1 and earlier versions), I would suggest using that as a workaround. VTIs allowed the user to write a Java class with methods that returned ResultSets. One could easily hide the "order by" clause inside the VTI.

          I don't think "group by" is a valid workaround. A select statement with a "group by" and no aggregates in the select list is the same as a "select distinct." No ordering is guaranteed - for example, a smart optimizer could avoid a sort if there were a unique index on a subset of the result columns.

          Show
          Jeff Lichtman added a comment - I see what you want to do here: you are trying to number the rows in a table according to an ordering. Standard SQL doesn't allow the "order by" clause on an insert-select. Normally, this makes sense because a table (AKA relation) is an unordered set of rows (AKA tuples). Normally, the order that the rows are inserted would have no consequence that one should rely on. For example, if you select the rows from the table you just inserted them into, there is no guarantee that they would come back in the same order. In this case, though, you want to rely on a side-effect of inserting (the generation of indentity values) as the insert happens. This is a valid thing to want to do - however, it makes me nervous to make a broad extension for such a specific purpose. If the Virtual Table Interface (VTI) still existed (it was in Cloudscape 5.1 and earlier versions), I would suggest using that as a workaround. VTIs allowed the user to write a Java class with methods that returned ResultSets. One could easily hide the "order by" clause inside the VTI. I don't think "group by" is a valid workaround. A select statement with a "group by" and no aggregates in the select list is the same as a "select distinct." No ordering is guaranteed - for example, a smart optimizer could avoid a sort if there were a unique index on a subset of the result columns.
          Hide
          Christian d'Heureuse added a comment -

          Thanks for your explanations.

          For the ordered numbering I could write a stored procedure that inserts the data row for row.

          What do you think about a (system) stored procedure to re-load a table, so that the records are physically sorted by the primary key? Maybe it could be an extension to SYSCS_UTIL.SYSCS_COMPRESS_TABLE.
          I assume that this would not solve the "insert select" problem, because there is no guarantee that the "physical" order would be used to insert the rows. But it would be useful to optimize disk access for large tables, e.g. journal records for adresses could be physically grouped together (for each address). Otherwise one has to define an index that includes all columns that are accessed.
          Another solution would be to implement clustering indexes.

          Show
          Christian d'Heureuse added a comment - Thanks for your explanations. For the ordered numbering I could write a stored procedure that inserts the data row for row. What do you think about a (system) stored procedure to re-load a table, so that the records are physically sorted by the primary key? Maybe it could be an extension to SYSCS_UTIL.SYSCS_COMPRESS_TABLE. I assume that this would not solve the "insert select" problem, because there is no guarantee that the "physical" order would be used to insert the rows. But it would be useful to optimize disk access for large tables, e.g. journal records for adresses could be physically grouped together (for each address). Otherwise one has to define an index that includes all columns that are accessed. Another solution would be to implement clustering indexes.
          Hide
          Bryan Pendleton added a comment -

          I am interested in adding this feature to Derby; I find that I have the
          same need that Christian identified.

          I understand the concerns that Jeff raised about this being a
          non-standard language extension. However, Christian's usage
          seems quite reasonable to me (and in fact matches my usage),
          and this extension is commonly supported by many other databases
          so we are not really breaking new ground here.

          I've been working on a patch to add this support and hope to post it
          soon for discussion, once I've seen how the testing goes.

          Show
          Bryan Pendleton added a comment - I am interested in adding this feature to Derby; I find that I have the same need that Christian identified. I understand the concerns that Jeff raised about this being a non-standard language extension. However, Christian's usage seems quite reasonable to me (and in fact matches my usage), and this extension is commonly supported by many other databases so we are not really breaking new ground here. I've been working on a patch to add this support and hope to post it soon for discussion, once I've seen how the testing goes.
          Hide
          Christian d'Heureuse added a comment -

          Bryan, thanks for planning to implement the INSERT ... SELECT ... ORDER BY.
          An alternative would be to implement a ROW_NUMBER ranking function, which would be more conforming to the SQL standard:

          SELECT ROW_NUMBER() OVER (ORDER BY s) as i, s FROM temp1;

          Show
          Christian d'Heureuse added a comment - Bryan, thanks for planning to implement the INSERT ... SELECT ... ORDER BY. An alternative would be to implement a ROW_NUMBER ranking function, which would be more conforming to the SQL standard: SELECT ROW_NUMBER() OVER (ORDER BY s) as i, s FROM temp1;
          Hide
          Daniel John Debrunner added a comment -

          Would the use of ORDER BY guarantee anything or is it just something that might work?

          I don't think it can guarantee contiguous values since other INSERTS can occur from other transactions that would use generated identity values.

          Show
          Daniel John Debrunner added a comment - Would the use of ORDER BY guarantee anything or is it just something that might work? I don't think it can guarantee contiguous values since other INSERTS can occur from other transactions that would use generated identity values.
          Hide
          Bryan Pendleton added a comment -

          Thanks Christian and Dan for the feedback. I agree about the contiguous
          values, Dan. The guarantee that I think it can make is that the values are
          monotonically increasing; that is, that the rows that are inserted, if later
          sorted by their auto-generated field, will be in the same order as they came
          from the SELECT. So it's preserving the order, which is all that is needed
          for my application (I'm using a "temporary" table for supporting paging in a UI,
          and wish to preserve the the order of the rows as they arrive in that table.)

          Christian, regarding the ranking functions, I'm quite interested in those, and
          I think we should log them as an enhancement request and pursue them.
          Perhaps you have already done so? If not I think that would be great.
          However, in my particular case I don't have control over the SQL that is
          being emitted; I'm trying to get Derby to work with a tool written for another DBMS.

          Show
          Bryan Pendleton added a comment - Thanks Christian and Dan for the feedback. I agree about the contiguous values, Dan. The guarantee that I think it can make is that the values are monotonically increasing; that is, that the rows that are inserted, if later sorted by their auto-generated field, will be in the same order as they came from the SELECT. So it's preserving the order, which is all that is needed for my application (I'm using a "temporary" table for supporting paging in a UI, and wish to preserve the the order of the rows as they arrive in that table.) Christian, regarding the ranking functions, I'm quite interested in those, and I think we should log them as an enhancement request and pursue them. Perhaps you have already done so? If not I think that would be great. However, in my particular case I don't have control over the SQL that is being emitted; I'm trying to get Derby to work with a tool written for another DBMS.
          Hide
          Christian d'Heureuse added a comment -

          Yes, there is already a JIRA entry for "SQL pagination" with the ROW_NUMBER ranking function: DERBY-581

          Show
          Christian d'Heureuse added a comment - Yes, there is already a JIRA entry for "SQL pagination" with the ROW_NUMBER ranking function: DERBY-581
          Hide
          Bernt M. Johnsen added a comment -

          I do not agree that we should implement this "feature". First of all, it is not defined in the standard, and secondly it makes no sense wrt. SQL as a language. SQL is basically a set language where the rows have no order. The ORDER BY clause is used to order the rows when they are retrieved from the database.

          Although the following SQL:
          INSERT INTO T VALUES(1);
          INSERT INTO T VALUES(2);
          SELECT * FROM T;

          In most databases will give you the result
          1
          2
          But I've seen otherwise, and that's completely compliant with the standard. I would say that any application based on the assumption that the INSERT order has any influence on the result from a SELECT statement withou an ORDER BY clause is buggy, so the tool Bryan is referring to should be fixed.

          -1 To this "feature".

          Show
          Bernt M. Johnsen added a comment - I do not agree that we should implement this "feature". First of all, it is not defined in the standard, and secondly it makes no sense wrt. SQL as a language. SQL is basically a set language where the rows have no order. The ORDER BY clause is used to order the rows when they are retrieved from the database. Although the following SQL: INSERT INTO T VALUES(1); INSERT INTO T VALUES(2); SELECT * FROM T; In most databases will give you the result 1 2 But I've seen otherwise, and that's completely compliant with the standard. I would say that any application based on the assumption that the INSERT order has any influence on the result from a SELECT statement withou an ORDER BY clause is buggy, so the tool Bryan is referring to should be fixed. -1 To this "feature".
          Hide
          Bryan Pendleton added a comment -

          Hi Bernt, thanks for the feedback. I believe I understand your concerns.

          I am not proposing that the ORDER BY should have any affect on
          subsequent SELECT statements. Such a behavior would be indeed wrong.

          I am only proposing that the ORDER BY should affect the SELECT that is
          part of the INSERT statement itself; that is, that it should affect the order in
          which the INSERT statement processes the rows that the SELECT selects.

          Note that in Christian's example (and in the tool I'm trying to support), there
          is an auto-generated column. The database will automatically generate
          new values for this column as the rows are inserted.

          The point of this feature is to allow the user to ensure that, *if the rows are
          later selected and ordered by the auto-generated column*, then they will
          appear in the same order as they appeared during the INSERT.

          Thus subsequent SELECT statements do have an ORDER BY, but it
          is an ORDER BY on the autogenerated column, not on the original columns
          of the source data for the INSERT ... SELECT ... ORDER BY.

          Does this explanation make more sense?

          Show
          Bryan Pendleton added a comment - Hi Bernt, thanks for the feedback. I believe I understand your concerns. I am not proposing that the ORDER BY should have any affect on subsequent SELECT statements. Such a behavior would be indeed wrong. I am only proposing that the ORDER BY should affect the SELECT that is part of the INSERT statement itself; that is, that it should affect the order in which the INSERT statement processes the rows that the SELECT selects. Note that in Christian's example (and in the tool I'm trying to support), there is an auto-generated column. The database will automatically generate new values for this column as the rows are inserted. The point of this feature is to allow the user to ensure that, *if the rows are later selected and ordered by the auto-generated column*, then they will appear in the same order as they appeared during the INSERT. Thus subsequent SELECT statements do have an ORDER BY, but it is an ORDER BY on the autogenerated column, not on the original columns of the source data for the INSERT ... SELECT ... ORDER BY. Does this explanation make more sense?
          Hide
          Bryan Pendleton added a comment -

          Attached is 'insertOrderBy.diff', a patch containing code changes,
          but no new tests. This patch is not intended for commit, just as
          a way to give us some concrete code to talk about.

          The patch adds a line to the parser to enable INSERT ... SELECT .. ORDER BY
          and adds a bit of code to InsertNode to handle the OrderByList.

          The code in InsertNode mimics the implementation in CursorNode;
          perhaps there is some way to arrange for these two nodes to
          share this code?

          With the patch, derbyall and suites.All run clean, so I've made a first
          attempt at verifying that this change doesn't break existing behaviors.
          Also, I verified that a similar change to 10.2 can be made, if desired,
          and also adds the new behavior without (obviously) breaking existing ones.

          I'd love to keep the discussion going: either more general comments
          on the underlying concepts of the feature, or specific comments about
          the possible implementation in code. Thanks to all who have participated so far!

          Show
          Bryan Pendleton added a comment - Attached is 'insertOrderBy.diff', a patch containing code changes, but no new tests. This patch is not intended for commit, just as a way to give us some concrete code to talk about. The patch adds a line to the parser to enable INSERT ... SELECT .. ORDER BY and adds a bit of code to InsertNode to handle the OrderByList. The code in InsertNode mimics the implementation in CursorNode; perhaps there is some way to arrange for these two nodes to share this code? With the patch, derbyall and suites.All run clean, so I've made a first attempt at verifying that this change doesn't break existing behaviors. Also, I verified that a similar change to 10.2 can be made, if desired, and also adds the new behavior without (obviously) breaking existing ones. I'd love to keep the discussion going: either more general comments on the underlying concepts of the feature, or specific comments about the possible implementation in code. Thanks to all who have participated so far!
          Hide
          Daniel John Debrunner added a comment -

          > The point of this feature is to allow the user to ensure that, *if the rows are
          > later selected and ordered by the auto-generated column*, then they will
          > appear in the same order as they appeared during the INSERT.

          Once/if Derby supports the cycle option of identity columns (CYCLE) then this would no longer be guaranteed,
          so any documentation should clearly indicated the restricted cases where such a guarantee would apply.
          (assuming of course the feature is accepted)

          Show
          Daniel John Debrunner added a comment - > The point of this feature is to allow the user to ensure that, *if the rows are > later selected and ordered by the auto-generated column*, then they will > appear in the same order as they appeared during the INSERT. Once/if Derby supports the cycle option of identity columns (CYCLE) then this would no longer be guaranteed, so any documentation should clearly indicated the restricted cases where such a guarantee would apply. (assuming of course the feature is accepted)
          Hide
          Bernt M. Johnsen added a comment -

          Hi Bryan. I see your point and the need wrt. the generated columns. I have tried to dive into the SQL 2003 standard to see if this extension violates any SQL semantics, and I admit I can't find any (although the construct is a bit "un-SQL-like", but so are many constructs other SQL databases ). But it will restrain us from some optimizations in the future, e.g. if the result from the subquery is very large, it could be partitioned and the partitions inserted in parallell into the base table (maybe far fetched?).

          I see from the patch that you extend the insert statement and not generally the subquery. That is good. Is it an idea also to restrict the use of the order by clause in the insert statement to base tables with generated columns?

          Show
          Bernt M. Johnsen added a comment - Hi Bryan. I see your point and the need wrt. the generated columns. I have tried to dive into the SQL 2003 standard to see if this extension violates any SQL semantics, and I admit I can't find any (although the construct is a bit "un-SQL-like", but so are many constructs other SQL databases ). But it will restrain us from some optimizations in the future, e.g. if the result from the subquery is very large, it could be partitioned and the partitions inserted in parallell into the base table (maybe far fetched?). I see from the patch that you extend the insert statement and not generally the subquery. That is good. Is it an idea also to restrict the use of the order by clause in the insert statement to base tables with generated columns?
          Hide
          Knut Anders Hatlen added a comment -

          If the temp1 table mentioned in the description had a unique constraint on s, and temp2.i were declared as generated by default as identity (instead of always as identity), the following query would be pretty close to what you want:

          insert into temp2 select count(b.s), a.s from temp1 a, temp1 b where b.s <= a.s group by a.s

          It's not a perfect match, but it would at least give the identity column the same ordering as the s column. Some extra tweaking is needed if temp2 is not empty when the insert is executed.

          Show
          Knut Anders Hatlen added a comment - If the temp1 table mentioned in the description had a unique constraint on s, and temp2.i were declared as generated by default as identity (instead of always as identity), the following query would be pretty close to what you want: insert into temp2 select count(b.s), a.s from temp1 a, temp1 b where b.s <= a.s group by a.s It's not a perfect match, but it would at least give the identity column the same ordering as the s column. Some extra tweaking is needed if temp2 is not empty when the insert is executed.
          Hide
          Christian d'Heureuse added a comment -

          > But it will restrain us from some optimizations in the future,

          I don't agree. An INSERT ... SELECT without an "ORDER BY" can still be optimized with parallel processing in the future. Only the new "INSERT ... SELECT ... ORDER BY" cannot be processed in parallel.

          > insert into temp2 select count(b.s), a.s from temp1 a, temp1 b where b.s <= a.s group by a.s

          This would be an alternative, but it's very slow for large tables, because the complexity is in the order of O(n*(n/2)).

          Show
          Christian d'Heureuse added a comment - > But it will restrain us from some optimizations in the future, I don't agree. An INSERT ... SELECT without an "ORDER BY" can still be optimized with parallel processing in the future. Only the new "INSERT ... SELECT ... ORDER BY" cannot be processed in parallel. > insert into temp2 select count(b.s), a.s from temp1 a, temp1 b where b.s <= a.s group by a.s This would be an alternative, but it's very slow for large tables, because the complexity is in the order of O(n*(n/2)).
          Hide
          Bryan Pendleton added a comment -

          Thanks to all for the continued feedback!

          Attached is a second version of the proposed patch. Again, this is
          not for commit, just to continue to stimulate the discussion. This
          patch has two changes from the previous patch:
          1) An error message is added to InsertNode's initializer to reject
          the use of ORDER BY for INSERT ... VALUES statements
          2) SingleChildResultSetNode is changed to override the
          pushOrderByList() method and to pass the orderByList on to its
          childResult. This is important when there is a NormalizeResultSetNode,
          as occurs when the number of columns being SELECTed does not
          match the number of columns being INSERTed. This situation arises
          when using auto-generated columns.

          Unfortunately, the original script from Christian is still not working
          correctly with my patch. The ORDER BY list is properly passed to
          the optimizer, and an OrderBy node is constructed to cause the
          results to be generated in sorted order, but something goes wrong
          and the final result is not properly sorted.

          Examples without auto-generated columns seem to be working fine.

          I'll continue working on this problem to figure out why the sort isn't working.

          Also attached is 'samples.ij', which contains some SQL samples
          that others might find interesting when experimenting with the patch,
          and which I hope to someday turn into full-fledged test cases.

          As I said above, none of this is for commit; I'm just trying to keep moving
          the discussion and the implementation along and share the work
          that I've done so far.

          Show
          Bryan Pendleton added a comment - Thanks to all for the continued feedback! Attached is a second version of the proposed patch. Again, this is not for commit, just to continue to stimulate the discussion. This patch has two changes from the previous patch: 1) An error message is added to InsertNode's initializer to reject the use of ORDER BY for INSERT ... VALUES statements 2) SingleChildResultSetNode is changed to override the pushOrderByList() method and to pass the orderByList on to its childResult. This is important when there is a NormalizeResultSetNode, as occurs when the number of columns being SELECTed does not match the number of columns being INSERTed. This situation arises when using auto-generated columns. Unfortunately, the original script from Christian is still not working correctly with my patch. The ORDER BY list is properly passed to the optimizer, and an OrderBy node is constructed to cause the results to be generated in sorted order, but something goes wrong and the final result is not properly sorted. Examples without auto-generated columns seem to be working fine. I'll continue working on this problem to figure out why the sort isn't working. Also attached is 'samples.ij', which contains some SQL samples that others might find interesting when experimenting with the patch, and which I hope to someday turn into full-fledged test cases. As I said above, none of this is for commit; I'm just trying to keep moving the discussion and the implementation along and share the work that I've done so far.
          Hide
          A B added a comment -

          > 1) An error message is added to InsertNode's initializer to reject
          > the use of ORDER BY for INSERT ... VALUES statements

          Is there a technical reason for this restriction or is this just an incremental development and/or what-you're-willing-to-work-on-right-now restriction (which would be fine)?

          I scanned through the comments for this issue and didn't see anything explicit about why a SELECT ... VALUES ... ORDER BY statement should throw an error, so I'm just wondering...

          Show
          A B added a comment - > 1) An error message is added to InsertNode's initializer to reject > the use of ORDER BY for INSERT ... VALUES statements Is there a technical reason for this restriction or is this just an incremental development and/or what-you're-willing-to-work-on-right-now restriction (which would be fine)? I scanned through the comments for this issue and didn't see anything explicit about why a SELECT ... VALUES ... ORDER BY statement should throw an error, so I'm just wondering...
          Hide
          Bryan Pendleton added a comment -

          Hi Army, thanks for the question!

          No, there was no technical reason for the restriction.

          I think that ORDER BY clauses for VALUES expressions need
          to order by a column position. That is:
          VALUES 'b17', 'b1', 'b52' ORDER BY 1
          makes sense, as does
          VALUES ('ham', 50), ('eggs', 100), ('spam', 20) ORDER BY 2
          but
          VALUES 'x', 'a', 'm' ORDER BY c1
          does not make sense, since columns in VALUES expressions
          don't have names.

          I'll re-visit the VALUES restriction and see how to make it work.

          Meanwhile, I discovered a major problem in my previous patch; I had
          inserted the bind processing for order by columns into the
          InsertNode.bind() method in the wrong place, so it was binding the
          columns incorrectly in the case(s) where NormalizeResultSetNode
          instances were generated.

          I'm working on a 3rd version of the patch, to incorporate all this
          new knowledge

          Show
          Bryan Pendleton added a comment - Hi Army, thanks for the question! No, there was no technical reason for the restriction. I think that ORDER BY clauses for VALUES expressions need to order by a column position. That is: VALUES 'b17', 'b1', 'b52' ORDER BY 1 makes sense, as does VALUES ('ham', 50), ('eggs', 100), ('spam', 20) ORDER BY 2 but VALUES 'x', 'a', 'm' ORDER BY c1 does not make sense, since columns in VALUES expressions don't have names. I'll re-visit the VALUES restriction and see how to make it work. Meanwhile, I discovered a major problem in my previous patch; I had inserted the bind processing for order by columns into the InsertNode.bind() method in the wrong place, so it was binding the columns incorrectly in the case(s) where NormalizeResultSetNode instances were generated. I'm working on a 3rd version of the patch, to incorporate all this new knowledge
          Hide
          Mike Matrigali added a comment -

          Does Derby today guarantee anything other than uniqueness from "generated always" fields? The current implementation
          happens to return increasing keys, but do we ever guarantee that, and should we? Will it limit the ability in the future to
          optimize concurrency with respect to generating keys. It may be easier to guarantee increasing keys within a single statement, but not across threads inserting into the same table.

          An example of a cross thread optimization would be the following. Today we allocate a chunk of keys and cache them in
          one place in the system. That place is a bottleneck for all concurrent inserters. To completely alleviate the bottleneck we
          could allocate a group of keys for each thread doing inserts, then they would only bottleneck when they had used up the
          group of keys. And if we didn't want to "lose" key space we might reuse the keys from the group that weren't exhausted which
          then for a single thread might result in keys not incrementing any more. I believe there are some issues reported in jira about concurrency problems with generated keys, so work in this area may be required in the future.

          Show
          Mike Matrigali added a comment - Does Derby today guarantee anything other than uniqueness from "generated always" fields? The current implementation happens to return increasing keys, but do we ever guarantee that, and should we? Will it limit the ability in the future to optimize concurrency with respect to generating keys. It may be easier to guarantee increasing keys within a single statement, but not across threads inserting into the same table. An example of a cross thread optimization would be the following. Today we allocate a chunk of keys and cache them in one place in the system. That place is a bottleneck for all concurrent inserters. To completely alleviate the bottleneck we could allocate a group of keys for each thread doing inserts, then they would only bottleneck when they had used up the group of keys. And if we didn't want to "lose" key space we might reuse the keys from the group that weren't exhausted which then for a single thread might result in keys not incrementing any more. I believe there are some issues reported in jira about concurrency problems with generated keys, so work in this area may be required in the future.
          Hide
          Bryan Pendleton added a comment -

          Mike asked: "The current implementation happens to return increasing keys, but do we ever guarantee that, and should we?"

          I believe the answer is that we guarantee to obey the START WITH and
          INCREMENT BY attributes of the generated column At:
          http://db.apache.org/derby/docs/dev/ref/rrefsqlj37836.html#rrefsqlj37836
          we say "For SMALLINT, INT, and BIGINT columns with identity attributes, Derby automatically assigns increasing integer values to the column."

          Later, we also say: "And if you specify a negative number for the increment value, Derby decrements the value with each insert. If this value is positive, Derby increments the value with each insert. "

          It's hard to see how to interpret the "START WITH" and "INCREMENT BY"
          attributes of the generated column spec without providing such guarantees.
          INCREMENT BY 1 needs to increment by 1.

          I agree with you that relaxing these guarantees could allow higher-performing
          implementations in the future.

          But I think that many applications are already depending on the current behavior of the
          START WITH and INCREMENT BY properties.

          Show
          Bryan Pendleton added a comment - Mike asked: "The current implementation happens to return increasing keys, but do we ever guarantee that, and should we?" I believe the answer is that we guarantee to obey the START WITH and INCREMENT BY attributes of the generated column At: http://db.apache.org/derby/docs/dev/ref/rrefsqlj37836.html#rrefsqlj37836 we say "For SMALLINT, INT, and BIGINT columns with identity attributes, Derby automatically assigns increasing integer values to the column." Later, we also say: "And if you specify a negative number for the increment value, Derby decrements the value with each insert. If this value is positive, Derby increments the value with each insert. " It's hard to see how to interpret the "START WITH" and "INCREMENT BY" attributes of the generated column spec without providing such guarantees. INCREMENT BY 1 needs to increment by 1. I agree with you that relaxing these guarantees could allow higher-performing implementations in the future. But I think that many applications are already depending on the current behavior of the START WITH and INCREMENT BY properties.
          Hide
          Bernt M. Johnsen added a comment -

          Bryan wrote:
          > It's hard to see how to interpret the "START WITH" and "INCREMENT BY"
          > attributes of the generated column spec without providing such guarantees.
          > INCREMENT BY 1 needs to increment by 1.

          Actually, the SQL standard does not require that we increment by 1 but
          by N*1 where N is an non-negative integer. See Ch 9.21 which says that
          the next value calculated like this:

          "If there exists a non-negative integer N such that SMIN <= CBV + N

          • INC <= SMAX and the value (CBV + N * INC) has not already been
            returned in the current cycle, then let V1 be (CBV + N * INC)."

          So any value N * INC between the current base value (CBV) and the
          limit (SMAX (or SMIN if the generator is decreasing)) is legal and
          holes may not be reclaimed unless the base value is altered
          explicitely (that is for non-cyclic generators, which is what Derby
          supports).

          No well behaved application should depend on the increment actually to
          be 1 every time (although it will be in most cases).

          Show
          Bernt M. Johnsen added a comment - Bryan wrote: > It's hard to see how to interpret the "START WITH" and "INCREMENT BY" > attributes of the generated column spec without providing such guarantees. > INCREMENT BY 1 needs to increment by 1. Actually, the SQL standard does not require that we increment by 1 but by N*1 where N is an non-negative integer. See Ch 9.21 which says that the next value calculated like this: "If there exists a non-negative integer N such that SMIN <= CBV + N INC <= SMAX and the value (CBV + N * INC) has not already been returned in the current cycle, then let V1 be (CBV + N * INC)." So any value N * INC between the current base value (CBV) and the limit (SMAX (or SMIN if the generator is decreasing)) is legal and holes may not be reclaimed unless the base value is altered explicitely (that is for non-cyclic generators, which is what Derby supports). No well behaved application should depend on the increment actually to be 1 every time (although it will be in most cases).
          Hide
          Bryan Pendleton added a comment -

          It turns out that I had a major misunderstanding of the control
          flow of INSERT ... SELECT and how it interacts with generated
          identity columns.

          Consider the original example from this issue:

          create table temp1(s varchar(10));
          insert into temp1 values 'x','a','c','b','a';
          create table temp2(i integer not null generated always as identity,
          s varchar(10));
          insert into temp2(s) select s from temp1 order by s;

          An INSERT ... SELECT statement has 3 basic stages of processing:

          • the SELECT statement retrieves rows of data
          • each retrieved row is transformed into the form needed by the
            target table
          • each row is then inserted into the target table

          From a QueryTreeNode point of view, there is an InsertNode, which
          points to a NormalizeResultSetNode, which in turn points either to
          a UnionNode (in the case of INSERT ... VALUES ...), or to a
          ProjectRestrictNode/SelectNode tree (in the case of INSERT ... SELECT ...).

          My assumption when I started investigating this issue was that
          the values of the identity column were generated during the 3rd
          stage, at the time of insert into the target table, by processing
          that would occur in InsertResultSet.java

          Or, at least, that the identity values were generated during
          the 2nd stage, as part of projecting and normalizing the source
          data into the form needed by the target table, by processing that
          would occur in NormalizeResultSet.java.

          Unfortunately, after stepping through the code in detail, I've
          come to the conclusion that the identity column values are
          actually generated during the first stage, when the rows of
          data are retrieved by the SELECT and are being fed into the sorter.

          Thus, the identity column's values are generated prior to the
          sort, and hence there is no guarantee that the order of the
          values bears any relationship to the ORDER BY clause.

          At the instant when the new value of the identity column is
          being generated for the row that we're processing, the stack looks
          like this:
          DataDictionaryImpl.getSetAutoincrementValue
          InsertResultSet.getSetAutoincrementValue
          BaseActivation.getSetAutoincrementValue
          <generated bytecode method "e1">
          ProjectRestrictResultSet.doProjection
          ProjectRestrictResultSet.getNextRowCore
          SortResultSet.getRowFromResultSet
          SortResultSet.getNextRowFromRS
          SortResultSet.loadSorter
          SortResultSet.openCore
          NormalizeResultSet.openCore
          InsertResultSet.open

          Note that we are still loading the sorter at this point; the sort
          has not yet been performed.

          My conclusion at this point is that I would have to make some much
          more substantial changes to INSERT ... SELECT in order to get it
          to perform the identity column generation after the ORDER BY
          processing had occurred.

          I'm attaching an updated version of the patch that I was working with,
          and an updated version of the sample SQL statements that I was
          trying to run. The patched code has some interesting behaviors; it
          definitely performs the ORDER BY processing faithfully.

          It just doesn't provide the identity column guarantees that
          I was interested in exploring.

          Show
          Bryan Pendleton added a comment - It turns out that I had a major misunderstanding of the control flow of INSERT ... SELECT and how it interacts with generated identity columns. Consider the original example from this issue: create table temp1(s varchar(10)); insert into temp1 values 'x','a','c','b','a'; create table temp2(i integer not null generated always as identity, s varchar(10)); insert into temp2(s) select s from temp1 order by s; An INSERT ... SELECT statement has 3 basic stages of processing: the SELECT statement retrieves rows of data each retrieved row is transformed into the form needed by the target table each row is then inserted into the target table From a QueryTreeNode point of view, there is an InsertNode, which points to a NormalizeResultSetNode, which in turn points either to a UnionNode (in the case of INSERT ... VALUES ...), or to a ProjectRestrictNode/SelectNode tree (in the case of INSERT ... SELECT ...). My assumption when I started investigating this issue was that the values of the identity column were generated during the 3rd stage, at the time of insert into the target table, by processing that would occur in InsertResultSet.java Or, at least, that the identity values were generated during the 2nd stage, as part of projecting and normalizing the source data into the form needed by the target table, by processing that would occur in NormalizeResultSet.java. Unfortunately, after stepping through the code in detail, I've come to the conclusion that the identity column values are actually generated during the first stage, when the rows of data are retrieved by the SELECT and are being fed into the sorter. Thus, the identity column's values are generated prior to the sort, and hence there is no guarantee that the order of the values bears any relationship to the ORDER BY clause. At the instant when the new value of the identity column is being generated for the row that we're processing, the stack looks like this: DataDictionaryImpl.getSetAutoincrementValue InsertResultSet.getSetAutoincrementValue BaseActivation.getSetAutoincrementValue <generated bytecode method "e1"> ProjectRestrictResultSet.doProjection ProjectRestrictResultSet.getNextRowCore SortResultSet.getRowFromResultSet SortResultSet.getNextRowFromRS SortResultSet.loadSorter SortResultSet.openCore NormalizeResultSet.openCore InsertResultSet.open Note that we are still loading the sorter at this point; the sort has not yet been performed. My conclusion at this point is that I would have to make some much more substantial changes to INSERT ... SELECT in order to get it to perform the identity column generation after the ORDER BY processing had occurred. I'm attaching an updated version of the patch that I was working with, and an updated version of the sample SQL statements that I was trying to run. The patched code has some interesting behaviors; it definitely performs the ORDER BY processing faithfully. It just doesn't provide the identity column guarantees that I was interested in exploring.
          Hide
          Daniel John Debrunner added a comment -

          I think your latest findings also relate to DERBY-3, gaps in the identity sequence.

          BTW - Bernt raised a veto (-1) against this issue, that stands until he rescinds it.

          Show
          Daniel John Debrunner added a comment - I think your latest findings also relate to DERBY-3 , gaps in the identity sequence. BTW - Bernt raised a veto (-1) against this issue, that stands until he rescinds it.
          Hide
          Bryan Pendleton added a comment -

          I'm not actively working on this anymore, although it was a very interesting learning experience and I'm quite grateful for all the comments and suggestions!

          Hopefully this discussion will be useful to those who study this area further in the future.

          Show
          Bryan Pendleton added a comment - I'm not actively working on this anymore, although it was a very interesting learning experience and I'm quite grateful for all the comments and suggestions! Hopefully this discussion will be useful to those who study this area further in the future.
          Hide
          Bernt M. Johnsen added a comment -

          Consider my veto temporary. When implementing a standard (in this case ISO 9075) one should stick to the standard. If we decide to violate and/or extend the standard, we must be very sure that it won't create future trouble for us. Such violations/extensions of standards have a nasty habit of backfiring at some later stage (I can point you to numerous examples of that). If we are convinced that this extension of the standard is without such potential problems, I think the feature should be implemented (if someone feels the itch, of course).

          Show
          Bernt M. Johnsen added a comment - Consider my veto temporary. When implementing a standard (in this case ISO 9075) one should stick to the standard. If we decide to violate and/or extend the standard, we must be very sure that it won't create future trouble for us. Such violations/extensions of standards have a nasty habit of backfiring at some later stage (I can point you to numerous examples of that). If we are convinced that this extension of the standard is without such potential problems, I think the feature should be implemented (if someone feels the itch, of course).
          Hide
          Bernt M. Johnsen added a comment -

          Anyway, as Christian metions, implementing ROW_NUMBER() is an alternate solution, which also will solve DERBY-581. Thus both issues may be solved with one feature which also happen to be defined in the SQL standard. I think that is a much more viable approach.

          Show
          Bernt M. Johnsen added a comment - Anyway, as Christian metions, implementing ROW_NUMBER() is an alternate solution, which also will solve DERBY-581 . Thus both issues may be solved with one feature which also happen to be defined in the SQL standard. I think that is a much more viable approach.
          Hide
          Bryan Pendleton added a comment -

          As Christian noted, one way for users to work around this limitation is to use
          a stored procedure. In my experiments, I found that a procedure such as the
          following can work acceptably:

          public static final void insertSelectOrderBy(String iString, String sString) throws SQLException
          {
          Connection conn = DriverManager.getConnection("jdbc:default:connection");
          PreparedStatement iStmt = conn.prepareStatement(iString);
          Statement s = conn.createStatement();
          ResultSet rs = s.executeQuery(sString);
          int numParms = iStmt.getParameterMetaData().getParameterCount();
          while (rs.next())

          { for (int c = 1; c <= numParms; c++) iStmt.setObject(c, rs.getObject(c)); iStmt.executeUpdate(); }

          rs.close(); s.close(); iStmt.close(); conn.close();
          }

          I'm posting this possible workaround in the hopes that others may find it useful.

          Show
          Bryan Pendleton added a comment - As Christian noted, one way for users to work around this limitation is to use a stored procedure. In my experiments, I found that a procedure such as the following can work acceptably: public static final void insertSelectOrderBy(String iString, String sString) throws SQLException { Connection conn = DriverManager.getConnection("jdbc:default:connection"); PreparedStatement iStmt = conn.prepareStatement(iString); Statement s = conn.createStatement(); ResultSet rs = s.executeQuery(sString); int numParms = iStmt.getParameterMetaData().getParameterCount(); while (rs.next()) { for (int c = 1; c <= numParms; c++) iStmt.setObject(c, rs.getObject(c)); iStmt.executeUpdate(); } rs.close(); s.close(); iStmt.close(); conn.close(); } I'm posting this possible workaround in the hopes that others may find it useful.
          Hide
          Knut Anders Hatlen added a comment -

          It might be worth checking if SQL 2008 allows ORDER BY in INSERT INTO
          ... SELECT FROM ... . Dag wrote the following in DERBY-4079:

          > Note that both ORDER BY and the new clauses above are allowed also in
          > subqueries in the new version of the SQL standard (section 7.13). I
          > only propose to include this at the top level in DERBY for now. (ORDER
          > BY is presently also not allowed in subqueries in Derby since SQL
          > didn't allow for this until SQL 2008 either).

          Show
          Knut Anders Hatlen added a comment - It might be worth checking if SQL 2008 allows ORDER BY in INSERT INTO ... SELECT FROM ... . Dag wrote the following in DERBY-4079 : > Note that both ORDER BY and the new clauses above are allowed also in > subqueries in the new version of the SQL standard (section 7.13). I > only propose to include this at the top level in DERBY for now. (ORDER > BY is presently also not allowed in subqueries in Derby since SQL > didn't allow for this until SQL 2008 either).
          Hide
          Dag H. Wanvik added a comment -

          Knut, its seems it does, cf. these productions (quote):

          <query expression> ::=
          [ <with clause> ] <query expression body>
          [ <order by clause> ] [ <result offset clause> ] [ <fetch first clause> ]

          <insert statement> ::=
          INSERT INTO <insertion target> <insert columns and source>

          <insert columns and source> ::=
          <from subquery>

          <from constructor>
          <from default>

          <from subquery> ::=
          [ <left paren> <insert column list> <right paren> ]
          [ <override clause> ]
          <query expression>

          This is just the syntax, however, but I haven't found any provision that would bar it.

          Show
          Dag H. Wanvik added a comment - Knut, its seems it does, cf. these productions (quote): <query expression> ::= [ <with clause> ] <query expression body> [ <order by clause> ] [ <result offset clause> ] [ <fetch first clause> ] <insert statement> ::= INSERT INTO <insertion target> <insert columns and source> <insert columns and source> ::= <from subquery> <from constructor> <from default> <from subquery> ::= [ <left paren> <insert column list> <right paren> ] [ <override clause> ] <query expression> This is just the syntax, however, but I haven't found any provision that would bar it.
          Hide
          Dag H. Wanvik added a comment -

          Please note that INSERT INTO ... VALUES ... ORDER BY is not allowed by the 2008 standard, cf
          the explanation given here (DERBY-4413):

          https://issues.apache.org/jira/browse/DERBY-4413?focusedCommentId=12769492&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12769492

          Show
          Dag H. Wanvik added a comment - Please note that INSERT INTO ... VALUES ... ORDER BY is not allowed by the 2008 standard, cf the explanation given here ( DERBY-4413 ): https://issues.apache.org/jira/browse/DERBY-4413?focusedCommentId=12769492&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12769492
          Hide
          Christian d'Heureuse added a comment -

          Hi Dag,

          Do you mean by "not allowed" that "INSERT INTO ... ORDER BY" is not conforming to the standard, or is it explicitly "not allowed"?

          Is it only not conforming with the VALUES clause, or is "INSERT INTO ... SELECT ... ORDER BY" also not conforming?

          Show
          Christian d'Heureuse added a comment - Hi Dag, Do you mean by "not allowed" that "INSERT INTO ... ORDER BY" is not conforming to the standard, or is it explicitly "not allowed"? Is it only not conforming with the VALUES clause, or is "INSERT INTO ... SELECT ... ORDER BY" also not conforming?
          Hide
          Dag H. Wanvik added a comment -

          Christian, it's only the VALUES .. ORDER BY that is (implicitly by the syntax rules) not conforming.
          It does not say explicitly that is is not allowed, so it would be an extension if we implemented it.
          A SELECT .. ORDER BY is allowed as far as I can see.

          Show
          Dag H. Wanvik added a comment - Christian, it's only the VALUES .. ORDER BY that is (implicitly by the syntax rules) not conforming. It does not say explicitly that is is not allowed, so it would be an extension if we implemented it. A SELECT .. ORDER BY is allowed as far as I can see.
          Hide
          Dag H. Wanvik added a comment -

          My interest in warming this up again is DERBY-4397.
          I plan to use a modified version of this patch as part of that work.

          This patch is thus just a baseline of my work on this "subproblem" of
          INSERT combined with ORDER BY and is not intended for commit. I just
          post it here for the record. It passes regression tests, and besides
          whats discussed below it seems to work for what I have thrown at it.

          This patch builds on Bryan's latest patch proposal for DERBY-4
          (thanks!), and adds a JUnit test. It does not solve the problem
          ("early" evaluation of identity columns, i.e. they are evaluated
          before the ordering) which Bryan observed. I think the correct
          solution [1] is to defer all default assignments (identity, plain
          DEFAULT, generated columns) till after the result set is ready, as is
          currently done for generated columns. If a target column list is given
          in the INSERT, the result set would then be "narrow" (i.e. not
          containing unspecified columns) until after the ordering. A top PRN
          would be used to inject the missing columns with their (default)
          values. This would solve DERBY-4.

          Meanwhile, in preparation for that we should forbid ORDER BY on a
          VALUES clause if it contains DEFAULT values (as a minimum, since
          logically they have no value until after ordering; and physically, in
          the case of generated columns, they are null and can't be used for
          sorting), and probably also forbid ORDER BY on VALUES at all, as per
          the standard (see explanation on why in DERBY-4413). As it stands,
          this patch does the latter.

          Also, DEFAULT should be allowed only in simple VALUES clause, see
          DERBY-4426, ensuring it could never be the subject of sorting (e.g. in
          a UNION).

          [1] See discussion on DERBY-4413.

          Patch details:

          M java/engine/org/apache/derby/impl/sql/compile/SingleChildResultSetNode.java

          Added pushOrderByList, InsertNode uses analogue to pattern used by CursorNode.

          M java/engine/org/apache/derby/impl/sql/compile/NormalizeResultSetNode.java

          Added bindExpressions.

          M java/engine/org/apache/derby/impl/sql/compile/ResultColumn.java

          Removed a Sanity assert for a case and added a comment for when the
          scenario can happen.

          M java/engine/org/apache/derby/impl/sql/compile/CursorNode.java
          M java/engine/org/apache/derby/impl/sql/compile/OrderByList.java
          M java/engine/org/apache/derby/impl/sql/compile/IntersectOrExceptNode.java
          M java/engine/org/apache/derby/impl/sql/compile/OrderByColumn.java

          Added colMap to account for column subset/reordering which happens as
          a result of enhanceRCLForInsert called from InsertNode. Note that code
          was written prior to our realizing that a simple VALUES clause could
          not be followed by ORDER BY, so some code can probably be removed now
          if that decision stands.

          M java/engine/org/apache/derby/impl/sql/compile/InsertNode.java

          Add logic for OrderBy.

          M java/engine/org/apache/derby/impl/sql/compile/sqlgrammar.jj

          Special syntax allowance for ORDER BY in INSERT. Will be
          moved/generalized to subqueries for DERBY-4397. Added a check that
          ORDER BY can not be applied to a simple VALUES clause.

          A java/testing/org/apache/derbyTesting/functionTests/tests/lang/OrderByInSubqueries.java

          Also contains tests for VALUES .. ORDER BY, but commented out. If the
          check in sqlgrammar is removed these tests can be re-enabled and do
          work.

          M java/engine/org/apache/derby/impl/sql/compile/ResultSetNode.java
          M java/engine/org/apache/derby/impl/sql/compile/AlterTableNode.java
          M java/engine/org/apache/derby/impl/sql/compile/QueryTreeNode.java
          M java/engine/org/apache/derby/impl/sql/compile/ColumnDefinitionNode.java
          M java/engine/org/apache/derby/impl/sql/compile/DMLModStatementNode.java

          Code cleanups only.

          Show
          Dag H. Wanvik added a comment - My interest in warming this up again is DERBY-4397 . I plan to use a modified version of this patch as part of that work. This patch is thus just a baseline of my work on this "subproblem" of INSERT combined with ORDER BY and is not intended for commit. I just post it here for the record. It passes regression tests, and besides whats discussed below it seems to work for what I have thrown at it. This patch builds on Bryan's latest patch proposal for DERBY-4 (thanks!), and adds a JUnit test. It does not solve the problem ("early" evaluation of identity columns, i.e. they are evaluated before the ordering) which Bryan observed. I think the correct solution [1] is to defer all default assignments (identity, plain DEFAULT, generated columns) till after the result set is ready, as is currently done for generated columns. If a target column list is given in the INSERT, the result set would then be "narrow" (i.e. not containing unspecified columns) until after the ordering. A top PRN would be used to inject the missing columns with their (default) values. This would solve DERBY-4 . Meanwhile, in preparation for that we should forbid ORDER BY on a VALUES clause if it contains DEFAULT values (as a minimum, since logically they have no value until after ordering; and physically, in the case of generated columns, they are null and can't be used for sorting), and probably also forbid ORDER BY on VALUES at all, as per the standard (see explanation on why in DERBY-4413 ). As it stands, this patch does the latter. Also, DEFAULT should be allowed only in simple VALUES clause, see DERBY-4426 , ensuring it could never be the subject of sorting (e.g. in a UNION). [1] See discussion on DERBY-4413 . Patch details: M java/engine/org/apache/derby/impl/sql/compile/SingleChildResultSetNode.java Added pushOrderByList, InsertNode uses analogue to pattern used by CursorNode. M java/engine/org/apache/derby/impl/sql/compile/NormalizeResultSetNode.java Added bindExpressions. M java/engine/org/apache/derby/impl/sql/compile/ResultColumn.java Removed a Sanity assert for a case and added a comment for when the scenario can happen. M java/engine/org/apache/derby/impl/sql/compile/CursorNode.java M java/engine/org/apache/derby/impl/sql/compile/OrderByList.java M java/engine/org/apache/derby/impl/sql/compile/IntersectOrExceptNode.java M java/engine/org/apache/derby/impl/sql/compile/OrderByColumn.java Added colMap to account for column subset/reordering which happens as a result of enhanceRCLForInsert called from InsertNode. Note that code was written prior to our realizing that a simple VALUES clause could not be followed by ORDER BY, so some code can probably be removed now if that decision stands. M java/engine/org/apache/derby/impl/sql/compile/InsertNode.java Add logic for OrderBy. M java/engine/org/apache/derby/impl/sql/compile/sqlgrammar.jj Special syntax allowance for ORDER BY in INSERT. Will be moved/generalized to subqueries for DERBY-4397 . Added a check that ORDER BY can not be applied to a simple VALUES clause. A java/testing/org/apache/derbyTesting/functionTests/tests/lang/OrderByInSubqueries.java Also contains tests for VALUES .. ORDER BY, but commented out. If the check in sqlgrammar is removed these tests can be re-enabled and do work. M java/engine/org/apache/derby/impl/sql/compile/ResultSetNode.java M java/engine/org/apache/derby/impl/sql/compile/AlterTableNode.java M java/engine/org/apache/derby/impl/sql/compile/QueryTreeNode.java M java/engine/org/apache/derby/impl/sql/compile/ColumnDefinitionNode.java M java/engine/org/apache/derby/impl/sql/compile/DMLModStatementNode.java Code cleanups only.
          Hide
          Knut Anders Hatlen added a comment -

          Just to be clear, by the following

          > and probably also forbid ORDER BY on VALUES at all

          I take it you mean that it should be prohibited in an INSERT context,
          and not generally for all VALUES clauses?

          That is, this will be disallowed:

          INSERT INTO T VALUES ('a','b'),('b','a') ORDER BY 2

          Whereas this will still be allowed:

          VALUES ('a','b'),('b','a') ORDER BY 2

          If this is what you suggest, +1.

          > Also, DEFAULT should be allowed only in simple VALUES clause, see
          > DERBY-4426, ensuring it could never be the subject of sorting (e.g. in
          > a UNION).

          +1. I believe the current behaviour is just an unintended side-effect
          of the implementation overloading UnionNode to represent both proper
          UNIONs and multi-row VALUES clauses. Supporting this non-standard
          extension of the syntax doesn't make much sense if it makes it harder
          to get a consistent implementation of the syntax that the standard
          actually defines. Also, I cannot imagine that "INSERT INTO ... VALUES
          1 UNION VALUES DEFAULT" is widely used, if at all.

          Show
          Knut Anders Hatlen added a comment - Just to be clear, by the following > and probably also forbid ORDER BY on VALUES at all I take it you mean that it should be prohibited in an INSERT context, and not generally for all VALUES clauses? That is, this will be disallowed: INSERT INTO T VALUES ('a','b'),('b','a') ORDER BY 2 Whereas this will still be allowed: VALUES ('a','b'),('b','a') ORDER BY 2 If this is what you suggest, +1. > Also, DEFAULT should be allowed only in simple VALUES clause, see > DERBY-4426 , ensuring it could never be the subject of sorting (e.g. in > a UNION). +1. I believe the current behaviour is just an unintended side-effect of the implementation overloading UnionNode to represent both proper UNIONs and multi-row VALUES clauses. Supporting this non-standard extension of the syntax doesn't make much sense if it makes it harder to get a consistent implementation of the syntax that the standard actually defines. Also, I cannot imagine that "INSERT INTO ... VALUES 1 UNION VALUES DEFAULT" is widely used, if at all.
          Hide
          Dag H. Wanvik added a comment - - edited

          You understood me correctly, Knut, cf. the explanation in DERBY-4413.

          Show
          Dag H. Wanvik added a comment - - edited You understood me correctly, Knut, cf. the explanation in DERBY-4413 .
          Hide
          Knut Anders Hatlen added a comment -

          This issue was fixed as part of DERBY-4397, so closing it as a duplicate.

          Show
          Knut Anders Hatlen added a comment - This issue was fixed as part of DERBY-4397 , so closing it as a duplicate.
          Hide
          Knut Anders Hatlen added a comment -

          [bulk update] Close all resolved issues that haven't been updated for more than one year.

          Show
          Knut Anders Hatlen added a comment - [bulk update] Close all resolved issues that haven't been updated for more than one year.

            People

            • Assignee:
              Unassigned
              Reporter:
              Christian d'Heureuse
            • Votes:
              2 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development