Derby
  1. Derby
  2. DERBY-3

Identity column not filled consecutively with "insert ... select distinct"

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Duplicate
    • Affects Version/s: 10.2.1.6
    • Fix Version/s: None
    • Component/s: SQL
    • Labels:
      None
    • Environment:
      Derby Snapshot svnversion 46005
      or Cloudscape 10.0 GA
      Derby 10.2beta

      Description

      When an "insert ... select distinct" is used to fill a table with an identity column, gaps occur in the identity column. The caps correspond to the rows of the source table that have been suppressed with "distinct".

      Example:

      create table temp1 (
      s varchar(10));
      insert into temp1 values 'a','a','a','b','c','c','c','d';

      create table temp2 (
      i integer not null
      generated always as identity
      primary key,
      s varchar(10));

      insert into temp2 (s)
      select distinct s from temp1;
      select * from temp2;

      output:
      1 a
      4 b
      5 c
      8 d

        Issue Links

          Activity

          Hide
          Satheesh Bandaram added a comment -

          While it may be desirable not to have gaps in identity values, the current implementation does match required uniqueness.

          Show
          Satheesh Bandaram added a comment - While it may be desirable not to have gaps in identity values, the current implementation does match required uniqueness.
          Hide
          Kathey Marsden added a comment -

          Per Satheesh's comment in May this does not appear to be a bug

          Show
          Kathey Marsden added a comment - Per Satheesh's comment in May this does not appear to be a bug
          Hide
          Daniel John Debrunner added a comment -

          While I agree it's not a bug, I could see this as an improvement. It would be more natural to have consective values in this case.

          Show
          Daniel John Debrunner added a comment - While I agree it's not a bug, I could see this as an improvement. It would be more natural to have consective values in this case.
          Hide
          Christian d'Heureuse added a comment -

          I used this technique within an SQL script to create consecutive numbers for some tables. I had to use temporary intermediate tables to circumvent this effect.

          I agree that it's not a real "bug", but it's an undesirable effect.
          Kathey, could you please change the issue type from "bug" to "improvement" and the status back to "open"?

          Show
          Christian d'Heureuse added a comment - I used this technique within an SQL script to create consecutive numbers for some tables. I had to use temporary intermediate tables to circumvent this effect. I agree that it's not a real "bug", but it's an undesirable effect. Kathey, could you please change the issue type from "bug" to "improvement" and the status back to "open"?
          Hide
          Kristian Waagan added a comment -

          Ran test script on Derby 10.2beta. The behavior is still the same, you do not get consequtive ids for the given query (with "insert ... select distinct").
          Recategorized issue, as the current behavior is not a bug.

          Show
          Kristian Waagan added a comment - Ran test script on Derby 10.2beta. The behavior is still the same, you do not get consequtive ids for the given query (with "insert ... select distinct"). Recategorized issue, as the current behavior is not a bug.
          Hide
          Knut Anders Hatlen added a comment -

          Just to add another data point, the identity values are consecutive if you put another query around the select distinct:

          ij> insert into temp2 (s) select * from (select distinct s from temp1) t;
          4 rows inserted/updated/deleted
          ij> select * from temp2;
          I |S
          ----------------------
          1 |d
          2 |b
          3 |c
          4 |a

          4 rows selected

          For some reason, the ordering is different too.

          Show
          Knut Anders Hatlen added a comment - Just to add another data point, the identity values are consecutive if you put another query around the select distinct: ij> insert into temp2 (s) select * from (select distinct s from temp1) t; 4 rows inserted/updated/deleted ij> select * from temp2; I |S ---------------------- 1 |d 2 |b 3 |c 4 |a 4 rows selected For some reason, the ordering is different too.
          Hide
          Knut Anders Hatlen added a comment -

          The gaps are probably caused by the early evaluation of identity columns mentioned in Dag's comment on DERBY-4, dated 28/Oct/09. A fix for DERBY-4 will likely fix this issue too.

          Show
          Knut Anders Hatlen added a comment - The gaps are probably caused by the early evaluation of identity columns mentioned in Dag's comment on DERBY-4 , dated 28/Oct/09. A fix for DERBY-4 will likely fix this issue too.
          Hide
          Dag H. Wanvik added a comment -

          I agree with that analysis, Knut.

          Show
          Dag H. Wanvik added a comment - I agree with that analysis, Knut.
          Hide
          Knut Anders Hatlen added a comment -

          This issue has now been fixed by the changes in DERBY-4442, so I'm marking it as a duplicate of that issue and resolving it.

          Show
          Knut Anders Hatlen added a comment - This issue has now been fixed by the changes in DERBY-4442 , so I'm marking it as a duplicate of that issue and resolving it.

            People

            • Assignee:
              Unassigned
              Reporter:
              Christian d'Heureuse
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development