Derby
  1. Derby
  2. DERBY-25

INSERT INTO SELECT DISTINCT ... skips some values for autoincrement column

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Duplicate
    • Affects Version/s: 10.0.2.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels:
      None

      Description

      Reporting for Mamta Satoor.

      If we use insert into desttable, select
      distinct from source, into a desttable which has autoincrement
      column in it, we might see gaps in the autoincrement column if
      there are duplicated rows in the source table. The reason for
      this is Derby projects values into destination table columns before
      building a distinct resultset from the source table. The piece
      of code doing this is in org.apache.derby.impl.sql.execute.ProjectRestrictResultSet class's getNextRowCore() method where it calls doProjection.

        Activity

        Hide
        Satheesh Bandaram added a comment -

        Duplicate of Derby-3.

        Show
        Satheesh Bandaram added a comment - Duplicate of Derby-3.
        Hide
        Satheesh Bandaram added a comment -

        This is a Duplicate of Derby-3.

        Show
        Satheesh Bandaram added a comment - This is a Duplicate of Derby-3.
        Hide
        Shreyas Kaushik added a comment -

        I have been working on this trying to resolve this. Here are a few comments on this:

        1) The main reason for this happeing is the auto increment value getting projected before the duplicate rows are discovered. But I was not able to get a row from the source ResultSet without actually getting the projected value.

        2) From the activation I discovered that a hierarchy of ResultSets are involved here, where each ResultSet has a specific job and it least bothers where data is coming from or where it is going to send the data. So in this case the ProjectRestrictResultSet is where the projection happens. Any call to get the next row from the source is routed through this ResultSet where the actual projection happens.

        3) One solution that I thought about was doing the elimination of duplicate rows before we do the Projection, since I was unable to figure out how to bypass the projection ( as stated in the above comment) , I thought of using the SYS.SYSCOLUMNS table where Derby stores the latest auto increment value and increment ident and all other details about the table and the column itself.

        4) But one problem in using this table is that whenever updates need to happen to this table it gets locked and if we are in the middle of transaction and some errors take place the table might be in a in consistent state. The Derby docs also say that in the case of using an auto increment column, if a transaction that is using this is rolled back gaps might result as the SYS.SYSCOLUMNS table is not locked for rolling back the auto increment values.

        Show
        Shreyas Kaushik added a comment - I have been working on this trying to resolve this. Here are a few comments on this: 1) The main reason for this happeing is the auto increment value getting projected before the duplicate rows are discovered. But I was not able to get a row from the source ResultSet without actually getting the projected value. 2) From the activation I discovered that a hierarchy of ResultSets are involved here, where each ResultSet has a specific job and it least bothers where data is coming from or where it is going to send the data. So in this case the ProjectRestrictResultSet is where the projection happens. Any call to get the next row from the source is routed through this ResultSet where the actual projection happens. 3) One solution that I thought about was doing the elimination of duplicate rows before we do the Projection, since I was unable to figure out how to bypass the projection ( as stated in the above comment) , I thought of using the SYS.SYSCOLUMNS table where Derby stores the latest auto increment value and increment ident and all other details about the table and the column itself. 4) But one problem in using this table is that whenever updates need to happen to this table it gets locked and if we are in the middle of transaction and some errors take place the table might be in a in consistent state. The Derby docs also say that in the case of using an auto increment column, if a transaction that is using this is rolled back gaps might result as the SYS.SYSCOLUMNS table is not locked for rolling back the auto increment values.
        Hide
        Shreyas Kaushik added a comment -

        I was looking at this issue, trying to make a patch for this. I use the following logic to do this:

        1) Initially previous row is null

        2) SInce this is the first row add it to the temp ResultSet for later insertion.

        3) After this store the current Row as previous Row.

        4) For every pass compare previous row and current row, if same don't insert otherwise
        insert and do a projection.

        I will put the code that I wrote for this inline here. At this point of time I still have some failures while running the tests which I hope to resolve soon.

        thanks
        Shreyas

        -------------------------------Diff for ProjectRestrictResultSet-------------------------------
        Index: java/engine/org/apache/derby/impl/sql/execute/ProjectRestrictResultSet.java
        ===================================================================
        — java/engine/org/apache/derby/impl/sql/execute/ProjectRestrictResultSet.java (revision 106542)
        +++ java/engine/org/apache/derby/impl/sql/execute/ProjectRestrictResultSet.java (working copy)
        @@ -43,8 +43,10 @@
        import org.apache.derby.iapi.error.StandardException;

        import org.apache.derby.iapi.types.RowLocation;
        +import org.apache.derby.iapi.types.Orderable;

        import org.apache.derby.catalog.types.ReferencedColumnsDescriptorImpl;
        +import org.apache.derby.impl.jdbc.Util;

        /**
        @@ -77,6 +79,9 @@

        private ExecRow projRow;

        + private ExecRow previousRow = null;
        + private boolean firstRow = false;
        +
        //
        // class interface
        //
        @@ -245,6 +250,7 @@
        boolean restrict = false;
        DataValueDescriptor restrictBoolean;
        long beginRT = 0;
        + boolean retVal;

        /* Return null if open was short circuited by false constant expression */
        if (shortCircuitOpen)
        @@ -256,8 +262,17 @@
        do
        {
        candidateRow = source.getNextRowCore();

        • if (candidateRow != null)
          +
          + if (candidateRow != null)
          Unknown macro: {+ retVal = isEquals(candidateRow);++ previousRow = candidateRow.getClone();++ if(!retVal && !firstRow)+ continue;++ beginRT = getCurrentTimeMillis(); /* If restriction is null, then all rows qualify */ if (restriction == null)@@ -492,7 +507,8 @@ if (projection != null) { result = (ExecRow) projection.invoke(activation); - }+ }

          +
          else
          {
          result = mappedResultRow;
          @@ -501,10 +517,9 @@
          // Copy any mapped columns from the source
          for (int index = 0; index < projectMapping.length; index++)
          {

        • if (projectMapping[index] != -1)
        • { - result.setColumn(index + 1, sourceRow.getColumn(projectMapping[index])); - }

          + if(projectMapping[index] != -1)

          { + result.setColumn(index + 1, sourceRow.getColumn(projectMapping[index])); + }

          }

        /* We need to reSet the current row after doing the projection */
        @@ -528,6 +543,27 @@
        return source.isForUpdate();
        }

        + private boolean isEquals(ExecRow candidateRow) throws StandardException {
        + boolean notEqual = false;
        + DataValueDescriptor []sourceDesc, destDesc;
        +
        + // Do the checking if a row is being repeated
        + if(previousRow != null) {
        + firstRow = false;
        + sourceDesc = candidateRow.getRowArray();
        + destDesc = previousRow.getRowArray();
        +
        + for ( int i = 0; i < destDesc.length; i++) {
        + // Do the actual comparision here
        + if(!(sourceDesc[i].compare(Orderable.ORDER_OP_EQUALS,destDesc[i],true,true)))

        { + notEqual = true; + }

        + }
        + }
        + if(previousRow == null)
        + firstRow = true;
        + return notEqual;
        + }
        }
        ------------------------------------Diff for ProjectRestrictResultSet ends-------------------------

        Show
        Shreyas Kaushik added a comment - I was looking at this issue, trying to make a patch for this. I use the following logic to do this: 1) Initially previous row is null 2) SInce this is the first row add it to the temp ResultSet for later insertion. 3) After this store the current Row as previous Row. 4) For every pass compare previous row and current row, if same don't insert otherwise insert and do a projection. I will put the code that I wrote for this inline here. At this point of time I still have some failures while running the tests which I hope to resolve soon. thanks Shreyas ------------------------------- Diff for ProjectRestrictResultSet ------------------------------- Index: java/engine/org/apache/derby/impl/sql/execute/ProjectRestrictResultSet.java =================================================================== — java/engine/org/apache/derby/impl/sql/execute/ProjectRestrictResultSet.java (revision 106542) +++ java/engine/org/apache/derby/impl/sql/execute/ProjectRestrictResultSet.java (working copy) @@ -43,8 +43,10 @@ import org.apache.derby.iapi.error.StandardException; import org.apache.derby.iapi.types.RowLocation; +import org.apache.derby.iapi.types.Orderable; import org.apache.derby.catalog.types.ReferencedColumnsDescriptorImpl; +import org.apache.derby.impl.jdbc.Util; /** @@ -77,6 +79,9 @@ private ExecRow projRow; + private ExecRow previousRow = null; + private boolean firstRow = false; + // // class interface // @@ -245,6 +250,7 @@ boolean restrict = false; DataValueDescriptor restrictBoolean; long beginRT = 0; + boolean retVal; /* Return null if open was short circuited by false constant expression */ if (shortCircuitOpen) @@ -256,8 +262,17 @@ do { candidateRow = source.getNextRowCore(); if (candidateRow != null) + + if (candidateRow != null) Unknown macro: {+ retVal = isEquals(candidateRow);++ previousRow = candidateRow.getClone();++ if(!retVal && !firstRow)+ continue;++ beginRT = getCurrentTimeMillis(); /* If restriction is null, then all rows qualify */ if (restriction == null)@@ -492,7 +507,8 @@ if (projection != null) { result = (ExecRow) projection.invoke(activation); - }+ } + else { result = mappedResultRow; @@ -501,10 +517,9 @@ // Copy any mapped columns from the source for (int index = 0; index < projectMapping.length; index++) { if (projectMapping [index] != -1) { - result.setColumn(index + 1, sourceRow.getColumn(projectMapping[index])); - } + if(projectMapping [index] != -1) { + result.setColumn(index + 1, sourceRow.getColumn(projectMapping[index])); + } } /* We need to reSet the current row after doing the projection */ @@ -528,6 +543,27 @@ return source.isForUpdate(); } + private boolean isEquals(ExecRow candidateRow) throws StandardException { + boolean notEqual = false; + DataValueDescriptor []sourceDesc, destDesc; + + // Do the checking if a row is being repeated + if(previousRow != null) { + firstRow = false; + sourceDesc = candidateRow.getRowArray(); + destDesc = previousRow.getRowArray(); + + for ( int i = 0; i < destDesc.length; i++) { + // Do the actual comparision here + if(!(sourceDesc [i] .compare(Orderable.ORDER_OP_EQUALS,destDesc [i] ,true,true))) { + notEqual = true; + } + } + } + if(previousRow == null) + firstRow = true; + return notEqual; + } } ------------------------------------ Diff for ProjectRestrictResultSet ends -------------------------
        Hide
        Christian d'Heureuse added a comment -
        Show
        Christian d'Heureuse added a comment - This is the same bug as DERBY-3 ( http://issues.apache.org/jira/browse/DERBY-3 )

          People

          • Assignee:
            Unassigned
            Reporter:
            Tulika Agrawal
          • Votes:
            1 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development