Derby
  1. Derby
  2. DERBY-1773

insertRow() and updateRow() fail with syntax error when column has an alias

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 10.2.1.6
    • Fix Version/s: 10.13.0.0
    • Component/s: JDBC
    • Urgency:
      Normal
    • Issue & fix info:
      Repro attached

      Description

      When the select query used in an updatable result set has column aliases, a syntax error is thrown when executing ResultSet.insertRow() and ResultSet.updateRow(). The problem is seen on embedded and client. Repro is attached.

      Exception in thread "main" ERROR 42X14: 'A1' is not a column in table or VTI 'APP.T'.
      at org.apache.derby.iapi.error.StandardException.newException(StandardException.java:316)
      at org.apache.derby.impl.sql.compile.ResultColumn.bindResultColumnByName(ResultColumn.java:677)
      at org.apache.derby.impl.sql.compile.ResultColumnList.bindResultColumnsByName(ResultColumnList.java:682)
      at org.apache.derby.impl.sql.compile.ResultSetNode.bindResultColumns(ResultSetNode.java:683)
      at org.apache.derby.impl.sql.compile.SelectNode.bindResultColumns(SelectNode.java:742)
      at org.apache.derby.impl.sql.compile.UpdateNode.bind(UpdateNode.java:349)
      at org.apache.derby.impl.sql.GenericStatement.prepMinion(GenericStatement.java:345)
      at org.apache.derby.impl.sql.GenericStatement.prepare(GenericStatement.java:111)
      at org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext.prepareInternalStatement(GenericLanguageConnectionContext.java:723)
      at org.apache.derby.impl.jdbc.EmbedResultSet.updateRow(EmbedResultSet.java:3734)
      at Alias.main(Alias.java:15)

      1. anyVsAll.diff
        8 kB
        Bryan Pendleton
      2. ASF.LICENSE.NOT.GRANTED--Alias.java
        0.6 kB
        Knut Anders Hatlen
      3. fixedComments.diff
        8 kB
        Bryan Pendleton
      4. NoUpdatesToAliasedColumnsWithTest.diff
        7 kB
        Bryan Pendleton
      5. updatedToHeadMarch2010.diff
        7 kB
        Bryan Pendleton
      6. updatedToHeadMarch2016.diff
        7 kB
        Bryan Pendleton

        Issue Links

          Activity

          Hide
          Bryan Pendleton added a comment -

          I've completed the work I intended for this issue; resolving.

          Show
          Bryan Pendleton added a comment - I've completed the work I intended for this issue; resolving.
          Hide
          ASF subversion and git services added a comment -

          Commit 1734744 from Bryan Pendleton in branch 'code/trunk'
          [ https://svn.apache.org/r1734744 ]

          DERBY-1773: cursor updates fail with syntax error when column has an alias

          This change enhances the runtime column analysis code so that an updatable
          cursor can make a more nuanced decision about whether a column update is
          or is not allowed.

          Specifically, certain columns may not be updated, if they have been aliased.

          Prior to this change, a confusing syntax error message would be delivered
          when attempting to update an aliased column. Now, a more clear error message
          is delivered, pointing at the fact that the aliased column is not in the
          FOR UPDATE list of the cursor.

          So the net result is (at least, should be) that the same set of queries are
          accepted, but those that are not accepted have a slightly more clear message
          issued when they are detected.

          Show
          ASF subversion and git services added a comment - Commit 1734744 from Bryan Pendleton in branch 'code/trunk' [ https://svn.apache.org/r1734744 ] DERBY-1773 : cursor updates fail with syntax error when column has an alias This change enhances the runtime column analysis code so that an updatable cursor can make a more nuanced decision about whether a column update is or is not allowed. Specifically, certain columns may not be updated, if they have been aliased. Prior to this change, a confusing syntax error message would be delivered when attempting to update an aliased column. Now, a more clear error message is delivered, pointing at the fact that the aliased column is not in the FOR UPDATE list of the cursor. So the net result is (at least, should be) that the same set of queries are accepted, but those that are not accepted have a slightly more clear message issued when they are detected.
          Hide
          Bryan Pendleton added a comment -

          Attached 'fixedComments.diff' is essentially identical
          to the previous patch, but I updated some of the comments
          to be more accurate, and changed the string text in
          a SanityManager,Debug statement. So, no functional
          change, just cleanup.

          All my regression tests have been clean.

          Show
          Bryan Pendleton added a comment - Attached 'fixedComments.diff' is essentially identical to the previous patch, but I updated some of the comments to be more accurate, and changed the string text in a SanityManager,Debug statement. So, no functional change, just cleanup. All my regression tests have been clean.
          Hide
          Bryan Pendleton added a comment -

          Uhm, I guess I lost track of this issue, for it's been 6 years!

          The first thing I did was to bring the previous patch up to date,
          since a lot can change in 6 years. 'updatedToHeadMarch2016.diff'
          is the result, and it builds and passes tests with the head of trunk.

          Upon closely re-reading Knut Anders's comments from 6 years ago,
          I remembered that I had not addressed his observation that it
          would be possible for some, but not all, of the columns in the
          result list to be aliased, and only those columns which are aliased
          should be non-updatable; the unaliased columns should still be
          updatable.

          After thinking about that for a while, it seemed to me that this
          merely required changing the proposed ResultColumnList.columnsAreUpdatable()
          method so that it tested whether any columns in the list were
          updatable, as opposed to requiring that all columns in the
          list were updatable.

          So I did that, and added a few additional test cases, and the
          results appear to be satisfactory.

          The revised patch is attached as anyVsAll.diff.

          I'm running tests on it now, to see how it behaves.

          Sorry for the 6 year gap, but if anyone is still interested
          in this issue, feedback would be gratefully received.

          Show
          Bryan Pendleton added a comment - Uhm, I guess I lost track of this issue, for it's been 6 years! The first thing I did was to bring the previous patch up to date, since a lot can change in 6 years. 'updatedToHeadMarch2016.diff' is the result, and it builds and passes tests with the head of trunk. Upon closely re-reading Knut Anders's comments from 6 years ago, I remembered that I had not addressed his observation that it would be possible for some, but not all, of the columns in the result list to be aliased, and only those columns which are aliased should be non-updatable; the unaliased columns should still be updatable. After thinking about that for a while, it seemed to me that this merely required changing the proposed ResultColumnList.columnsAreUpdatable() method so that it tested whether any columns in the list were updatable, as opposed to requiring that all columns in the list were updatable. So I did that, and added a few additional test cases, and the results appear to be satisfactory. The revised patch is attached as anyVsAll.diff. I'm running tests on it now, to see how it behaves. Sorry for the 6 year gap, but if anyone is still interested in this issue, feedback would be gratefully received.
          Hide
          Bryan Pendleton added a comment -

          Just updated the patch to the latest trunk head to resolve a few merges, and re-attached it.

          Show
          Bryan Pendleton added a comment - Just updated the patch to the latest trunk head to resolve a few merges, and re-attached it.
          Hide
          Knut Anders Hatlen added a comment -

          My understanding is that "FOR UPDATE OF C1,C2" means that the cursor should be updatable, but you can only update the columns C1 and C2 using that cursor. If there's an alias on C1 or C2, the statement should fail. Aliases on other columns than C1 and C2 should be allowed, and the cursor should still be updatable if they are aliased. If I read the patch correctly, it will make the cursor read only as long as there is at least one column that's aliased. I don't think it's a big usability issue if we choose to disallow aliases altogether on updatable cursors, but that would require changes to the reference manual [1] and probably also warrant a release note, so I'd prefer not to impose that restriction.

          I agree that the two different ways of aliasing (C2 AS A2 vs T1 AS A(A1,A2)) are equivalent with regard to this issue and should be handled the same way.

          I also agree that FOR UPDATE with no column list, or an implicit FOR UPDATE clause (DERBY-231), means that all columns are updatable. Perhaps the alternative, more dynamic semantics are more convenient in some cases, but we already have syntax for making just a subset of the columns updatable, so I don't think it adds much value.

          [1] http://db.apache.org/derby/docs/dev/ref/rrefsqlj41360.html#rrefsqlj41360__sqlj15384

          Show
          Knut Anders Hatlen added a comment - My understanding is that "FOR UPDATE OF C1,C2" means that the cursor should be updatable, but you can only update the columns C1 and C2 using that cursor. If there's an alias on C1 or C2, the statement should fail. Aliases on other columns than C1 and C2 should be allowed, and the cursor should still be updatable if they are aliased. If I read the patch correctly, it will make the cursor read only as long as there is at least one column that's aliased. I don't think it's a big usability issue if we choose to disallow aliases altogether on updatable cursors, but that would require changes to the reference manual [1] and probably also warrant a release note, so I'd prefer not to impose that restriction. I agree that the two different ways of aliasing (C2 AS A2 vs T1 AS A(A1,A2)) are equivalent with regard to this issue and should be handled the same way. I also agree that FOR UPDATE with no column list, or an implicit FOR UPDATE clause ( DERBY-231 ), means that all columns are updatable. Perhaps the alternative, more dynamic semantics are more convenient in some cases, but we already have syntax for making just a subset of the columns updatable, so I don't think it adds much value. [1] http://db.apache.org/derby/docs/dev/ref/rrefsqlj41360.html#rrefsqlj41360__sqlj15384
          Hide
          Bryan Pendleton added a comment -

          Attached is a patch proposal. I'm still running tests, and would
          appreciate suggestions for more tests.

          This patch makes two basic changes:
          1) It modifies ResultColumnList.markUpdatableByCursor so that
          it only marks a ResultColumn as updatable if it is not aliased,
          that is, if the actual column name of its ColumnReference matches
          the ResultColumn name.
          2) It modifies SelectNode.isUpdatableCursor to call the new method
          FromTable.columnsAreUpdatable, which checks the columns to
          see if they have been aliased, in which case the select is NOT updatable.

          There are various aspects to this code that I'm not completely clear on.
          For one thing, I'm not clear on the differences between the following
          two techniques for introducing column aliasing:

          SELECT c1 as a1, c2 as a2 from t1 as abcde

          versus

          select * from t1 as a(a1,a2)

          They seem to take substantially different paths through the code and
          the resulting data structures are different, so the two basic changes that
          I made seemed necessary, as the first change handles the first statement,
          and the second change handles the second statement.

          Secondly, I'm not clear on the overall semantics of "the ResultSet is updatable"
          versus "this particular column is updatable". It seems like the FOR UPDATE OF
          clause allows for the specification of a particular subset of columns which
          are updatable, but one can also simply say "FOR UPDATE", or can in fact
          omit the clause entirely (this is DERBY-231), in which case the implication is,
          I guess, that all columns must be updatable in order for the ResultSet to be updatable?

          Or is it more dynamic? Is the intent that, if I say "FOR UPDATE OF C1,C2", then
          only C1 and C2 are updatable, but if I say just "FOR UPDATE" or omit it entirely,
          then so long as at least one column is updatable, the ResultSet should be updatable,
          and the computation of updatability must be deferred until the actual attempt to
          perform an update (by calling rs.updateXXX) is performed?

          As I said, it would be great to have suggestions for more test cases to add!

          Show
          Bryan Pendleton added a comment - Attached is a patch proposal. I'm still running tests, and would appreciate suggestions for more tests. This patch makes two basic changes: 1) It modifies ResultColumnList.markUpdatableByCursor so that it only marks a ResultColumn as updatable if it is not aliased, that is, if the actual column name of its ColumnReference matches the ResultColumn name. 2) It modifies SelectNode.isUpdatableCursor to call the new method FromTable.columnsAreUpdatable, which checks the columns to see if they have been aliased, in which case the select is NOT updatable. There are various aspects to this code that I'm not completely clear on. For one thing, I'm not clear on the differences between the following two techniques for introducing column aliasing: SELECT c1 as a1, c2 as a2 from t1 as abcde versus select * from t1 as a(a1,a2) They seem to take substantially different paths through the code and the resulting data structures are different, so the two basic changes that I made seemed necessary, as the first change handles the first statement, and the second change handles the second statement. Secondly, I'm not clear on the overall semantics of "the ResultSet is updatable" versus "this particular column is updatable". It seems like the FOR UPDATE OF clause allows for the specification of a particular subset of columns which are updatable, but one can also simply say "FOR UPDATE", or can in fact omit the clause entirely (this is DERBY-231 ), in which case the implication is, I guess, that all columns must be updatable in order for the ResultSet to be updatable? Or is it more dynamic? Is the intent that, if I say "FOR UPDATE OF C1,C2", then only C1 and C2 are updatable, but if I say just "FOR UPDATE" or omit it entirely, then so long as at least one column is updatable, the ResultSet should be updatable, and the computation of updatability must be deferred until the actual attempt to perform an update (by calling rs.updateXXX) is performed? As I said, it would be great to have suggestions for more test cases to add!
          Hide
          Bryan Pendleton added a comment -

          Thanks Knut, I think you are exactly right. I'll pursue the "refuse it with an error"
          approach, and I'll try to include test cases both with and without FOR UPDATE.

          Thanks for the pointer to DERBY-231, the behavior is making more sense now.

          Show
          Bryan Pendleton added a comment - Thanks Knut, I think you are exactly right. I'll pursue the "refuse it with an error" approach, and I'll try to include test cases both with and without FOR UPDATE. Thanks for the pointer to DERBY-231 , the behavior is making more sense now.
          Hide
          Knut Anders Hatlen added a comment -

          It made no difference to add FOR UPDATE to the repro. When I added FOR UPDATE OF X, I got NPE/assert failure.

          Show
          Knut Anders Hatlen added a comment - It made no difference to add FOR UPDATE to the repro. When I added FOR UPDATE OF X, I got NPE/assert failure.
          Hide
          Knut Anders Hatlen added a comment -

          Could the difference be explicit vs implicit FOR UPDATE clause? Ref manual says that correlation name cannot be used on updatable columns. http://db.apache.org/derby/docs/dev/ref/rrefcorrelationname.html

          Updatable result sets used to require FOR UPDATE until DERBY-231. The test case in UpdatableResultSetTest appended "FOR UPDATE of c1" to the select statement, whereas the repro doesn't use FOR UPDATE.

          Show
          Knut Anders Hatlen added a comment - Could the difference be explicit vs implicit FOR UPDATE clause? Ref manual says that correlation name cannot be used on updatable columns. http://db.apache.org/derby/docs/dev/ref/rrefcorrelationname.html Updatable result sets used to require FOR UPDATE until DERBY-231 . The test case in UpdatableResultSetTest appended "FOR UPDATE of c1" to the select statement, whereas the repro doesn't use FOR UPDATE.
          Hide
          Bryan Pendleton added a comment -

          Interestingly, UpdatableResultSetTest.testDeleteRowWithCorrelationForColumnName()
          has some very similar, but not identical, scenarios.

          That test case appears to state that updating a column which has an alias should
          not be allowed.

          I'll study this test case in more detail, and try to figure out why we refuse the statement
          in that case, but not in the scenario from the repro script.

          Show
          Bryan Pendleton added a comment - Interestingly, UpdatableResultSetTest.testDeleteRowWithCorrelationForColumnName() has some very similar, but not identical, scenarios. That test case appears to state that updating a column which has an alias should not be allowed. I'll study this test case in more detail, and try to figure out why we refuse the statement in that case, but not in the scenario from the repro script.
          Hide
          Bryan Pendleton added a comment -

          I've been investigating this issue a bit, and want to note
          what I've found, as well as several ideas.

          Firstly, the core of the issue occurs in EmbedResultSet.updateRow, which is
          attempting to construct and execute, on the fly, a statement like:

          update t set x = 2 where current of my-cursor-name

          The problem arises due to the fact that the select statement:

          select * from t as a(a1)

          has introduced aliases for both the table (T => A) and the column (X => A1)

          Now, when EmbedResultSet.updateRow processes the table, it is careful
          to use the true underlying base table name (T):

          updateWhereCurrentOfSQL.append(getFullBaseTableName(targetTable));//got the underlying (schema.)table name

          The problem arises a bit later, when it tries to access the columns which
          are being updated:

          for (int i=1; i<=resultDescription.getColumnCount(); i++) { //in this for loop we are constructing columnname=?,... part of the update sql
          if (columnGotUpdated[i-1])

          { //if the column got updated, do following if (foundOneColumnAlready) updateWhereCurrentOfSQL.append(","); //using quotes around the column name to preserve case sensitivity updateWhereCurrentOfSQL.append(IdUtil.normalToDelimited( resultDescription.getColumnDescriptor(i).getName()) + "=?"); foundOneColumnAlready = true; }

          }

          The issue here is that resultDescription.getColumnDescriptor is a
          GenericColumnDescriptor, which was constructed from a ResultColumnDescriptor
          by the method ResultColumnList.makeResultDescriptors(), and this code doesn't
          have enough "smarts" to understand the difference between the column's "exposedName" (A1)
          and its true base column name (X).

          So, I can see two possibilities:
          1) Enhance GenericColumnDescriptor, ResultColumnDescriptor, and (probably) ResultColumn,
          so that we have a new method GenericColumnDescriptor.getBaseColumnName(), and call
          that new method from EmbedResultSet.updateRow.
          2) Refuse the statement. Change (probably) SelectNode.isUpdatableCursor() so that it
          looks through its ResultColumnList to see if any of the result columns have been aliased,
          and state that a SELECT with aliased column names is NOT a legal updatable SELECT statement.

          I'll investigate both possibilities further, just wanted to note my results so far.

          Show
          Bryan Pendleton added a comment - I've been investigating this issue a bit, and want to note what I've found, as well as several ideas. Firstly, the core of the issue occurs in EmbedResultSet.updateRow, which is attempting to construct and execute, on the fly, a statement like: update t set x = 2 where current of my-cursor-name The problem arises due to the fact that the select statement: select * from t as a(a1) has introduced aliases for both the table (T => A) and the column (X => A1) Now, when EmbedResultSet.updateRow processes the table, it is careful to use the true underlying base table name (T): updateWhereCurrentOfSQL.append(getFullBaseTableName(targetTable));//got the underlying (schema.)table name The problem arises a bit later, when it tries to access the columns which are being updated: for (int i=1; i<=resultDescription.getColumnCount(); i++) { //in this for loop we are constructing columnname=?,... part of the update sql if (columnGotUpdated [i-1] ) { //if the column got updated, do following if (foundOneColumnAlready) updateWhereCurrentOfSQL.append(","); //using quotes around the column name to preserve case sensitivity updateWhereCurrentOfSQL.append(IdUtil.normalToDelimited( resultDescription.getColumnDescriptor(i).getName()) + "=?"); foundOneColumnAlready = true; } } The issue here is that resultDescription.getColumnDescriptor is a GenericColumnDescriptor, which was constructed from a ResultColumnDescriptor by the method ResultColumnList.makeResultDescriptors(), and this code doesn't have enough "smarts" to understand the difference between the column's "exposedName" (A1) and its true base column name (X). So, I can see two possibilities: 1) Enhance GenericColumnDescriptor, ResultColumnDescriptor, and (probably) ResultColumn, so that we have a new method GenericColumnDescriptor.getBaseColumnName(), and call that new method from EmbedResultSet.updateRow. 2) Refuse the statement. Change (probably) SelectNode.isUpdatableCursor() so that it looks through its ResultColumnList to see if any of the result columns have been aliased, and state that a SELECT with aliased column names is NOT a legal updatable SELECT statement. I'll investigate both possibilities further, just wanted to note my results so far.
          Hide
          Knut Anders Hatlen added a comment -

          Triaged for 10.5.2.

          Show
          Knut Anders Hatlen added a comment - Triaged for 10.5.2.

            People

            • Assignee:
              Bryan Pendleton
              Reporter:
              Knut Anders Hatlen
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development