Derby
  1. Derby
  2. DERBY-6003

Create row templates outside of the generated code

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 10.10.1.1
    • Fix Version/s: 10.10.1.1
    • Component/s: SQL
    • Labels:
      None

      Description

      The constructors for many of the result set classes take GeneratedMethod parameters that create row templates (an ExecRow of a certain size and column types, each column initialized to an SQL null value).

      As an alternative, the compiler could produce an ExecRow instance and put it into the savedObjects field of GenericPreparedStatement, and the constructors could take parameter that points to the object in savedObjects. Where the result sets currently invoke the generated method to produce a fresh template, they could instead clone the saved object.

      Advantages with the suggested approach would be:

      • Reduce the size of the code generator, which should reduce total code complexity.
      • Reduce the amount of generated code, which makes it easier for tools (profilers, static code analyzers, IDEs) to map executable code to source code.
      • Reduce the actual number of generated methods, which makes it less likely that queries need to use reflection to invoke the remaining generated methods (there's a switchover from DirectCall to ReflectCall when the number of generated methods exceeds 10).
      1. d6003-6a-index-to-base-row.diff
        18 kB
        Knut Anders Hatlen
      2. d6003-5a-sort-vti-aggregate-window.diff
        36 kB
        Knut Anders Hatlen
      3. d6003-3c-downgrade-with-stored-proc.diff
        11 kB
        Knut Anders Hatlen
      4. d6003-3b-downgrade-workaround-in-tests.diff
        11 kB
        Knut Anders Hatlen
      5. d6003-4a-scanresultset.diff
        76 kB
        Knut Anders Hatlen
      6. d6003-3a-safe-downgrade.diff
        9 kB
        Knut Anders Hatlen
      7. d6003-2a-unused-field.diff
        6 kB
        Knut Anders Hatlen
      8. d6003-1a-cleanup.diff
        10 kB
        Knut Anders Hatlen

        Issue Links

          Activity

          Knut Anders Hatlen created issue -
          Hide
          Knut Anders Hatlen added a comment -

          Many of the GeneratedMethod fields in the result set classes are not actually used. To make it easier to look into this issue, I've attached a patch (d6003-1a-cleanup.diff) that removes those unused fields. It also removes some other fields that I found were unused, as well as unused imports.

          All tests ran cleanly, so I committed the patch with revision 1412763.

          Show
          Knut Anders Hatlen added a comment - Many of the GeneratedMethod fields in the result set classes are not actually used. To make it easier to look into this issue, I've attached a patch (d6003-1a-cleanup.diff) that removes those unused fields. It also removes some other fields that I found were unused, as well as unused imports. All tests ran cleanly, so I committed the patch with revision 1412763.
          Knut Anders Hatlen made changes -
          Field Original Value New Value
          Attachment d6003-1a-cleanup.diff [ 12554717 ]
          Hide
          Knut Anders Hatlen added a comment -

          I first said the generated method created and initialized the row templates. That's not quite right. The row template is created in the generated constructor, and the generated method (re-)initializes it and returns a reference to it.

          I have a mostly working patch in my sandbox. However, the original plan of storing the ExecRow in GenericPreparedStatement.savedObjects didn't work as smoothly as I'd hoped. Anything stored in savedObjects has to be serializable, which ExecRow isn't, and making it serializable is not straightforward. It must be serializable so that it can be used in stored prepared statements. We might make ExecRow serializable, but the template ExecRow contains SQL null values, and the writeExternal() and readExternal() methods of the SQL type classes do not work with null, as saving/restoring null values is usually handled at a higher level in the code. The approach used at the higher level actually depends on having a template row to handle nulls, so we run into a chicken-and-egg problem if we try to use the same approach to store the template row.

          I don't think it's an option to change the writeExternal() and readExternal() methods of all the SQL type classes so they handle null. Although it is possible to change the stored format of the data types, that will require too much upgrade logic (and testing) to be worthwhile.

          The alternative that I'm exploring now, is to store an array of DataTypeDescriptors from which the result sets can create the ExecRow. DataTypeDescriptor is already serializable, and it has a getNull() method which can be used to create empty SQL values of the right type that we can put in the ExecRow template. There are some problems with that approach too:

          • RowLocation values don't have a corresponding DataTypeDescriptor, so they'll need some special handling. The code that generates the byte code already has a special case for row locations, so the new code shouldn't be much worse than the current code.
          • DataTypeDescriptor.readExternal() is not able to read the type descriptor for a UDT written by DataTypeDescriptor.writeExternal(). It doesn't fail, but some of the state of the original type descriptor isn't restored, which leads to subsequent NullPointerExceptions. But it looks like writeExternal() does write all necessary information to restore the fields for UDTs too, so I think the bug should be fixable, even without changing the stored format of DataTypeDescriptor.
          Show
          Knut Anders Hatlen added a comment - I first said the generated method created and initialized the row templates. That's not quite right. The row template is created in the generated constructor, and the generated method (re-)initializes it and returns a reference to it. I have a mostly working patch in my sandbox. However, the original plan of storing the ExecRow in GenericPreparedStatement.savedObjects didn't work as smoothly as I'd hoped. Anything stored in savedObjects has to be serializable, which ExecRow isn't, and making it serializable is not straightforward. It must be serializable so that it can be used in stored prepared statements. We might make ExecRow serializable, but the template ExecRow contains SQL null values, and the writeExternal() and readExternal() methods of the SQL type classes do not work with null, as saving/restoring null values is usually handled at a higher level in the code. The approach used at the higher level actually depends on having a template row to handle nulls, so we run into a chicken-and-egg problem if we try to use the same approach to store the template row. I don't think it's an option to change the writeExternal() and readExternal() methods of all the SQL type classes so they handle null. Although it is possible to change the stored format of the data types, that will require too much upgrade logic (and testing) to be worthwhile. The alternative that I'm exploring now, is to store an array of DataTypeDescriptors from which the result sets can create the ExecRow. DataTypeDescriptor is already serializable, and it has a getNull() method which can be used to create empty SQL values of the right type that we can put in the ExecRow template. There are some problems with that approach too: RowLocation values don't have a corresponding DataTypeDescriptor, so they'll need some special handling. The code that generates the byte code already has a special case for row locations, so the new code shouldn't be much worse than the current code. DataTypeDescriptor.readExternal() is not able to read the type descriptor for a UDT written by DataTypeDescriptor.writeExternal(). It doesn't fail, but some of the state of the original type descriptor isn't restored, which leads to subsequent NullPointerExceptions. But it looks like writeExternal() does write all necessary information to restore the fields for UDTs too, so I think the bug should be fixable, even without changing the stored format of DataTypeDescriptor.
          Knut Anders Hatlen made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          Hide
          Knut Anders Hatlen added a comment -

          To fix the above mentioned deserialization problem in DataTypeDescriptor, I think we need to make DataTypeDescriptor.readExternal() call TypeId.getUserDefinedTypeId() instead of TypeId.getBuiltInTypeId() if it's a user-defined type.

          The getUserDefinedTypeId() takes a boolean parameter called delimitedIdentifier. I don't know what's the right value to pass in. But looking closer at it, it looks like the value is never used, and all callers pass in false. So to make the final patch simpler, I'm attaching another cleanup patch (d6003-2a-unused-field.diff) that removes this dead code. That should be an improvement regardless of how we end up fixing the rest of the issue.

          All tests ran cleanly with the 2a patch.

          Show
          Knut Anders Hatlen added a comment - To fix the above mentioned deserialization problem in DataTypeDescriptor, I think we need to make DataTypeDescriptor.readExternal() call TypeId.getUserDefinedTypeId() instead of TypeId.getBuiltInTypeId() if it's a user-defined type. The getUserDefinedTypeId() takes a boolean parameter called delimitedIdentifier. I don't know what's the right value to pass in. But looking closer at it, it looks like the value is never used, and all callers pass in false. So to make the final patch simpler, I'm attaching another cleanup patch (d6003-2a-unused-field.diff) that removes this dead code. That should be an improvement regardless of how we end up fixing the rest of the issue. All tests ran cleanly with the 2a patch.
          Knut Anders Hatlen made changes -
          Attachment d6003-2a-unused-field.diff [ 12555011 ]
          Knut Anders Hatlen made changes -
          Issue & fix info Patch Available [ 10102 ]
          Hide
          Knut Anders Hatlen added a comment -

          Committed 2a to trunk with revision 1414718.

          Show
          Knut Anders Hatlen added a comment - Committed 2a to trunk with revision 1414718.
          Knut Anders Hatlen made changes -
          Issue & fix info Patch Available [ 10102 ]
          Hide
          Knut Anders Hatlen added a comment -

          The next speed bump is deserialization issues on downgrade after soft upgrade.

          The problem: Some old versions (those that don't have the fix for DERBY-3870 or DERBY-5289) read all of SYS.SYSSTATEMENTS when detecting a version change, including the column that holds the compiled plan. So if the format of the compiled plan changes between versions in a way so that the old release cannot deserialize plans created by the newer version, booting the database with the old version after soft upgrade may fail with deserialization errors.

          DERBY-3870 and DERBY-5289 fixed this problem on upgrade, but left the downgrade problem unresolved. For downgrade, they simply disabled the problematic test cases in the upgrade tests.

          Since I intend to change the format of the stored plans as part of this issue, databases that use triggers will run into this problem again if they are soft-upgraded from one of the broken versions, and then later downgraded back to that version. So some handling of the downgrade problem is needed. Otherwise, the upgrade tests that use triggers will fail when the old version is one of the broken versions, and also users that are still on one of the broken versions, may run into the problem if they do soft upgrade followed by downgrade.

          One way to handle this might be to disable trigger test cases in upgrade tests when we are using one of the broken versions. That won't fix the problem seen by users, though. Also, it might require extra disabling logic in future test cases that use triggers, so it doesn't even fix the testing once and for all.

          An alternative I'm exploring now, and that looks promising, is to stop writing the compiled plan of an SPS to disk if we are running in soft upgrade mode and the dictionary version is so old that it is possible to downgrade to a version that might fail to boot because of the new plan. We already do this for meta-data queries (actually even stricter: we never store plans for meta-data queries in soft upgrade).

          The only downside I can think of with this approach, is that trigger statements need to be recompiled after each reboot when running in soft upgrade from one of the affected branches. This should happen automatically, so users shouldn't notice much except, perhaps, that warmup takes a little longer after a reboot if they have many triggers.

          I think that sounds like a reasonable cost to avoid the deserialization errors. If users experience performance degradation, they can always do a full upgrade and the trigger plans will be stored to disk again.

          By the way, this approach was briefly discussed in DERBY-5105, but it was skipped because DERBY-5105 was so limited that a workaround in the upgrade tests was considered enough. If we implement it, the workaround for DERBY-5105, as well as the workarounds for DERBY-4835, DERBY-5263 and DERBY-5289, can probably be removed from the upgrade tests.

          Show
          Knut Anders Hatlen added a comment - The next speed bump is deserialization issues on downgrade after soft upgrade. The problem: Some old versions (those that don't have the fix for DERBY-3870 or DERBY-5289 ) read all of SYS.SYSSTATEMENTS when detecting a version change, including the column that holds the compiled plan. So if the format of the compiled plan changes between versions in a way so that the old release cannot deserialize plans created by the newer version, booting the database with the old version after soft upgrade may fail with deserialization errors. DERBY-3870 and DERBY-5289 fixed this problem on upgrade, but left the downgrade problem unresolved. For downgrade, they simply disabled the problematic test cases in the upgrade tests. Since I intend to change the format of the stored plans as part of this issue, databases that use triggers will run into this problem again if they are soft-upgraded from one of the broken versions, and then later downgraded back to that version. So some handling of the downgrade problem is needed. Otherwise, the upgrade tests that use triggers will fail when the old version is one of the broken versions, and also users that are still on one of the broken versions, may run into the problem if they do soft upgrade followed by downgrade. One way to handle this might be to disable trigger test cases in upgrade tests when we are using one of the broken versions. That won't fix the problem seen by users, though. Also, it might require extra disabling logic in future test cases that use triggers, so it doesn't even fix the testing once and for all. An alternative I'm exploring now, and that looks promising, is to stop writing the compiled plan of an SPS to disk if we are running in soft upgrade mode and the dictionary version is so old that it is possible to downgrade to a version that might fail to boot because of the new plan. We already do this for meta-data queries (actually even stricter: we never store plans for meta-data queries in soft upgrade). The only downside I can think of with this approach, is that trigger statements need to be recompiled after each reboot when running in soft upgrade from one of the affected branches. This should happen automatically, so users shouldn't notice much except, perhaps, that warmup takes a little longer after a reboot if they have many triggers. I think that sounds like a reasonable cost to avoid the deserialization errors. If users experience performance degradation, they can always do a full upgrade and the trigger plans will be stored to disk again. By the way, this approach was briefly discussed in DERBY-5105 , but it was skipped because DERBY-5105 was so limited that a workaround in the upgrade tests was considered enough. If we implement it, the workaround for DERBY-5105 , as well as the workarounds for DERBY-4835 , DERBY-5263 and DERBY-5289 , can probably be removed from the upgrade tests.
          Hide
          Knut Anders Hatlen added a comment -

          Attaching d6003-3a-safe-downgrade.diff which makes the suggested changes to allow safe downgrade after the format of stored plans has changed.

          Details:

          SPSDescriptor: Added a method that checks if it is safe to store the plan. It is considered safe if the dictionary version is 10.9 or higher, since all released versions on the 10.9 branch have the fixes for DERBY-3870 and DERBY-4835.

          SYSSTATEMENTSRowFactory: When creating a row to store in SYSSTATEMENTS, use the new method in SPSDescriptor to check if it's safe to store the plan. If it is not, return a row that has the VALID column set to false and the CONSTANTSTATE column (which is where the plan is stored) to null. Changing those two columns is what the clearSPSPlans() method does too when the version has changed.

          DataDictionaryImpl: When a new database is created, set the dictionary version before creating the system tables. Otherwise, the version check in SPSDescriptor will fail with a NullPointerException when the SYSSTATEMENTS table is created.

          BasicSetup: Removed the workarounds for DERBY-4835, DERBY-5105, DERBY-5263 and DERBY-5289, as this fix makes downgrade successful for some test cases that used to fail.

          Show
          Knut Anders Hatlen added a comment - Attaching d6003-3a-safe-downgrade.diff which makes the suggested changes to allow safe downgrade after the format of stored plans has changed. Details: SPSDescriptor: Added a method that checks if it is safe to store the plan. It is considered safe if the dictionary version is 10.9 or higher, since all released versions on the 10.9 branch have the fixes for DERBY-3870 and DERBY-4835 . SYSSTATEMENTSRowFactory: When creating a row to store in SYSSTATEMENTS, use the new method in SPSDescriptor to check if it's safe to store the plan. If it is not, return a row that has the VALID column set to false and the CONSTANTSTATE column (which is where the plan is stored) to null. Changing those two columns is what the clearSPSPlans() method does too when the version has changed. DataDictionaryImpl: When a new database is created, set the dictionary version before creating the system tables. Otherwise, the version check in SPSDescriptor will fail with a NullPointerException when the SYSSTATEMENTS table is created. BasicSetup: Removed the workarounds for DERBY-4835 , DERBY-5105 , DERBY-5263 and DERBY-5289 , as this fix makes downgrade successful for some test cases that used to fail.
          Knut Anders Hatlen made changes -
          Attachment d6003-3a-safe-downgrade.diff [ 12555505 ]
          Hide
          Knut Anders Hatlen added a comment -

          All the regression tests ran cleanly with the 3a patch.

          Show
          Knut Anders Hatlen added a comment - All the regression tests ran cleanly with the 3a patch.
          Knut Anders Hatlen made changes -
          Issue & fix info Patch Available [ 10102 ]
          Hide
          Mike Matrigali added a comment -

          Changes look good to me.

          With respect to downgrade issue, I wonder if it is necesary. I don't understand the performance impact, but a little concerned that we are penalizing a lot of users that will not see the problem.
          At least in the cases where an apache release exists that fixes the bugs with usual handling of upgrade and downgrade, should we still disable the trigger performance because user may
          downgrade to a release known release with more bugs, ie. say db version is 10.8 and latest 10.8 release has all fixes.

          Also to understand the range of your changes, do you plan on backporting the downgrade changes, or will these just be in 10.10 software and above?

          I understand it helps with the tests. In general I think we expect to people to at least be able to software upgrade to the latest bug fix release on a given branch. As we decided previously
          it was reasonable if version X.1 could not boot because of bug that user could could use X.2. Now if there are no available apache releases for a given version your fix seems like an excellent
          help to those users stuck at that release.

          Show
          Mike Matrigali added a comment - Changes look good to me. With respect to downgrade issue, I wonder if it is necesary. I don't understand the performance impact, but a little concerned that we are penalizing a lot of users that will not see the problem. At least in the cases where an apache release exists that fixes the bugs with usual handling of upgrade and downgrade, should we still disable the trigger performance because user may downgrade to a release known release with more bugs, ie. say db version is 10.8 and latest 10.8 release has all fixes. Also to understand the range of your changes, do you plan on backporting the downgrade changes, or will these just be in 10.10 software and above? I understand it helps with the tests. In general I think we expect to people to at least be able to software upgrade to the latest bug fix release on a given branch. As we decided previously it was reasonable if version X.1 could not boot because of bug that user could could use X.2. Now if there are no available apache releases for a given version your fix seems like an excellent help to those users stuck at that release.
          Hide
          Knut Anders Hatlen added a comment -

          Thanks for looking at the patch, Mike.

          I was not planning to backport this change, only commit it to trunk. However, if there are concerns that this fix may be to broad, I can always add workarounds to the upgrade tests instead.

          There are no official releases with the necessary fixes for 10.5, 10.6 or 10.7. Not for the earlier branches either, but it doesn't appear to be a problem on those branches, probably because DERBY-1107 was not introduced until 10.5. On 10.8 it was fixed in a maintenance release, and 10.9 included the fixes in the first official release.

          Show
          Knut Anders Hatlen added a comment - Thanks for looking at the patch, Mike. I was not planning to backport this change, only commit it to trunk. However, if there are concerns that this fix may be to broad, I can always add workarounds to the upgrade tests instead. There are no official releases with the necessary fixes for 10.5, 10.6 or 10.7. Not for the earlier branches either, but it doesn't appear to be a problem on those branches, probably because DERBY-1107 was not introduced until 10.5. On 10.8 it was fixed in a maintenance release, and 10.9 included the fixes in the first official release.
          Hide
          Knut Anders Hatlen added a comment -

          Attaching d6003-4a-scanresultset.diff which removes the generated
          methods that create row templates for the ScanResultSet tree. It
          currently depends on the 3a patch in order to make the upgrade tests
          pass.

          There still are other result set classes that use generated methods to
          produce row templates, so the code that generates the code is not
          removed in this patch. I'll post other patches later to fix the rest
          of the result set tree and remove the generated code.

          This patch adds a new class, called ExecRowBuilder, which can be used
          to produce and reset candidate rows used by the scan result sets. Its
          build() method creates an instance of the correct sub-type of ExecRow
          with SQL null values in the columns used by the scan. This is what was
          previously done in the generated constructor for the activation. Its
          reset() method replaces the existing values in the row with fresh
          instances that represent SQL null of the correct type, which is the
          same as the generated methods previously did.

          Instead of passing a reference to the generated method into the
          ScanResultSet constructor, the compiler now stores an ExecRowBuilder
          in GenericPreparedStatement.savedObjects and passes the array index as
          an argument to the constructor.

          Details:

          • java/engine/org/apache/derby/iapi/sql/execute/ExecRowBuilder.java

          New class, as described above.

          • java/engine/org/apache/derby/iapi/types/DataTypeDescriptor.java

          As mentioned in an earlier comment, DTD's deserialization wouldn't
          fully restore the state of the original instance if it represented a
          user-defined type. Fixed by using TypeId.getUserDefinedTypeId()
          instead of TypeId.getBuiltInTypeId() if the type is user-defined.

          This was needed in order to get ExecRowBuilder to deserialize
          successfully when the query accessed UDTs.

          • java/engine/org/apache/derby/impl/sql/compile/ResultColumnList.java

          Added a method buildRowTemplate() that creates an ExecRowBuilder of
          the right shape. The logic is essentially the same as in the existing
          ResultColumnList.generateHolder() method, except that generateHolder()
          produces a byte-code representation of how to produce the row, and
          buildRowTemplate() produces a Java data structure holding the same
          information.

          Factored out shared logic from generateHolder() and buildRowTemplate()
          into newRowLocationTemplate().

          • java/engine/org/apache/derby/impl/sql/compile/FromBaseTable.java

          Use RCL.buildRowTemplate() instead of RCL.generateHolder() when
          creating constructor arguments for a sub-class of ScanResultSet.

          • java/engine/org/apache/derby/iapi/sql/compile/JoinStrategy.java
          • java/engine/org/apache/derby/impl/sql/compile/BaseJoinStrategy.java
          • java/engine/org/apache/derby/impl/sql/compile/HashJoinStrategy.java
          • java/engine/org/apache/derby/impl/sql/compile/NestedLoopJoinStrategy.java

          Changed signatures to allow passing array index into GPS.savedObjects
          instead of generated method as scan parameter.

          • java/engine/org/apache/derby/impl/sql/execute/ScanResultSet.java

          Create the candidate row using the passed-in ExecRowBuilder, and also
          use the ExecRowBuilder to reset the candidate row.

          • java/engine/org/apache/derby/impl/sql/execute/TableScanResultSet.java

          Changed signature of constructor to index into GPS.savedObjects
          instead of generated method. Use the ExecRowBuilder to reset the
          candidate row.

          • java/engine/org/apache/derby/iapi/sql/execute/ResultSetFactory.java
          • java/engine/org/apache/derby/impl/sql/execute/GenericResultSetFactory.java
          • java/engine/org/apache/derby/impl/sql/execute/BulkTableScanResultSet.java
          • java/engine/org/apache/derby/impl/sql/execute/DependentResultSet.java
          • java/engine/org/apache/derby/impl/sql/execute/DistinctScanResultSet.java
          • java/engine/org/apache/derby/impl/sql/execute/LastIndexKeyResultSet.java
          • java/engine/org/apache/derby/impl/sql/execute/MultiProbeTableScanResultSet.java

          More signature changes to allow passing the GPS.savedObjects index
          instead of the generated method.

          • java/engine/org/apache/derby/impl/sql/compile/InsertNode.java
          • java/engine/org/apache/derby/impl/sql/execute/InsertResultSet.java

          Bulk insert creates a BulkTableScanResultSet at run-time and needs a
          way to pass a row template to the scan. Previously, it did this by
          wrapping a row instance in a class that implemented the
          GeneratedMethod interface, although it didn't actually represent a
          generated method. Now, since BulkTableScanResultSet expects an index
          into GPS.savedObjects, we instead save an ExecRowBuilder instance when
          compiling the bulk insert, and let InsertResultSet pass the array
          index to getBulkTableScanResultSet().

          InsertNode also needed a fix in requestBulkInsert(). It does some bind
          logic in the optimize phase (legitimately, because of DERBY-4789), but
          forgets to update the bulkInsert field which is usually set during
          bind. We now set bulkInsert to true also when this method is called,
          so that we can use the field in the code generation phase.

          • java/engine/org/apache/derby/impl/sql/compile/DeleteNode.java
          • java/engine/org/apache/derby/impl/sql/compile/UpdateNode.java
          • java/engine/org/apache/derby/impl/sql/execute/DeleteConstantAction.java
          • java/engine/org/apache/derby/impl/sql/execute/GenericConstantActionFactory.java
          • java/engine/org/apache/derby/impl/sql/execute/InsertConstantAction.java
          • java/engine/org/apache/derby/impl/sql/execute/UpdatableVTIConstantAction.java
          • java/engine/org/apache/derby/impl/sql/execute/WriteCursorConstantAction.java

          Since InsertResultSet now gets its row template added to the plan at
          compile-time, remove the logic to add it to and get it from the
          constant action. This logic was also present for update and delete,
          although it was only used in bulk insert. Remove for update and delete
          too.

          Show
          Knut Anders Hatlen added a comment - Attaching d6003-4a-scanresultset.diff which removes the generated methods that create row templates for the ScanResultSet tree. It currently depends on the 3a patch in order to make the upgrade tests pass. There still are other result set classes that use generated methods to produce row templates, so the code that generates the code is not removed in this patch. I'll post other patches later to fix the rest of the result set tree and remove the generated code. This patch adds a new class, called ExecRowBuilder, which can be used to produce and reset candidate rows used by the scan result sets. Its build() method creates an instance of the correct sub-type of ExecRow with SQL null values in the columns used by the scan. This is what was previously done in the generated constructor for the activation. Its reset() method replaces the existing values in the row with fresh instances that represent SQL null of the correct type, which is the same as the generated methods previously did. Instead of passing a reference to the generated method into the ScanResultSet constructor, the compiler now stores an ExecRowBuilder in GenericPreparedStatement.savedObjects and passes the array index as an argument to the constructor. Details: java/engine/org/apache/derby/iapi/sql/execute/ExecRowBuilder.java New class, as described above. java/engine/org/apache/derby/iapi/types/DataTypeDescriptor.java As mentioned in an earlier comment, DTD's deserialization wouldn't fully restore the state of the original instance if it represented a user-defined type. Fixed by using TypeId.getUserDefinedTypeId() instead of TypeId.getBuiltInTypeId() if the type is user-defined. This was needed in order to get ExecRowBuilder to deserialize successfully when the query accessed UDTs. java/engine/org/apache/derby/impl/sql/compile/ResultColumnList.java Added a method buildRowTemplate() that creates an ExecRowBuilder of the right shape. The logic is essentially the same as in the existing ResultColumnList.generateHolder() method, except that generateHolder() produces a byte-code representation of how to produce the row, and buildRowTemplate() produces a Java data structure holding the same information. Factored out shared logic from generateHolder() and buildRowTemplate() into newRowLocationTemplate(). java/engine/org/apache/derby/impl/sql/compile/FromBaseTable.java Use RCL.buildRowTemplate() instead of RCL.generateHolder() when creating constructor arguments for a sub-class of ScanResultSet. java/engine/org/apache/derby/iapi/sql/compile/JoinStrategy.java java/engine/org/apache/derby/impl/sql/compile/BaseJoinStrategy.java java/engine/org/apache/derby/impl/sql/compile/HashJoinStrategy.java java/engine/org/apache/derby/impl/sql/compile/NestedLoopJoinStrategy.java Changed signatures to allow passing array index into GPS.savedObjects instead of generated method as scan parameter. java/engine/org/apache/derby/impl/sql/execute/ScanResultSet.java Create the candidate row using the passed-in ExecRowBuilder, and also use the ExecRowBuilder to reset the candidate row. java/engine/org/apache/derby/impl/sql/execute/TableScanResultSet.java Changed signature of constructor to index into GPS.savedObjects instead of generated method. Use the ExecRowBuilder to reset the candidate row. java/engine/org/apache/derby/iapi/sql/execute/ResultSetFactory.java java/engine/org/apache/derby/impl/sql/execute/GenericResultSetFactory.java java/engine/org/apache/derby/impl/sql/execute/BulkTableScanResultSet.java java/engine/org/apache/derby/impl/sql/execute/DependentResultSet.java java/engine/org/apache/derby/impl/sql/execute/DistinctScanResultSet.java java/engine/org/apache/derby/impl/sql/execute/LastIndexKeyResultSet.java java/engine/org/apache/derby/impl/sql/execute/MultiProbeTableScanResultSet.java More signature changes to allow passing the GPS.savedObjects index instead of the generated method. java/engine/org/apache/derby/impl/sql/compile/InsertNode.java java/engine/org/apache/derby/impl/sql/execute/InsertResultSet.java Bulk insert creates a BulkTableScanResultSet at run-time and needs a way to pass a row template to the scan. Previously, it did this by wrapping a row instance in a class that implemented the GeneratedMethod interface, although it didn't actually represent a generated method. Now, since BulkTableScanResultSet expects an index into GPS.savedObjects, we instead save an ExecRowBuilder instance when compiling the bulk insert, and let InsertResultSet pass the array index to getBulkTableScanResultSet(). InsertNode also needed a fix in requestBulkInsert(). It does some bind logic in the optimize phase (legitimately, because of DERBY-4789 ), but forgets to update the bulkInsert field which is usually set during bind. We now set bulkInsert to true also when this method is called, so that we can use the field in the code generation phase. java/engine/org/apache/derby/impl/sql/compile/DeleteNode.java java/engine/org/apache/derby/impl/sql/compile/UpdateNode.java java/engine/org/apache/derby/impl/sql/execute/DeleteConstantAction.java java/engine/org/apache/derby/impl/sql/execute/GenericConstantActionFactory.java java/engine/org/apache/derby/impl/sql/execute/InsertConstantAction.java java/engine/org/apache/derby/impl/sql/execute/UpdatableVTIConstantAction.java java/engine/org/apache/derby/impl/sql/execute/WriteCursorConstantAction.java Since InsertResultSet now gets its row template added to the plan at compile-time, remove the logic to add it to and get it from the constant action. This logic was also present for update and delete, although it was only used in bulk insert. Remove for update and delete too.
          Knut Anders Hatlen made changes -
          Attachment d6003-4a-scanresultset.diff [ 12555776 ]
          Hide
          Knut Anders Hatlen added a comment -

          Attaching a new patch (d6003-3b-downgrade-workaround-in-tests.diff) which provides an alternative to the 3a patch for working around the downgrade issues.

          Instead of making the engine stop storing SPS plans in the database when running in soft upgrade potentially from one of the affected versions, this patch adds the workaround to the upgrade tests.

          The upgrade tests already have workarounds for the downgrade bug, but those workarounds are spread across multiple test cases. Also, this issue will change the format for more trigger plans, so that most triggers will hit the bug.

          The patch therefore removes the various workarounds for DERBY-4835, DERBY-5105, DERBY-5263 and DERBY-5289, and replaces them with one centralized workaround. By centralizing the workaround, it should be safe to add more trigger test cases to the upgrade tests later without worrying abut the downgrade issue.

          The new workaround adds a test case that runs last in the soft upgrade phase if the old version suffers from DERBY-4835 or DERBY-5289. The new test case calls a stored procedure that clears all SPS plans in SYS.SYSSTATEMENTS. This should make it safe to boot the database again in the post soft upgrade phase, as there won't be any stored plans that cause deserialization issues.

          I was hoping to use the new SYSCS_UTIL.SYSCS_INVALIDATE_STORED_STATEMENTS procedure for this. Unfortunately, it only invalidates the stored statements in memory, and leaves their disk representation untouched, so I had to let the test define its own procedure to do this. I also had to expose the clearSPSPlans() methods through the DataDictionary interface so that the test procedure could call it.

          This alternative patch avoids the potential performance degradation for users running soft upgrade from 10.8 or earlier, as it's worked around entirely in test code, with the exception of the addition of the one method to the DataDictionary interface. If there are no objections to the approach, I intend to go for this patch and drop the 3a patch.

          Show
          Knut Anders Hatlen added a comment - Attaching a new patch (d6003-3b-downgrade-workaround-in-tests.diff) which provides an alternative to the 3a patch for working around the downgrade issues. Instead of making the engine stop storing SPS plans in the database when running in soft upgrade potentially from one of the affected versions, this patch adds the workaround to the upgrade tests. The upgrade tests already have workarounds for the downgrade bug, but those workarounds are spread across multiple test cases. Also, this issue will change the format for more trigger plans, so that most triggers will hit the bug. The patch therefore removes the various workarounds for DERBY-4835 , DERBY-5105 , DERBY-5263 and DERBY-5289 , and replaces them with one centralized workaround. By centralizing the workaround, it should be safe to add more trigger test cases to the upgrade tests later without worrying abut the downgrade issue. The new workaround adds a test case that runs last in the soft upgrade phase if the old version suffers from DERBY-4835 or DERBY-5289 . The new test case calls a stored procedure that clears all SPS plans in SYS.SYSSTATEMENTS. This should make it safe to boot the database again in the post soft upgrade phase, as there won't be any stored plans that cause deserialization issues. I was hoping to use the new SYSCS_UTIL.SYSCS_INVALIDATE_STORED_STATEMENTS procedure for this. Unfortunately, it only invalidates the stored statements in memory, and leaves their disk representation untouched, so I had to let the test define its own procedure to do this. I also had to expose the clearSPSPlans() methods through the DataDictionary interface so that the test procedure could call it. This alternative patch avoids the potential performance degradation for users running soft upgrade from 10.8 or earlier, as it's worked around entirely in test code, with the exception of the addition of the one method to the DataDictionary interface. If there are no objections to the approach, I intend to go for this patch and drop the 3a patch.
          Knut Anders Hatlen made changes -
          Hide
          Knut Anders Hatlen added a comment -

          I wasn't quite happy with the approach in the 3b patch either, as I didn't like having to expose the clearSPSPlans() method in the DataDictionary interface just for the tests.

          So I went back and tried to find out why the new SYSCS_INVALIDATE_STORED_STATEMENTS procedure didn't do the trick. The procedure calls SPSDescriptor.makeInvalid() on all the SPSs in SYS.SYSSTATEMENTS. That method sets the VALID to false in the system table, but the actual plan is left there.

          The procedure could potentially help with more bugs if it also cleared the plan. The statement is going to be recompiled on the next execution anyways, so there's no value in keeping the plan in the system table.

          The attached patch d6003-3c-downgrade-with-stored-proc.diff implements that approach and makes SPSDescriptor.makeInvalid() change both the VALID column and the CONSTANTSTATE column (which holds the plan). It adds the same workaround to the upgrade tests as the 3b patch, only that it calls SYSCS_INVALIDATE_STORED_STATEMENT instead of DataDictionary.clearSPSPlans().

          The patch makes the upgrade tests run cleanly in combination with the 4a patch. It has the following advantages over the 3b patch:

          • No new interface methods in the DataDictionary
          • It makes SYSCS_INVALIDATE_STORED_STATEMENT useful for a larger class of bugs
          Show
          Knut Anders Hatlen added a comment - I wasn't quite happy with the approach in the 3b patch either, as I didn't like having to expose the clearSPSPlans() method in the DataDictionary interface just for the tests. So I went back and tried to find out why the new SYSCS_INVALIDATE_STORED_STATEMENTS procedure didn't do the trick. The procedure calls SPSDescriptor.makeInvalid() on all the SPSs in SYS.SYSSTATEMENTS. That method sets the VALID to false in the system table, but the actual plan is left there. The procedure could potentially help with more bugs if it also cleared the plan. The statement is going to be recompiled on the next execution anyways, so there's no value in keeping the plan in the system table. The attached patch d6003-3c-downgrade-with-stored-proc.diff implements that approach and makes SPSDescriptor.makeInvalid() change both the VALID column and the CONSTANTSTATE column (which holds the plan). It adds the same workaround to the upgrade tests as the 3b patch, only that it calls SYSCS_INVALIDATE_STORED_STATEMENT instead of DataDictionary.clearSPSPlans(). The patch makes the upgrade tests run cleanly in combination with the 4a patch. It has the following advantages over the 3b patch: No new interface methods in the DataDictionary It makes SYSCS_INVALIDATE_STORED_STATEMENT useful for a larger class of bugs
          Knut Anders Hatlen made changes -
          Hide
          Knut Anders Hatlen added a comment -

          I went ahead and committed the 3c patch with revision 1418296 and the 4a patch with revision 1418297.

          Show
          Knut Anders Hatlen added a comment - I went ahead and committed the 3c patch with revision 1418296 and the 4a patch with revision 1418297.
          Knut Anders Hatlen made changes -
          Issue & fix info Patch Available [ 10102 ]
          Hide
          Knut Anders Hatlen added a comment -

          Attaching d6003-5a-sort-vti-aggregate-window.diff which makes the
          result sets for distinct scans, VTIs, aggregations and window
          functions use ExecRowBuilder instead of generated methods.

          The code that creates the generated method is still there, since it's
          still needed by IndexToBaseRowNode. I plan to address that in the next
          patch.

          All the regression tests ran cleanly with the 5a patch.

          Details:

          • java/engine/org/apache/derby/iapi/sql/execute/ResultSetFactory.java
          • java/engine/org/apache/derby/impl/sql/execute/GenericResultSetFactory.java

          Signature changes to allow callers to pass a reference to a saved
          object (the ExecRowBuilder) instead of a generated method.

          • java/engine/org/apache/derby/impl/sql/compile/DistinctNode.java
          • java/engine/org/apache/derby/impl/sql/compile/FromVTI.java
          • java/engine/org/apache/derby/impl/sql/compile/GroupByNode.java
          • java/engine/org/apache/derby/impl/sql/compile/OrderByList.java
          • java/engine/org/apache/derby/impl/sql/compile/WindowResultSetNode.java

          Push a reference to an ExecRowBuilder as argument to the result set
          creation method instead of generating a row allocator method.

          • java/engine/org/apache/derby/impl/sql/execute/DistinctGroupedAggregateResultSet.java
          • java/engine/org/apache/derby/impl/sql/execute/DistinctScalarAggregateResultSet.java
          • java/engine/org/apache/derby/impl/sql/execute/GenericAggregateResultSet.java
          • java/engine/org/apache/derby/impl/sql/execute/GroupedAggregateResultSet.java
          • java/engine/org/apache/derby/impl/sql/execute/ScalarAggregateResultSet.java
          • java/engine/org/apache/derby/impl/sql/execute/SortResultSet.java
          • java/engine/org/apache/derby/impl/sql/execute/VTIResultSet.java
          • java/engine/org/apache/derby/impl/sql/execute/WindowResultSet.java

          Use the ExecRowBuilder to produce the row template instead of invoking
          the generated method.

          • java/engine/org/apache/derby/impl/sql/compile/ResultColumnList.java

          Remove the two-argument generateHolder() method, as it is no longer used.

          Show
          Knut Anders Hatlen added a comment - Attaching d6003-5a-sort-vti-aggregate-window.diff which makes the result sets for distinct scans, VTIs, aggregations and window functions use ExecRowBuilder instead of generated methods. The code that creates the generated method is still there, since it's still needed by IndexToBaseRowNode. I plan to address that in the next patch. All the regression tests ran cleanly with the 5a patch. Details: java/engine/org/apache/derby/iapi/sql/execute/ResultSetFactory.java java/engine/org/apache/derby/impl/sql/execute/GenericResultSetFactory.java Signature changes to allow callers to pass a reference to a saved object (the ExecRowBuilder) instead of a generated method. java/engine/org/apache/derby/impl/sql/compile/DistinctNode.java java/engine/org/apache/derby/impl/sql/compile/FromVTI.java java/engine/org/apache/derby/impl/sql/compile/GroupByNode.java java/engine/org/apache/derby/impl/sql/compile/OrderByList.java java/engine/org/apache/derby/impl/sql/compile/WindowResultSetNode.java Push a reference to an ExecRowBuilder as argument to the result set creation method instead of generating a row allocator method. java/engine/org/apache/derby/impl/sql/execute/DistinctGroupedAggregateResultSet.java java/engine/org/apache/derby/impl/sql/execute/DistinctScalarAggregateResultSet.java java/engine/org/apache/derby/impl/sql/execute/GenericAggregateResultSet.java java/engine/org/apache/derby/impl/sql/execute/GroupedAggregateResultSet.java java/engine/org/apache/derby/impl/sql/execute/ScalarAggregateResultSet.java java/engine/org/apache/derby/impl/sql/execute/SortResultSet.java java/engine/org/apache/derby/impl/sql/execute/VTIResultSet.java java/engine/org/apache/derby/impl/sql/execute/WindowResultSet.java Use the ExecRowBuilder to produce the row template instead of invoking the generated method. java/engine/org/apache/derby/impl/sql/compile/ResultColumnList.java Remove the two-argument generateHolder() method, as it is no longer used.
          Knut Anders Hatlen made changes -
          Attachment d6003-5a-sort-vti-aggregate-window.diff [ 12559856 ]
          Knut Anders Hatlen made changes -
          Issue & fix info Patch Available [ 10102 ]
          Hide
          Knut Anders Hatlen added a comment -

          Committed 5a to trunk with revision 1420579.

          Show
          Knut Anders Hatlen added a comment - Committed 5a to trunk with revision 1420579.
          Knut Anders Hatlen made changes -
          Issue & fix info Patch Available [ 10102 ]
          Hide
          Knut Anders Hatlen added a comment -

          Attaching d6003-6a-index-to-base-row.diff which makes IndexRowToBaseRowResultSet use an ExecRowBuilder instead of a generated method to create the row template used in the scan.

          This was the last remaining caller of ResultColumnList.generateHolder(), so the patch also removes the code that creates byte code for allocating row templates.

          IndexRowToBaseRowResultSet's invocation of generateHolder() used some extra logic to skip columns already fetched from the index. The patch merges this logic into the buildRowTemplate() method, which now takes an extra parameter to tell whether or not to skip certain columns.

          All regression tests ran cleanly with the patch.

          Patch details:

          • java/engine/org/apache/derby/impl/sql/compile/ResultColumnList.java

          Remove the generateHolder() and generateHolderMethod() methods.

          Move the missing logic for skipping index columns into the buildRowTemplate() method, and add extra parameter.

          Add shorthand variant of buildRowTemplate() that takes no arguments, since most callers don't have anything to pass in.

          • java/engine/org/apache/derby/impl/sql/compile/IndexToBaseRowNode.java

          Push a reference to an ExecRowBuilder instead of a generated method as argument to the result set creation method.

          • java/engine/org/apache/derby/impl/sql/execute/IndexRowToBaseRowResultSet.java

          Use the saved ExecRowBuilder instead of a generated method when creating the row template in the constructor.

          • java/engine/org/apache/derby/iapi/sql/execute/ResultSetFactory.java
          • java/engine/org/apache/derby/impl/sql/execute/GenericResultSetFactory.java

          Change method signatures to take saved object index instead of generated method.

          • java/engine/org/apache/derby/impl/sql/compile/DistinctNode.java
          • java/engine/org/apache/derby/impl/sql/compile/FromBaseTable.java
          • java/engine/org/apache/derby/impl/sql/compile/FromVTI.java
          • java/engine/org/apache/derby/impl/sql/compile/GroupByNode.java
          • java/engine/org/apache/derby/impl/sql/compile/OrderByList.java
          • java/engine/org/apache/derby/impl/sql/compile/WindowResultSetNode.java

          Update calls to ResultColumnList.buildRowTemplate() with the correct number of arguments after the signature change to allow skipping index columns.

          Show
          Knut Anders Hatlen added a comment - Attaching d6003-6a-index-to-base-row.diff which makes IndexRowToBaseRowResultSet use an ExecRowBuilder instead of a generated method to create the row template used in the scan. This was the last remaining caller of ResultColumnList.generateHolder(), so the patch also removes the code that creates byte code for allocating row templates. IndexRowToBaseRowResultSet's invocation of generateHolder() used some extra logic to skip columns already fetched from the index. The patch merges this logic into the buildRowTemplate() method, which now takes an extra parameter to tell whether or not to skip certain columns. All regression tests ran cleanly with the patch. Patch details: java/engine/org/apache/derby/impl/sql/compile/ResultColumnList.java Remove the generateHolder() and generateHolderMethod() methods. Move the missing logic for skipping index columns into the buildRowTemplate() method, and add extra parameter. Add shorthand variant of buildRowTemplate() that takes no arguments, since most callers don't have anything to pass in. java/engine/org/apache/derby/impl/sql/compile/IndexToBaseRowNode.java Push a reference to an ExecRowBuilder instead of a generated method as argument to the result set creation method. java/engine/org/apache/derby/impl/sql/execute/IndexRowToBaseRowResultSet.java Use the saved ExecRowBuilder instead of a generated method when creating the row template in the constructor. java/engine/org/apache/derby/iapi/sql/execute/ResultSetFactory.java java/engine/org/apache/derby/impl/sql/execute/GenericResultSetFactory.java Change method signatures to take saved object index instead of generated method. java/engine/org/apache/derby/impl/sql/compile/DistinctNode.java java/engine/org/apache/derby/impl/sql/compile/FromBaseTable.java java/engine/org/apache/derby/impl/sql/compile/FromVTI.java java/engine/org/apache/derby/impl/sql/compile/GroupByNode.java java/engine/org/apache/derby/impl/sql/compile/OrderByList.java java/engine/org/apache/derby/impl/sql/compile/WindowResultSetNode.java Update calls to ResultColumnList.buildRowTemplate() with the correct number of arguments after the signature change to allow skipping index columns.
          Knut Anders Hatlen made changes -
          Attachment d6003-6a-index-to-base-row.diff [ 12560761 ]
          Knut Anders Hatlen made changes -
          Issue & fix info Patch Available [ 10102 ]
          Hide
          Knut Anders Hatlen added a comment -

          Committed the 6a patch to trunk, revision 1426151.

          I haven't found any other places where we generate code for row templates, so I'm resolving the issue.

          Show
          Knut Anders Hatlen added a comment - Committed the 6a patch to trunk, revision 1426151. I haven't found any other places where we generate code for row templates, so I'm resolving the issue.
          Knut Anders Hatlen made changes -
          Status In Progress [ 3 ] Resolved [ 5 ]
          Issue & fix info Patch Available [ 10102 ]
          Fix Version/s 10.10.0.0 [ 12321550 ]
          Resolution Fixed [ 1 ]
          Hide
          Dag H. Wanvik added a comment -

          Thanks for working on reducing the amount of generated code, Knut! This is a tricky area and I haven't been able to follow this issue closely, but your approach seems good to me.

          Show
          Dag H. Wanvik added a comment - Thanks for working on reducing the amount of generated code, Knut! This is a tricky area and I haven't been able to follow this issue closely, but your approach seems good to me.
          Knut Anders Hatlen made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Gavin made changes -
          Workflow jira [ 12735495 ] Default workflow, editable Closed status [ 12802890 ]
          Knut Anders Hatlen made changes -
          Link This issue relates to DERBY-6314 [ DERBY-6314 ]
          Transition Time In Source Status Execution Times Last Executer Last Execution Date
          Open Open In Progress In Progress
          4d 5h 50m 1 Knut Anders Hatlen 27/Nov/12 12:53
          In Progress In Progress Resolved Resolved
          29d 21h 11m 1 Knut Anders Hatlen 27/Dec/12 10:05
          Resolved Resolved Closed Closed
          171d 23h 21m 1 Knut Anders Hatlen 17/Jun/13 10:27

            People

            • Assignee:
              Knut Anders Hatlen
              Reporter:
              Knut Anders Hatlen
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development