Derby
  1. Derby
  2. DERBY-5947

Factor out common code from generated classes

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 10.10.1.1
    • Component/s: SQL
    • Labels:
      None
    • Urgency:
      Normal

      Description

      There's some code that's added to all classes generated by Derby's query compiler. For example, there are three static fields that contain statistics used to check if the plan is stale, and there are getter and setter methods for each of the three fields. The fields and their accessor methods take up 468 bytes in every generated class.

      We should see if we can factor out some of this code so that there is a single shared copy in BaseActivation. Advantages would be: less complicated byte-code generation, less memory occupied by generated classes in the statement cache, smaller disk footprint for stored prepared statements.

      1. d5947-5a-row-count-stats.diff
        31 kB
        Knut Anders Hatlen
      2. values1-after-4a.txt
        2 kB
        Knut Anders Hatlen
      3. d5947-4a-authorization.diff
        3 kB
        Knut Anders Hatlen
      4. natural-join-after-3a.txt
        3 kB
        Knut Anders Hatlen
      5. values1-after-3a.txt
        2 kB
        Knut Anders Hatlen
      6. d5947-3a-init-rs.diff
        19 kB
        Knut Anders Hatlen
      7. values1-after-2a.txt
        2 kB
        Knut Anders Hatlen
      8. d5947-2a-execute-method.diff
        10 kB
        Knut Anders Hatlen
      9. values1-after-1a.txt
        2 kB
        Knut Anders Hatlen
      10. d5947-1a-remove-common-methods.diff
        24 kB
        Knut Anders Hatlen
      11. natural-join-decompiled.txt
        5 kB
        Knut Anders Hatlen
      12. values1-decompiled.txt
        3 kB
        Knut Anders Hatlen

        Issue Links

          Activity

          Hide
          Knut Anders Hatlen added a comment -

          Committed revision 1413586. Resolving the issue again.

          Show
          Knut Anders Hatlen added a comment - Committed revision 1413586. Resolving the issue again.
          Hide
          Knut Anders Hatlen added a comment -

          Attaching d5947-5a-row-count-stats.diff, which moves the row count
          statistics from generated code to GenericPreparedStatement.

          It was trickier than I had expected to get the tests to run cleanly
          with the patch. I saw intermittent failures in the upgrade tests. They
          failed because they triggered an assert in store when booting the
          database for hard upgrade:

          Caused by: org.apache.derby.shared.common.sanity.AssertFailure: ASSERT FAILED initSlotTable consistency check failed: slot 0 minimumRecordSize = 12 totalSpace = 12 recordPortionLength = 8 reservedCount = 4
          at org.apache.derby.shared.common.sanity.SanityManager.THROWASSERT(SanityManager.java:162)
          at org.apache.derby.shared.common.sanity.SanityManager.THROWASSERT(SanityManager.java:147)
          at org.apache.derby.impl.store.raw.data.StoredPage.initSlotTable(StoredPage.java:2253)
          at org.apache.derby.impl.store.raw.data.StoredPage.initFromData(StoredPage.java:849)
          at org.apache.derby.impl.store.raw.data.CachedPage.setIdentity(CachedPage.java:213)
          at org.apache.derby.impl.services.cache.ConcurrentCache.find(ConcurrentCache.java:295)
          at org.apache.derby.impl.store.raw.data.FileContainer.getUserPage(FileContainer.java:2540)
          at org.apache.derby.impl.store.raw.data.FileContainer.getNextHeadPage(FileContainer.java:2858)
          at org.apache.derby.impl.store.raw.data.BaseContainer.getNextPage(BaseContainer.java:516)
          at org.apache.derby.impl.store.raw.data.BaseContainerHandle.getNextPage(BaseContainerHandle.java:364)
          at org.apache.derby.impl.store.access.conglomerate.GenericScanController.positionAtNextPage(GenericScanController.java:499)
          at org.apache.derby.impl.store.access.conglomerate.GenericScanController.fetchRows(GenericScanController.java:827)
          at org.apache.derby.impl.store.access.heap.HeapScan.fetchNext(HeapScan.java:238)
          at org.apache.derby.impl.sql.catalog.DataDictionaryImpl.clearSPSPlans(DataDictionaryImpl.java:4638)
          at org.apache.derby.impl.sql.catalog.DD_Version.handleMinorRevisionChange(DD_Version.java:555)
          at org.apache.derby.impl.sql.catalog.DD_Version.upgradeIfNeeded(DD_Version.java:250)
          (...)

          It only happened occasionally, and it looked like the order in which
          the test cases ran determined whether or not the test run succeeded. I
          managed to isolate it further and reliably reproduce it using the
          following steps:

          1) Create an empty database with Derby 10.4.2.0 or earlier

          2) Boot the database with trunk+patch (soft-upgrade) and create two
          tables and a trigger:

          create table a(x int);
          create table b(y int);
          create trigger t after update on b
          referencing new as n for each row
          insert into a values (n.y);

          3) Boot the database once with any Derby version 10.6.2.1 or earlier

          4) Boot the database again with trunk, and see it fail with the above
          mentioned assert failure

          The failing assert was added in DERBY-4577. And indeed it only
          reproduces if the downgrade step (3) is performed with a version that
          suffers from DERBY-4577.

          So what seems to be happening, is the following:

          When the trigger is created in step (2) with trunk+patch, the trigger
          plan is stored in a column in SYS.SYSSTATEMENTS. The plan is large
          enough to make the row overflow to another page. When downgrading the
          database in step (3), the old Derby version will go through all rows
          in SYS.SYSSTATEMENTS and set the column that holds compiled plans to
          NULL. Apparently, when shrinking the row, the old version runs into
          the condition that caused DERBY-4577, and ends up setting aside too
          little space for future expansion of the row. When subsequently
          booting the database again with trunk, the assert in StoredPage
          recognizes that this has happened, and fails.

          If a version that includes the fix for DERBY-4577 is used in step 3,
          the corruption doesn't happen, and trunk doesn't complain when booting
          the database later.

          Although I've only seen the assert failure with the patch, I don't
          think it's a problem introduced by the patch. It's just that the new
          and smaller trigger plans happen to make the SYSSTATEMENT rows
          produced by these test cases of the exact right size to hit
          DERBY-4577. Even without the patch, it might be possible to construct
          a SYSSTATEMENTS table that trips over the same problem.

          The patch works around this problem by forcing the problematic test
          cases to run in a known good order. I've run the full upgrade test
          suite successfully ten times with the workaround. Without the
          workaround it would fail approximately every other run (with Java 7,
          that is, where the test ordering varies between runs).

          Finally, a description of what the patch does:

          java/engine/org/apache/derby/iapi/reference/ClassName.java
          java/engine/org/apache/derby/iapi/services/compiler/ClassBuilder.java
          java/engine/org/apache/derby/iapi/services/compiler/MethodBuilder.java
          java/engine/org/apache/derby/impl/services/bytecode/BCClass.java
          java/engine/org/apache/derby/impl/services/bytecode/BCMethod.java
          java/engine/org/apache/derby/impl/sql/compile/ActivationClassBuilder.java
          java/engine/org/apache/derby/impl/sql/compile/ExpressionClassBuilder.java
          java/engine/org/apache/derby/impl/sql/compile/StatementNode.java

          Removed code added by the 1a patch for manipulating static fields and
          static initializers for the row count statistics.

          java/engine/org/apache/derby/impl/sql/GenericPreparedStatement.java

          Added fields to hold execution count, result set row count and stale
          plan interval for a statement.

          The fields were put in an inner class instead of directly inside
          GenericPreparedStatement. This was done because triggers clone the
          GenericPreparedStatement on each execution, so the statistics needed
          to be shared among multiple GPS instances. For all other statements
          than triggers, the statistics instance is private to one GPS instance.

          Also, since the lifespan of a GPS instance is longer than that of the
          generated activation class, code to reset the statistics on
          recompilation was added.

          org/apache/derby/iapi/sql/execute/ExecPreparedStatement.java

          Interface methods that allowed calls to the new methods in
          GenericPreparedStatement from BaseActivation.

          java/engine/org/apache/derby/impl/sql/execute/BaseActivation.java

          Removed inner class that held row count statistics.

          Made shouldWeCheckRowCounts() and informOfRowCount() save the
          statistics in the GenericPreparedStatement instead of the now removed
          inner class.

          java/engine/org/apache/derby/impl/sql/execute/ConstantActionActivation.java

          Removed a method that's no longer used.

          java/testing/org/apache/derbyTesting/functionTests/tests/upgradeTests/BasicSetup.java

          Made the tests run in a fixed order, for reasons explained above.

          Show
          Knut Anders Hatlen added a comment - Attaching d5947-5a-row-count-stats.diff, which moves the row count statistics from generated code to GenericPreparedStatement. It was trickier than I had expected to get the tests to run cleanly with the patch. I saw intermittent failures in the upgrade tests. They failed because they triggered an assert in store when booting the database for hard upgrade: Caused by: org.apache.derby.shared.common.sanity.AssertFailure: ASSERT FAILED initSlotTable consistency check failed: slot 0 minimumRecordSize = 12 totalSpace = 12 recordPortionLength = 8 reservedCount = 4 at org.apache.derby.shared.common.sanity.SanityManager.THROWASSERT(SanityManager.java:162) at org.apache.derby.shared.common.sanity.SanityManager.THROWASSERT(SanityManager.java:147) at org.apache.derby.impl.store.raw.data.StoredPage.initSlotTable(StoredPage.java:2253) at org.apache.derby.impl.store.raw.data.StoredPage.initFromData(StoredPage.java:849) at org.apache.derby.impl.store.raw.data.CachedPage.setIdentity(CachedPage.java:213) at org.apache.derby.impl.services.cache.ConcurrentCache.find(ConcurrentCache.java:295) at org.apache.derby.impl.store.raw.data.FileContainer.getUserPage(FileContainer.java:2540) at org.apache.derby.impl.store.raw.data.FileContainer.getNextHeadPage(FileContainer.java:2858) at org.apache.derby.impl.store.raw.data.BaseContainer.getNextPage(BaseContainer.java:516) at org.apache.derby.impl.store.raw.data.BaseContainerHandle.getNextPage(BaseContainerHandle.java:364) at org.apache.derby.impl.store.access.conglomerate.GenericScanController.positionAtNextPage(GenericScanController.java:499) at org.apache.derby.impl.store.access.conglomerate.GenericScanController.fetchRows(GenericScanController.java:827) at org.apache.derby.impl.store.access.heap.HeapScan.fetchNext(HeapScan.java:238) at org.apache.derby.impl.sql.catalog.DataDictionaryImpl.clearSPSPlans(DataDictionaryImpl.java:4638) at org.apache.derby.impl.sql.catalog.DD_Version.handleMinorRevisionChange(DD_Version.java:555) at org.apache.derby.impl.sql.catalog.DD_Version.upgradeIfNeeded(DD_Version.java:250) (...) It only happened occasionally, and it looked like the order in which the test cases ran determined whether or not the test run succeeded. I managed to isolate it further and reliably reproduce it using the following steps: 1) Create an empty database with Derby 10.4.2.0 or earlier 2) Boot the database with trunk+patch (soft-upgrade) and create two tables and a trigger: create table a(x int); create table b(y int); create trigger t after update on b referencing new as n for each row insert into a values (n.y); 3) Boot the database once with any Derby version 10.6.2.1 or earlier 4) Boot the database again with trunk, and see it fail with the above mentioned assert failure The failing assert was added in DERBY-4577 . And indeed it only reproduces if the downgrade step (3) is performed with a version that suffers from DERBY-4577 . So what seems to be happening, is the following: When the trigger is created in step (2) with trunk+patch, the trigger plan is stored in a column in SYS.SYSSTATEMENTS. The plan is large enough to make the row overflow to another page. When downgrading the database in step (3), the old Derby version will go through all rows in SYS.SYSSTATEMENTS and set the column that holds compiled plans to NULL. Apparently, when shrinking the row, the old version runs into the condition that caused DERBY-4577 , and ends up setting aside too little space for future expansion of the row. When subsequently booting the database again with trunk, the assert in StoredPage recognizes that this has happened, and fails. If a version that includes the fix for DERBY-4577 is used in step 3, the corruption doesn't happen, and trunk doesn't complain when booting the database later. Although I've only seen the assert failure with the patch, I don't think it's a problem introduced by the patch. It's just that the new and smaller trigger plans happen to make the SYSSTATEMENT rows produced by these test cases of the exact right size to hit DERBY-4577 . Even without the patch, it might be possible to construct a SYSSTATEMENTS table that trips over the same problem. The patch works around this problem by forcing the problematic test cases to run in a known good order. I've run the full upgrade test suite successfully ten times with the workaround. Without the workaround it would fail approximately every other run (with Java 7, that is, where the test ordering varies between runs). Finally, a description of what the patch does: java/engine/org/apache/derby/iapi/reference/ClassName.java java/engine/org/apache/derby/iapi/services/compiler/ClassBuilder.java java/engine/org/apache/derby/iapi/services/compiler/MethodBuilder.java java/engine/org/apache/derby/impl/services/bytecode/BCClass.java java/engine/org/apache/derby/impl/services/bytecode/BCMethod.java java/engine/org/apache/derby/impl/sql/compile/ActivationClassBuilder.java java/engine/org/apache/derby/impl/sql/compile/ExpressionClassBuilder.java java/engine/org/apache/derby/impl/sql/compile/StatementNode.java Removed code added by the 1a patch for manipulating static fields and static initializers for the row count statistics. java/engine/org/apache/derby/impl/sql/GenericPreparedStatement.java Added fields to hold execution count, result set row count and stale plan interval for a statement. The fields were put in an inner class instead of directly inside GenericPreparedStatement. This was done because triggers clone the GenericPreparedStatement on each execution, so the statistics needed to be shared among multiple GPS instances. For all other statements than triggers, the statistics instance is private to one GPS instance. Also, since the lifespan of a GPS instance is longer than that of the generated activation class, code to reset the statistics on recompilation was added. org/apache/derby/iapi/sql/execute/ExecPreparedStatement.java Interface methods that allowed calls to the new methods in GenericPreparedStatement from BaseActivation. java/engine/org/apache/derby/impl/sql/execute/BaseActivation.java Removed inner class that held row count statistics. Made shouldWeCheckRowCounts() and informOfRowCount() save the statistics in the GenericPreparedStatement instead of the now removed inner class. java/engine/org/apache/derby/impl/sql/execute/ConstantActionActivation.java Removed a method that's no longer used. java/testing/org/apache/derbyTesting/functionTests/tests/upgradeTests/BasicSetup.java Made the tests run in a fixed order, for reasons explained above.
          Hide
          Knut Anders Hatlen added a comment -

          The first patch for this issue reduced the amount of generated code needed for maintaining the execution count and row count information used to determine whether it's time to check if the plan is stale. But now it struck me that those statistics could probably be maintained in the GenericPreparedStatement instance that wraps the generated class. That is, outside of the generated code entirely. Reopening to investigate.

          Show
          Knut Anders Hatlen added a comment - The first patch for this issue reduced the amount of generated code needed for maintaining the execution count and row count information used to determine whether it's time to check if the plan is stale. But now it struck me that those statistics could probably be maintained in the GenericPreparedStatement instance that wraps the generated class. That is, outside of the generated code entirely. Reopening to investigate.
          Hide
          Knut Anders Hatlen added a comment -

          Committed the 3a and 4a patches (r1400023 and r1400024).

          I don't have any other changes planned for this issue, so I'm marking it as resolved.

          Show
          Knut Anders Hatlen added a comment - Committed the 3a and 4a patches (r1400023 and r1400024). I don't have any other changes planned for this issue, so I'm marking it as resolved.
          Hide
          Knut Anders Hatlen added a comment -

          The 4a patch moves the authorization check for cursor result sets from generated code to CursorActivation.

          This change removed 414 bytes from the generated class for VALUES 1. (The reduction was mainly class names and method names in the constant pool, not so many instructions were removed.) The total reduction in size for a VALUES 1 statement is now 878 bytes (from 3408 bytes to 2530 bytes).

          Attached is also the decompiled class for VALUES 1 after the 4a patch.

          All the regression tests ran cleanly with the patch.

          Show
          Knut Anders Hatlen added a comment - The 4a patch moves the authorization check for cursor result sets from generated code to CursorActivation. This change removed 414 bytes from the generated class for VALUES 1. (The reduction was mainly class names and method names in the constant pool, not so many instructions were removed.) The total reduction in size for a VALUES 1 statement is now 878 bytes (from 3408 bytes to 2530 bytes). Attached is also the decompiled class for VALUES 1 after the 4a patch. All the regression tests ran cleanly with the patch.
          Hide
          Knut Anders Hatlen added a comment -

          Attaching d5947-3a-init-rs.diff, which moves the initialization of
          BaseActivation's resultSet field from the generated code to
          BaseActivation.execute().

          The only logic that's left in the generated doExecute() method is code
          that reinitializes the generated data-structures before each
          execution. Because of this, I renamed the doExecute() method to
          reinit().

          BaseActivation is given an empty reinit() method so that those
          generated classes that don't need to do any extra reinitialization,
          don't need to lay out code for an empty method.

          Also, since BaseActivation.execute() now needs to call the generated
          method that creates the result set tree, an abstract method called
          createResultSet() has been added to BaseActivation. This method is
          implemented by all generated classes. It's the same method as the
          previously private fillResultSet() method, only that it has been
          renamed and made visible from BaseActivation.

          With these changes, the amount of code generated for a VALUES 1
          statement is reduced by yet another 234 bytes (total reduction: 464
          bytes). A VALUES 1 statement doesn't have to generate a reinit()
          method and benefits from the empty default method in BaseActivation. A
          statement that generates a reinit() method, like the join whose
          generated code I've attached before, the reduction will be a little
          smaller; 188 bytes (total: 418 bytes).

          As before, I'm attaching values1-after-3a.txt to show what the
          generated class for a VALUES 1 statement looks like now. I'm also
          attaching natural-join-after-3a.txt to show what the new reinit()
          method looks like.

          All the regression tests ran cleanly with the patch.

          Patch details:

          ActivationClassBuilder:

          • Removed generation of code that now lives in BaseActivation.execute().
          • In finishExecuteMethod(): Only complete the method if code has been
            added to it (some simple statements don't add code to it anymore),
            so that we don't have to generate an empty method.
          • In getCurrentSetup(): Use getter method instead of accessing the
            method builder directly, since the method builder is generated
            lazily now and the field could therefore be null.

          ExpressionClassBuilder:

          • Generate the reinit() method lazily to prevent generation of the
            method if no code is added to it.

          StatementNode:

          • Removed generation of code that now lives in BaseActivation.execute().
          • Changed name and visibility for the fillResultSet() method so that
            it matches the abstract method in BaseActivation.

          BaseActivation:

          • Added logic that used to be in the generated class. Note: In the
            generated classes, markAsTopResultSet() used to be called on each
            execution, but the new code only calls it once, when the result set
            is created. Since markAsTopResultSet() just sets a flag to true, and
            there is no code that ever changes it back to false, calling it once
            per result set should be enough.

          DMLModStatementNode:
          DeleteNode:
          InsertNode:
          UpdateNode:

          • Stop calling getExecuteMethod() unless the method builder is
            actually going to be used. Otherwise, a reinit() method will be
            created for all INSERT, UPDATE and DELETE statement, even if no code
            is added to it and they could use the default implementation in
            BaseActivation.
          Show
          Knut Anders Hatlen added a comment - Attaching d5947-3a-init-rs.diff, which moves the initialization of BaseActivation's resultSet field from the generated code to BaseActivation.execute(). The only logic that's left in the generated doExecute() method is code that reinitializes the generated data-structures before each execution. Because of this, I renamed the doExecute() method to reinit(). BaseActivation is given an empty reinit() method so that those generated classes that don't need to do any extra reinitialization, don't need to lay out code for an empty method. Also, since BaseActivation.execute() now needs to call the generated method that creates the result set tree, an abstract method called createResultSet() has been added to BaseActivation. This method is implemented by all generated classes. It's the same method as the previously private fillResultSet() method, only that it has been renamed and made visible from BaseActivation. With these changes, the amount of code generated for a VALUES 1 statement is reduced by yet another 234 bytes (total reduction: 464 bytes). A VALUES 1 statement doesn't have to generate a reinit() method and benefits from the empty default method in BaseActivation. A statement that generates a reinit() method, like the join whose generated code I've attached before, the reduction will be a little smaller; 188 bytes (total: 418 bytes). As before, I'm attaching values1-after-3a.txt to show what the generated class for a VALUES 1 statement looks like now. I'm also attaching natural-join-after-3a.txt to show what the new reinit() method looks like. All the regression tests ran cleanly with the patch. Patch details: ActivationClassBuilder: Removed generation of code that now lives in BaseActivation.execute(). In finishExecuteMethod(): Only complete the method if code has been added to it (some simple statements don't add code to it anymore), so that we don't have to generate an empty method. In getCurrentSetup(): Use getter method instead of accessing the method builder directly, since the method builder is generated lazily now and the field could therefore be null. ExpressionClassBuilder: Generate the reinit() method lazily to prevent generation of the method if no code is added to it. StatementNode: Removed generation of code that now lives in BaseActivation.execute(). Changed name and visibility for the fillResultSet() method so that it matches the abstract method in BaseActivation. BaseActivation: Added logic that used to be in the generated class. Note: In the generated classes, markAsTopResultSet() used to be called on each execution, but the new code only calls it once, when the result set is created. Since markAsTopResultSet() just sets a flag to true, and there is no code that ever changes it back to false, calling it once per result set should be enough. DMLModStatementNode: DeleteNode: InsertNode: UpdateNode: Stop calling getExecuteMethod() unless the method builder is actually going to be used. Otherwise, a reinit() method will be created for all INSERT, UPDATE and DELETE statement, even if no code is added to it and they could use the default implementation in BaseActivation.
          Hide
          Knut Anders Hatlen added a comment -

          Thanks, Dag. I've committed the 1a and 2a patches to trunk (r1399139 and r1399140).

          I think there is still some code that could be factored out from the generated doExecute() method. Initialization of the BaseActivation.resultSet field is one thing. Maybe the authorization checks could be moved from byte code to Java code as well. I'll give it a try.

          Show
          Knut Anders Hatlen added a comment - Thanks, Dag. I've committed the 1a and 2a patches to trunk (r1399139 and r1399140). I think there is still some code that could be factored out from the generated doExecute() method. Initialization of the BaseActivation.resultSet field is one thing. Maybe the authorization checks could be moved from byte code to Java code as well. I'll give it a try.
          Hide
          Dag H. Wanvik added a comment -

          This looks like good changes to me. Nice to be able to lift that shared code up into BaseActivation.
          The less logic there remains in the generated code, the better. Good to tweak away superfluous bytes in the generated code, too, should help the JIT +1

          Show
          Dag H. Wanvik added a comment - This looks like good changes to me. Nice to be able to lift that shared code up into BaseActivation. The less logic there remains in the generated code, the better. Good to tweak away superfluous bytes in the generated code, too, should help the JIT +1
          Hide
          Knut Anders Hatlen added a comment -

          Attaching d5947-2a-execute-method.diff, which builds on top of the 1a
          patch by factoring out some common code from the execute() methods of
          the generated classes.

          It also does a little massaging of the code that's still generated for
          the execute() method to make it more compact.

          The 2a patch reduces the size of each generated class by 91 bytes, or
          230 bytes in total when counting the additional reduction we get from
          the 1a patch.

          The file values1-after-2a.txt shows what a generated class for a
          VALUES 1 statement looks like when the patch is applied.

          Before the patch, there would be an execute() method like this one:

              public ResultSet execute() throws StandardException {
          	this.throwIfClosed("execute");
          	this.startExecution();
          	ResultSet resultset;
          	if (resultSet == null) {
          	    ResultSet resultset_3_ = fillResultSet();
          	    ace50d80a4x013ax6409x6343x000003196d400 var_ace50d80a4x013ax6409x6343x000003196d400_4_
          		= this;
          	    resultset
          		= var_ace50d80a4x013ax6409x6343x000003196d400_4_.resultSet
          		= resultset_3_;
          	} else
          	    resultset = resultSet;
          	((NoPutResultSet) resultset).markAsTopResultSet();
          	return resultset;
              }
          

          After the patch, there is a shorter doExecute() method instead of the
          original execute() method:

              protected ResultSet doExecute() throws StandardException {
          	ResultSet resultset = resultSet;
          	if (resultset == null) {
          	    ace50d80a4x013ax64d6xddb7x00000319bb880 var_ace50d80a4x013ax64d6xddb7x00000319bb880_3_
          		= this;
          	    resultset
          		= var_ace50d80a4x013ax64d6xddb7x00000319bb880_3_.resultSet
          		= var_ace50d80a4x013ax64d6xddb7x00000319bb880_3_
          		      .fillResultSet();
          	}
          	((NoPutResultSet) resultset).markAsTopResultSet();
          	return resultset;
              }
          

          The first two calls have been taken out of the generated method and
          have been moved to a shared execute() method in the super class. The
          shared method calls the generated doExecute() method.

          Additionally, the code in the generated method has been massaged so
          that it only looks up the value of the resultSet field once. That is,
          there is no longer an else branch that does a second look-up.

          All the regression tests ran cleanly with the patch. (Ran the full
          suite on JDK 7, and also the lang suite on OJEC to verify that the new
          layout of the generated classes didn't cause any class format problems
          on the small platforms.)

          Patch details:

          BaseActivation:

          • Added new abstract method, doExecute(), to be implemented by the
            generated classes.
          • Added implementation of Activation.execute() which contained the
            code that was previously added to every generated class. Finally, it
            calls the abstract doExecute() method to allow the generated classes
            to add logic to it.
          • Moved the body of the startExecution() method into execute() and
            removed it, since it's so small and there are no other callers of it
            after the changes in this patch.

          ConstantActionActivation:

          • Renamed the execute() method to doExecute() and removed the code
            that's identical to the code that now lives in BaseActivation's
            execute() method.

          ActivationClassBuilder:

          • Renamed the generated execute() method to doExecute().
          • Removed the code that generated the logic that now lives in
            BaseActivation.execute().

          StatementNode:

          • Updated the generated code to keep the value of the resultSet field
            on the stack to avoid a second look-up when it's non-null.
          • Updated the generated code to push the instance for the getField()
            operation earlier to avoid the need to swap operands.
          Show
          Knut Anders Hatlen added a comment - Attaching d5947-2a-execute-method.diff, which builds on top of the 1a patch by factoring out some common code from the execute() methods of the generated classes. It also does a little massaging of the code that's still generated for the execute() method to make it more compact. The 2a patch reduces the size of each generated class by 91 bytes, or 230 bytes in total when counting the additional reduction we get from the 1a patch. The file values1-after-2a.txt shows what a generated class for a VALUES 1 statement looks like when the patch is applied. Before the patch, there would be an execute() method like this one: public ResultSet execute() throws StandardException { this .throwIfClosed( "execute" ); this .startExecution(); ResultSet resultset; if (resultSet == null ) { ResultSet resultset_3_ = fillResultSet(); ace50d80a4x013ax6409x6343x000003196d400 var_ace50d80a4x013ax6409x6343x000003196d400_4_ = this ; resultset = var_ace50d80a4x013ax6409x6343x000003196d400_4_.resultSet = resultset_3_; } else resultset = resultSet; ((NoPutResultSet) resultset).markAsTopResultSet(); return resultset; } After the patch, there is a shorter doExecute() method instead of the original execute() method: protected ResultSet doExecute() throws StandardException { ResultSet resultset = resultSet; if (resultset == null ) { ace50d80a4x013ax64d6xddb7x00000319bb880 var_ace50d80a4x013ax64d6xddb7x00000319bb880_3_ = this ; resultset = var_ace50d80a4x013ax64d6xddb7x00000319bb880_3_.resultSet = var_ace50d80a4x013ax64d6xddb7x00000319bb880_3_ .fillResultSet(); } ((NoPutResultSet) resultset).markAsTopResultSet(); return resultset; } The first two calls have been taken out of the generated method and have been moved to a shared execute() method in the super class. The shared method calls the generated doExecute() method. Additionally, the code in the generated method has been massaged so that it only looks up the value of the resultSet field once. That is, there is no longer an else branch that does a second look-up. All the regression tests ran cleanly with the patch. (Ran the full suite on JDK 7, and also the lang suite on OJEC to verify that the new layout of the generated classes didn't cause any class format problems on the small platforms.) Patch details: BaseActivation: Added new abstract method, doExecute(), to be implemented by the generated classes. Added implementation of Activation.execute() which contained the code that was previously added to every generated class. Finally, it calls the abstract doExecute() method to allow the generated classes to add logic to it. Moved the body of the startExecution() method into execute() and removed it, since it's so small and there are no other callers of it after the changes in this patch. ConstantActionActivation: Renamed the execute() method to doExecute() and removed the code that's identical to the code that now lives in BaseActivation's execute() method. ActivationClassBuilder: Renamed the generated execute() method to doExecute(). Removed the code that generated the logic that now lives in BaseActivation.execute(). StatementNode: Updated the generated code to keep the value of the resultSet field on the stack to avoid a second look-up when it's non-null. Updated the generated code to push the instance for the getField() operation earlier to avoid the need to swap operands.
          Hide
          Knut Anders Hatlen added a comment -

          Attached is a patch (d5947-1a-remove-common-methods.diff) that reduces
          the number of fields and methods in the generated classes. I'm also
          attaching values1-after-1a.txt which shows the decompiled class file
          for a VALUES 1 statement, for comparison with the already attached
          values1-decompiled.txt.

          As can be seen from the decompiled class file, the patch removes the
          three fields getExecutionCount, getRowCountCheckVector and
          getStalePlanCheckInterval, and replaces them with a single constant
          field of type RowCountStats.

          The six accessor methods have been replaced with a single getter
          method which returns the RowCountStats instance.

          For those who care about bits and bytes, the size of each generated
          class was reduced by 139 bytes.

          All the regression tests passed with the patch.

          Patch details:

          java/engine/org/apache/derby/impl/sql/execute/BaseActivation.java:

          • Added an inner class RowCountStats with three fields that could hold
            the same information as the three fields that previously were added
            to each generated class.
          • Added an abstract method for getting the RowCountStats instance. To
            be implemented by the generated classes.
          • Removed six abstract methods that are no longer implemented by the
            generated classes.

          java/engine/org/apache/derby/impl/sql/execute/ConstantActionActivation.java:

          • Removed the six accessor methods (the same ones that are removed
            from the generated classes).
          • Added overrides of two methods that tell the execution code that row
            counts are of no relevance to constant actions.

          java/engine/org/apache/derby/impl/sql/compile/ActivationClassBuilder.java:

          • Removed code that would add the three fields and their accessor
            methods to the generated classes.
          • Added code to create the new static field, a static initializer for
            it, and a getter method.

          java/engine/org/apache/derby/iapi/reference/ClassName.java:

          • Added a constant for the name of the new RowCountStats class.

          java/engine/org/apache/derby/iapi/services/compiler/ClassBuilder.java
          java/engine/org/apache/derby/iapi/services/compiler/MethodBuilder.java
          java/engine/org/apache/derby/impl/services/bytecode/BCClass.java
          java/engine/org/apache/derby/impl/services/bytecode/BCMethod.java
          java/engine/org/apache/derby/impl/sql/compile/ExpressionClassBuilder.java
          java/engine/org/apache/derby/impl/sql/compile/StatementNode.java

          • Removed the old helper methods used for generating fields and their
            corresponding accessor methods, as they are no longer used.
          • Added new helper methods to generate static initializers and code to
            get or set static fields.
          Show
          Knut Anders Hatlen added a comment - Attached is a patch (d5947-1a-remove-common-methods.diff) that reduces the number of fields and methods in the generated classes. I'm also attaching values1-after-1a.txt which shows the decompiled class file for a VALUES 1 statement, for comparison with the already attached values1-decompiled.txt. As can be seen from the decompiled class file, the patch removes the three fields getExecutionCount, getRowCountCheckVector and getStalePlanCheckInterval, and replaces them with a single constant field of type RowCountStats. The six accessor methods have been replaced with a single getter method which returns the RowCountStats instance. For those who care about bits and bytes, the size of each generated class was reduced by 139 bytes. All the regression tests passed with the patch. Patch details: java/engine/org/apache/derby/impl/sql/execute/BaseActivation.java: Added an inner class RowCountStats with three fields that could hold the same information as the three fields that previously were added to each generated class. Added an abstract method for getting the RowCountStats instance. To be implemented by the generated classes. Removed six abstract methods that are no longer implemented by the generated classes. java/engine/org/apache/derby/impl/sql/execute/ConstantActionActivation.java: Removed the six accessor methods (the same ones that are removed from the generated classes). Added overrides of two methods that tell the execution code that row counts are of no relevance to constant actions. java/engine/org/apache/derby/impl/sql/compile/ActivationClassBuilder.java: Removed code that would add the three fields and their accessor methods to the generated classes. Added code to create the new static field, a static initializer for it, and a getter method. java/engine/org/apache/derby/iapi/reference/ClassName.java: Added a constant for the name of the new RowCountStats class. java/engine/org/apache/derby/iapi/services/compiler/ClassBuilder.java java/engine/org/apache/derby/iapi/services/compiler/MethodBuilder.java java/engine/org/apache/derby/impl/services/bytecode/BCClass.java java/engine/org/apache/derby/impl/services/bytecode/BCMethod.java java/engine/org/apache/derby/impl/sql/compile/ExpressionClassBuilder.java java/engine/org/apache/derby/impl/sql/compile/StatementNode.java Removed the old helper methods used for generating fields and their corresponding accessor methods, as they are no longer used. Added new helper methods to generate static initializers and code to get or set static fields.
          Hide
          Knut Anders Hatlen added a comment -

          Attaching files that show the generated code for a VALUES 1 statement and for SELECT SCHEMANAME, TABLENAME FROM SYS.SYSSCHEMAS NATURAL JOIN SYS.SYSTABLES.

          They are decompiled to Java code for readability. The decompiler got the base class wrong, it should be BaseActivation instead of CursorActivation, but other than that it did a good job.

          As can be seen from those files, the following fields are in both classes:

          private static int getExecutionCount;
          private static Vector getRowCountCheckVector;
          private static int getStalePlanCheckInterval;

          And so are their six accessor methods.

          Additionally, we could probably factor out

          this.throwIfClosed("execute");
          this.startExecution();

          from their execute() methods (code which is also duplicated in ConstantActionActivation.execute()).

          Show
          Knut Anders Hatlen added a comment - Attaching files that show the generated code for a VALUES 1 statement and for SELECT SCHEMANAME, TABLENAME FROM SYS.SYSSCHEMAS NATURAL JOIN SYS.SYSTABLES. They are decompiled to Java code for readability. The decompiler got the base class wrong, it should be BaseActivation instead of CursorActivation, but other than that it did a good job. As can be seen from those files, the following fields are in both classes: private static int getExecutionCount; private static Vector getRowCountCheckVector; private static int getStalePlanCheckInterval; And so are their six accessor methods. Additionally, we could probably factor out this.throwIfClosed("execute"); this.startExecution(); from their execute() methods (code which is also duplicated in ConstantActionActivation.execute()).
          Hide
          Knut Anders Hatlen added a comment -

          I think it's true that reflection is a lot cheaper now than it was in the early days of Cloudscape. And in Java 7 there are method handles[1], which are supposed to be even faster (as they are strongly typed). There are also library methods [2] that let you chain method handles together and produce new ones that represent compound expressions like if/else, try/catch, manipulation of arguments and return values, etc. So Bryan's ideal future may not be that far away.

          My biggest problem with the byte code generation is that loading the generated class significantly slows down the SQL compiler. I recently ran an experiment where I replaced the use of org.apache.derby.impl.services.reflect.ReflectLoaderJava2 with the OpenJDK-internal class sun.invoke.anon.AnonymousClassLoader (described in [3] and [4]), which would load the generated classes without registering it in the JVM's dictionary. This change made compilation of a VALUES 1 statement go more than three times faster, and compilation of a natural join between two system tables went almost twice as fast. It also helped the regression tests, as they do a lot of compilation. So if we could find a portable way to get rid of the code generation/class loading without hurting the runtime performance, I'd be all for it.

          I don't think trimming down the size of the generated classes will help the compilation performance much, as we still need to load as many classes as before, but hopefully reducing the amount of generated code now will make it easier to eliminate it completely when we're ready to make that move.

          [1] http://docs.oracle.com/javase/7/docs/api/java/lang/invoke/MethodHandle.html
          [2] http://docs.oracle.com/javase/7/docs/api/java/lang/invoke/MethodHandles.html
          [3] https://blogs.oracle.com/jrose/entry/anonymous_classes_in_the_vm
          [4] http://blog.headius.com/2008/09/first-taste-of-invokedynamic.html

          Show
          Knut Anders Hatlen added a comment - I think it's true that reflection is a lot cheaper now than it was in the early days of Cloudscape. And in Java 7 there are method handles [1] , which are supposed to be even faster (as they are strongly typed). There are also library methods [2] that let you chain method handles together and produce new ones that represent compound expressions like if/else, try/catch, manipulation of arguments and return values, etc. So Bryan's ideal future may not be that far away. My biggest problem with the byte code generation is that loading the generated class significantly slows down the SQL compiler. I recently ran an experiment where I replaced the use of org.apache.derby.impl.services.reflect.ReflectLoaderJava2 with the OpenJDK-internal class sun.invoke.anon.AnonymousClassLoader (described in [3] and [4] ), which would load the generated classes without registering it in the JVM's dictionary. This change made compilation of a VALUES 1 statement go more than three times faster, and compilation of a natural join between two system tables went almost twice as fast. It also helped the regression tests, as they do a lot of compilation. So if we could find a portable way to get rid of the code generation/class loading without hurting the runtime performance, I'd be all for it. I don't think trimming down the size of the generated classes will help the compilation performance much, as we still need to load as many classes as before, but hopefully reducing the amount of generated code now will make it easier to eliminate it completely when we're ready to make that move. [1] http://docs.oracle.com/javase/7/docs/api/java/lang/invoke/MethodHandle.html [2] http://docs.oracle.com/javase/7/docs/api/java/lang/invoke/MethodHandles.html [3] https://blogs.oracle.com/jrose/entry/anonymous_classes_in_the_vm [4] http://blog.headius.com/2008/09/first-taste-of-invokedynamic.html
          Hide
          Rick Hillegas added a comment -

          I agree with Bryan that we should eliminate as much byte code generation as possible. In the early days of Cloudscape, we were generating a lot more byte code for DDL statements--for no good reason. That is why the ConstantActions exist: to eliminate the need to generate obscure byte code for operations which don't need it.

          I think the major value of byte code generation is that it eliminates the cost of using reflection to call user-written code for functions, procedures, types, and (soon) aggregates. Maybe reflection has become tolerably cheap on modern JVMs and we could reconsider the need for invoking user-written logic via generated byte code.

          I think there are other operations for which we generate byte code in order to eliminate reflection.

          Show
          Rick Hillegas added a comment - I agree with Bryan that we should eliminate as much byte code generation as possible. In the early days of Cloudscape, we were generating a lot more byte code for DDL statements--for no good reason. That is why the ConstantActions exist: to eliminate the need to generate obscure byte code for operations which don't need it. I think the major value of byte code generation is that it eliminates the cost of using reflection to call user-written code for functions, procedures, types, and (soon) aggregates. Maybe reflection has become tolerably cheap on modern JVMs and we could reconsider the need for invoking user-written logic via generated byte code. I think there are other operations for which we generate byte code in order to eliminate reflection.
          Hide
          Bryan Pendleton added a comment -

          Byte code generation is one of the most arcane parts of Derby; anything that can reduce
          or simplify the byte code gets a big +1 from me.

          In some ideal future, I'd love to see a version of Derby where there was no byte code
          generation at all; either because we'd eliminated it entirely (in favor of an interpreter?),
          or because it was at least an option and you could choose to execute your queries
          some other way.

          In addition to making it easier for casual users to comprehend the details of Derby
          query execution, this would also, I think, simplify some of our security policies, and
          possibly make it so that we could run Derby in other Java environments, such as Android.

          Show
          Bryan Pendleton added a comment - Byte code generation is one of the most arcane parts of Derby; anything that can reduce or simplify the byte code gets a big +1 from me. In some ideal future, I'd love to see a version of Derby where there was no byte code generation at all; either because we'd eliminated it entirely (in favor of an interpreter?), or because it was at least an option and you could choose to execute your queries some other way. In addition to making it easier for casual users to comprehend the details of Derby query execution, this would also, I think, simplify some of our security policies, and possibly make it so that we could run Derby in other Java environments, such as Android.

            People

            • Assignee:
              Knut Anders Hatlen
              Reporter:
              Knut Anders Hatlen
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development