Derby
  1. Derby
  2. DERBY-6542

Improve the concurrency of identity columns by using SYS.SYSSEQUENCES

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 10.11.1.1
    • Fix Version/s: 10.11.1.1
    • Component/s: SQL
    • Labels:
      None
    • Urgency:
      Normal
    • Issue & fix info:
      Release Note Needed
    • Bug behavior facts:
      Performance

      Description

      This is another attempt to improve the concurrency of identity columns. The previous attempt was tracked by DERBY-4437.

      This new attempt will try out Mike's last suggestion: use system-created sequences managed by SYS.SYSSEQUENCES. This should reduce the contention on the core catalogs.

      I'm hopeful about this approach because of the experiments tracked by DERBY-6533. There we are not seeing any problems related to sequence generators but we are seeing lots of identity-related lock timeouts.

      Here is the general shape of this approach:

      1) When adding an identity column to a table, Derby will create a sequence generator for the column. The sequence generator will live in the SYS schema and its name will be the table's UUID.

      2) DROP SEQUENCE will not operate on system-created sequences. System-created sequences will be dropped/modified by DROP/ALTER TABLE commands.

      3) We will add a new system function for inspecting the current, in-memory value of an identity generator without getting a lock on SYS.SYSSEQUENCES: SYSCS_UTIL.SYSCS_PEEK_AT_IDENTITY( tableSchemaName, tableName ).

      4) Derby will create a sequence for every legacy identity column after hard-upgrade.

      5) These changes will take place only after hard-upgrade. Soft-upgrade will not change the behavior of identity columns.

      Comments on this proposal are welcome. Thanks.

      1. releaseNote.html
        3 kB
        Rick Hillegas
      2. releaseNote.html
        4 kB
        Rick Hillegas
      3. derby-6542-03-aa-lockTimeoutForIdentityContention.diff
        3 kB
        Rick Hillegas
      4. derby-6542-02-af-useNewSequenceGenerator.diff
        73 kB
        Rick Hillegas
      5. derby-6542-02-ac-useNewSequenceGenerator.diff
        42 kB
        Rick Hillegas
      6. derby-6542-02-ab-useNewSequenceGenerator.diff
        42 kB
        Rick Hillegas
      7. derby-6542-01-ab-catalog.diff
        44 kB
        Rick Hillegas

        Issue Links

          Activity

          Hide
          Mike Matrigali added a comment -

          it has been so many years since this discussion. Can you remind me what the
          architectural difference is between sequences and identity columns currently. Or
          if it is written up in one place point me there. It is hard working through the
          various old issues to tell what still applies or not after all the fixes have gone in.

          Scanning the old topics on this issue there seemed to be the following issues, could you comment
          how your proposal affects identity column behavior for these:
          1) correctness issue DERBY-5493 (i think the answer is that sequences behave better than identity)
          2) is there a worst case where the change will reduce current identity performance if sequences are used,
          likely some case with only preallocation=1. There was some discussion about performance difference of identity and sequences,
          could you note if there is any downside to switching to sequences. I think the
          areas of concern were performance with preallocation 1 being the worst case. There
          were changes both in sequences and identity columns since that discussion so not
          clear what the current state is.
          3) is there a behavior change for "lost" values by changing to sequences?

          Show
          Mike Matrigali added a comment - it has been so many years since this discussion. Can you remind me what the architectural difference is between sequences and identity columns currently. Or if it is written up in one place point me there. It is hard working through the various old issues to tell what still applies or not after all the fixes have gone in. Scanning the old topics on this issue there seemed to be the following issues, could you comment how your proposal affects identity column behavior for these: 1) correctness issue DERBY-5493 (i think the answer is that sequences behave better than identity) 2) is there a worst case where the change will reduce current identity performance if sequences are used, likely some case with only preallocation=1. There was some discussion about performance difference of identity and sequences, could you note if there is any downside to switching to sequences. I think the areas of concern were performance with preallocation 1 being the worst case. There were changes both in sequences and identity columns since that discussion so not clear what the current state is. 3) is there a behavior change for "lost" values by changing to sequences?
          Hide
          Rick Hillegas added a comment -

          Thanks for raising these issues, Mike. Some responses follow:

          MM> it has been so many years since this discussion. Can you remind me what the
          MM> architectural difference is between sequences and identity columns currently. Or
          MM> if it is written up in one place point me there. It is hard working through the
          MM> various old issues to tell what still applies or not after all the
          MM> fixes have gone in.

          The behavior of sequence generators is described by the header comment in SequenceUpdater. The key differences in behavior between sequences and identity columns are these:

          i) A sequence generator pre-allocates a chunk of new values instead of updating a catalog row every time a new value is requested. During orderly shutdown, the unused values are flushed and the catalog row is updated so that shutdown won't leak chunks of unused values.

          ii) The sequence generator always updates the catalog in a subtransaction of the user's current execution transaction, expecting to get the lock immediately. The write is committed immediately. If the update fails, then the generator raises a TOO_MUCH_CONTENTION exception. It is believed that this can only happen if someone is scanning SYS.SYSSEQUENCES directly, rather than using the SYSCS_UTIL.SYSCS_PEEK_AT_SEQUENCE function. In contrast, identity columns try to use a subtransaction, but if that doesn't work, they escalate to using the parent transaction. There is substantial lock contention because the volatile identity counter is maintained in a core catalog.

          MM> Scanning the old topics on this issue there seemed to be the following issues, could you comment
          MM> how your proposal affects identity column behavior for these:
          MM> 1) correctness issue DERBY-5493 (i think the answer is that sequences behave better than identity)

          It appears to me that DERBY-5493 and DERBY-5494 were both fixed. I am not aware of any outstanding correctness problems with sequence generators.

          MM> 2) is there a worst case where the change will reduce current identity performance if sequences are used,
          MM> likely some case with only preallocation=1. There was some discussion about performance difference of identity and sequences,
          MM> could you note if there is any downside to switching to sequences. I think the
          MM> areas of concern were performance with preallocation 1 being the worst case. There
          MM> were changes both in sequences and identity columns since that discussion so not
          MM> clear what the current state is.

          I don't think we've measured this. There's certainly a possibility that a degenerate preallocation range would result in reduced performance. I think that the big downside is the possibility of leaking chunks of unused identity values if the database crashes or is brought down ungracefully.

          MM> 3) is there a behavior change for "lost" values by changing to sequences?

          Other than the leakage mentioned above, I'm not aware of any behavior change for lost values.

          Thanks,
          -Rick

          Show
          Rick Hillegas added a comment - Thanks for raising these issues, Mike. Some responses follow: MM> it has been so many years since this discussion. Can you remind me what the MM> architectural difference is between sequences and identity columns currently. Or MM> if it is written up in one place point me there. It is hard working through the MM> various old issues to tell what still applies or not after all the MM> fixes have gone in. The behavior of sequence generators is described by the header comment in SequenceUpdater. The key differences in behavior between sequences and identity columns are these: i) A sequence generator pre-allocates a chunk of new values instead of updating a catalog row every time a new value is requested. During orderly shutdown, the unused values are flushed and the catalog row is updated so that shutdown won't leak chunks of unused values. ii) The sequence generator always updates the catalog in a subtransaction of the user's current execution transaction, expecting to get the lock immediately. The write is committed immediately. If the update fails, then the generator raises a TOO_MUCH_CONTENTION exception. It is believed that this can only happen if someone is scanning SYS.SYSSEQUENCES directly, rather than using the SYSCS_UTIL.SYSCS_PEEK_AT_SEQUENCE function. In contrast, identity columns try to use a subtransaction, but if that doesn't work, they escalate to using the parent transaction. There is substantial lock contention because the volatile identity counter is maintained in a core catalog. MM> Scanning the old topics on this issue there seemed to be the following issues, could you comment MM> how your proposal affects identity column behavior for these: MM> 1) correctness issue DERBY-5493 (i think the answer is that sequences behave better than identity) It appears to me that DERBY-5493 and DERBY-5494 were both fixed. I am not aware of any outstanding correctness problems with sequence generators. MM> 2) is there a worst case where the change will reduce current identity performance if sequences are used, MM> likely some case with only preallocation=1. There was some discussion about performance difference of identity and sequences, MM> could you note if there is any downside to switching to sequences. I think the MM> areas of concern were performance with preallocation 1 being the worst case. There MM> were changes both in sequences and identity columns since that discussion so not MM> clear what the current state is. I don't think we've measured this. There's certainly a possibility that a degenerate preallocation range would result in reduced performance. I think that the big downside is the possibility of leaking chunks of unused identity values if the database crashes or is brought down ungracefully. MM> 3) is there a behavior change for "lost" values by changing to sequences? Other than the leakage mentioned above, I'm not aware of any behavior change for lost values. Thanks, -Rick
          Hide
          Rick Hillegas added a comment -

          Attaching derby-6542-01-ab-catalog.diff. This patch adds catalog support for backing identity columns with system-generated sequences. I am running tests now.

          This patch does the following:

          1) Creates a system-generated sequence when a table is created with an identity column and the database is at level 10.11 or higher. Note that you can't currently add an identity column to an existing table.

          2) Drop the system-generated sequence when the identity column is dropped.

          3) Drop the system-generated sequence when the table is dropped.

          4) Add a system-generated sequence for every identity column when hard-upgrading to 10.11.

          5) Add a new system function, SYSCS_UTIL.SYSCS_PEEK_AT_IDENTITY, when creating a 10.11 database or when hard-upgrading to 10.11.

          Touches the following files:

          -----------------

          M java/engine/org/apache/derby/iapi/sql/dictionary/TableDescriptor.java

          Add a method for assigning a unique name to the system-generated sequence.

          -----------------

          M java/storeless/org/apache/derby/impl/storeless/EmptyDictionary.java
          M java/engine/org/apache/derby/iapi/sql/dictionary/DataDictionary.java
          M java/engine/org/apache/derby/catalog/SystemProcedures.jav
          M java/engine/org/apache/derby/impl/sql/catalog/DataDictionaryImpl.java
          M java/engine/org/apache/derby/impl/sql/catalog/DD_Version.java

          Support for SYSCS_PEEK_AT_IDENTITY. Upgrade support.

          -----------------

          M java/tools/org/apache/derby/impl/tools/dblook/DB_Sequence.java

          dblook shouldn't create DDL for the system-generated sequences.

          -----------------

          M java/engine/org/apache/derby/iapi/types/TypeId.java
          M java/engine/org/apache/derby/iapi/types/DataTypeDescriptor.java

          Add support for computing the min/max bounds of an integer type.

          -----------------

          M java/engine/org/apache/derby/loc/messages.xml
          M java/shared/org/apache/derby/shared/common/reference/SQLState.java
          M java/engine/org/apache/derby/impl/sql/compile/NextSequenceNode.java

          Don't let users directly issue a NEXT VALUE FOR on a system-generated sequence.

          -----------------

          M java/engine/org/apache/derby/impl/sql/execute/CreateTableConstantAction.java

          Make a system-generated sequence for a new identity column.

          -----------------

          M java/engine/org/apache/derby/impl/sql/execute/DropTableConstantAction.java

          Drop the system-generated sequence when the table is dropped.

          -----------------

          M java/engine/org/apache/derby/impl/sql/execute/AlterTableConstantAction.java

          Drop and recreate the system-generated sequence when the identity column is modified. Drop the system generated sequence when the identity column is dropped.

          -----------------

          M java/testing/org/apache/derbyTesting/functionTests/tests/lang/RolesTest.java
          M java/testing/org/apache/derbyTesting/functionTests/tests/lang/_Suite.java
          M java/testing/org/apache/derbyTesting/functionTests/tests/lang/GrantRevokeDDLTest.java
          M java/testing/org/apache/derbyTesting/functionTests/tests/jdbc4/TestDbMetaData.java
          M java/testing/org/apache/derbyTesting/functionTests/master/ij7.out

          Adjust existing tests as necessary.

          -----------------

          A java/testing/org/apache/derbyTesting/functionTests/tests/lang/IdentitySequenceTest.java

          Basic test for catalog support of system-generated sequences.

          -----------------

          M java/testing/org/apache/derbyTesting/functionTests/tests/upgradeTests/Changes10_11.java
          M java/testing/org/apache/derbyTesting/functionTests/tests/upgradeTests/Changes10_2.java

          Basic upgrade tests for system-generated sequences.

          Show
          Rick Hillegas added a comment - Attaching derby-6542-01-ab-catalog.diff. This patch adds catalog support for backing identity columns with system-generated sequences. I am running tests now. This patch does the following: 1) Creates a system-generated sequence when a table is created with an identity column and the database is at level 10.11 or higher. Note that you can't currently add an identity column to an existing table. 2) Drop the system-generated sequence when the identity column is dropped. 3) Drop the system-generated sequence when the table is dropped. 4) Add a system-generated sequence for every identity column when hard-upgrading to 10.11. 5) Add a new system function, SYSCS_UTIL.SYSCS_PEEK_AT_IDENTITY, when creating a 10.11 database or when hard-upgrading to 10.11. Touches the following files: ----------------- M java/engine/org/apache/derby/iapi/sql/dictionary/TableDescriptor.java Add a method for assigning a unique name to the system-generated sequence. ----------------- M java/storeless/org/apache/derby/impl/storeless/EmptyDictionary.java M java/engine/org/apache/derby/iapi/sql/dictionary/DataDictionary.java M java/engine/org/apache/derby/catalog/SystemProcedures.jav M java/engine/org/apache/derby/impl/sql/catalog/DataDictionaryImpl.java M java/engine/org/apache/derby/impl/sql/catalog/DD_Version.java Support for SYSCS_PEEK_AT_IDENTITY. Upgrade support. ----------------- M java/tools/org/apache/derby/impl/tools/dblook/DB_Sequence.java dblook shouldn't create DDL for the system-generated sequences. ----------------- M java/engine/org/apache/derby/iapi/types/TypeId.java M java/engine/org/apache/derby/iapi/types/DataTypeDescriptor.java Add support for computing the min/max bounds of an integer type. ----------------- M java/engine/org/apache/derby/loc/messages.xml M java/shared/org/apache/derby/shared/common/reference/SQLState.java M java/engine/org/apache/derby/impl/sql/compile/NextSequenceNode.java Don't let users directly issue a NEXT VALUE FOR on a system-generated sequence. ----------------- M java/engine/org/apache/derby/impl/sql/execute/CreateTableConstantAction.java Make a system-generated sequence for a new identity column. ----------------- M java/engine/org/apache/derby/impl/sql/execute/DropTableConstantAction.java Drop the system-generated sequence when the table is dropped. ----------------- M java/engine/org/apache/derby/impl/sql/execute/AlterTableConstantAction.java Drop and recreate the system-generated sequence when the identity column is modified. Drop the system generated sequence when the identity column is dropped. ----------------- M java/testing/org/apache/derbyTesting/functionTests/tests/lang/RolesTest.java M java/testing/org/apache/derbyTesting/functionTests/tests/lang/_Suite.java M java/testing/org/apache/derbyTesting/functionTests/tests/lang/GrantRevokeDDLTest.java M java/testing/org/apache/derbyTesting/functionTests/tests/jdbc4/TestDbMetaData.java M java/testing/org/apache/derbyTesting/functionTests/master/ij7.out Adjust existing tests as necessary. ----------------- A java/testing/org/apache/derbyTesting/functionTests/tests/lang/IdentitySequenceTest.java Basic test for catalog support of system-generated sequences. ----------------- M java/testing/org/apache/derbyTesting/functionTests/tests/upgradeTests/Changes10_11.java M java/testing/org/apache/derbyTesting/functionTests/tests/upgradeTests/Changes10_2.java Basic upgrade tests for system-generated sequences.
          Hide
          Rick Hillegas added a comment -

          Tests passed cleanly for me on derby-6542-01-ab-catalog.diff.

          Show
          Rick Hillegas added a comment - Tests passed cleanly for me on derby-6542-01-ab-catalog.diff.
          Hide
          ASF subversion and git services added a comment -

          Commit 1589379 from Rick Hillegas in branch 'code/trunk'
          [ https://svn.apache.org/r1589379 ]

          DERBY-6542: Add catalog support for using sequence generators to implement identity columns; commit derby-6542-01-ab-catalog.diff.

          Show
          ASF subversion and git services added a comment - Commit 1589379 from Rick Hillegas in branch 'code/trunk' [ https://svn.apache.org/r1589379 ] DERBY-6542 : Add catalog support for using sequence generators to implement identity columns; commit derby-6542-01-ab-catalog.diff.
          Hide
          Rick Hillegas added a comment -

          Attaching derby-6542-02-ab-useNewSequenceGenerator.diff. This is a first rev of a patch which uses sequence generators in order to produce identity values. I need to write a battery of tests.

          Show
          Rick Hillegas added a comment - Attaching derby-6542-02-ab-useNewSequenceGenerator.diff. This is a first rev of a patch which uses sequence generators in order to produce identity values. I need to write a battery of tests.
          Hide
          Rick Hillegas added a comment -

          Attaching derby-6542-02-ac-useNewSequenceGenerator.diff. This is the patch after merging with the head of trunk in order to pick up the bug fixes in derby-6554-01-ad-bugfixes.diff.

          This patch will not pass the regression tests cleanly because of DERBY-6554. We have regression tests which create a table with an identity column and then, in the same transaction, populate it. With sequence-backed identity columns, this gives rise to the DERBY-6554 bug.

          Show
          Rick Hillegas added a comment - Attaching derby-6542-02-ac-useNewSequenceGenerator.diff. This is the patch after merging with the head of trunk in order to pick up the bug fixes in derby-6554-01-ad-bugfixes.diff. This patch will not pass the regression tests cleanly because of DERBY-6554 . We have regression tests which create a table with an identity column and then, in the same transaction, populate it. With sequence-backed identity columns, this gives rise to the DERBY-6554 bug.
          Hide
          Rick Hillegas added a comment -

          Attaching derby-6542-02-af-useNewSequenceGenerator.diff. This syncs the previous rev of the patch with the work done on DERBY-6554 and implements the correct ALTER TABLE behavior described on DERBY-6579. I am running tests now.

          Touches the following additional files:

          M java/testing/org/apache/derbyTesting/functionTests/tests/lang/AlterTableTest.java
          M java/testing/org/apache/derbyTesting/functionTests/tests/lang/AutoIncrementTest.java

          Show
          Rick Hillegas added a comment - Attaching derby-6542-02-af-useNewSequenceGenerator.diff. This syncs the previous rev of the patch with the work done on DERBY-6554 and implements the correct ALTER TABLE behavior described on DERBY-6579 . I am running tests now. Touches the following additional files: M java/testing/org/apache/derbyTesting/functionTests/tests/lang/AlterTableTest.java M java/testing/org/apache/derbyTesting/functionTests/tests/lang/AutoIncrementTest.java
          Hide
          Rick Hillegas added a comment -

          Tests passed cleanly for me on derby-6542-02-af-useNewSequenceGenerator.diff.

          Show
          Rick Hillegas added a comment - Tests passed cleanly for me on derby-6542-02-af-useNewSequenceGenerator.diff.
          Hide
          ASF subversion and git services added a comment -

          Commit 1596548 from Rick Hillegas in branch 'code/trunk'
          [ https://svn.apache.org/r1596548 ]

          DERBY-6542: Use sequence generators to create identity values; commit derby-6542-02-af-useNewSequenceGenerator.diff.

          Show
          ASF subversion and git services added a comment - Commit 1596548 from Rick Hillegas in branch 'code/trunk' [ https://svn.apache.org/r1596548 ] DERBY-6542 : Use sequence generators to create identity values; commit derby-6542-02-af-useNewSequenceGenerator.diff.
          Hide
          Rick Hillegas added a comment -

          Attaching derby-6542-03-aa-lockTimeoutForIdentityContention.diff. This patch makes Derby raise LockTimeout rather than TooMuchContention when SYS.SYSSEQUENCES can't be updated for an identity column. I am running tests now.

          This should reduce backward compatibility issues for legacy applications with identity columns which have been coded to catch LockTimeout and retry.

          Touches the following files:

          M java/engine/org/apache/derby/impl/sql/catalog/SequenceUpdater.java
          M java/testing/org/apache/derbyTesting/functionTests/tests/lang/IdentitySequenceTest.java

          Show
          Rick Hillegas added a comment - Attaching derby-6542-03-aa-lockTimeoutForIdentityContention.diff. This patch makes Derby raise LockTimeout rather than TooMuchContention when SYS.SYSSEQUENCES can't be updated for an identity column. I am running tests now. This should reduce backward compatibility issues for legacy applications with identity columns which have been coded to catch LockTimeout and retry. Touches the following files: M java/engine/org/apache/derby/impl/sql/catalog/SequenceUpdater.java M java/testing/org/apache/derbyTesting/functionTests/tests/lang/IdentitySequenceTest.java
          Hide
          ASF subversion and git services added a comment -

          Commit 1596619 from Rick Hillegas in branch 'code/trunk'
          [ https://svn.apache.org/r1596619 ]

          DERBY-6542: Raise LockTimeout when there is too much contention on an identity column; commit derby-6542-03-aa-lockTimeoutForIdentityContention.diff.

          Show
          ASF subversion and git services added a comment - Commit 1596619 from Rick Hillegas in branch 'code/trunk' [ https://svn.apache.org/r1596619 ] DERBY-6542 : Raise LockTimeout when there is too much contention on an identity column; commit derby-6542-03-aa-lockTimeoutForIdentityContention.diff.
          Hide
          Rick Hillegas added a comment -

          Attaching the first rev of a release note for this bug fix.

          Show
          Rick Hillegas added a comment - Attaching the first rev of a release note for this bug fix.
          Hide
          Rick Hillegas added a comment -

          Attaching a second rev of the release note for this issue. The second rev adds advice about using SYSCS_UTIL.SYSCS_PEEK_AT_IDENTITY to discover the next value which will be inserted into an identity column, rather than querying SYS.SYSCOLUMNS or SYS.SYSSEQUENCES.

          Show
          Rick Hillegas added a comment - Attaching a second rev of the release note for this issue. The second rev adds advice about using SYSCS_UTIL.SYSCS_PEEK_AT_IDENTITY to discover the next value which will be inserted into an identity column, rather than querying SYS.SYSCOLUMNS or SYS.SYSSEQUENCES.
          Hide
          Rick Hillegas added a comment -

          I believe I am done working on this issue. Mike's buddy-testing may disclose other flaws, which we can track with new JIRAs. Resolving this issue so that the detailed release note will appear in the 10.11 release notes.

          Show
          Rick Hillegas added a comment - I believe I am done working on this issue. Mike's buddy-testing may disclose other flaws, which we can track with new JIRAs. Resolving this issue so that the detailed release note will appear in the 10.11 release notes.

            People

            • Assignee:
              Rick Hillegas
              Reporter:
              Rick Hillegas
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development