Derby
DERBY-3330

Provide support for unique constraints over keys that include one or more nullable columns.

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.4.1.3
    • Fix Version/s: 10.4.1.3
    • Component/s: SQL, Store
    • Labels:
      None
    • Environment:
      all
    • Issue & fix info:
      Release Note Needed

      Description

      Allow unique constraints over keys that include one or more nullable columns. Prior to this change, Derby only supported unique constraints on keys that included no nullable columns. The new constraint allows unlimited inserts of any key containing one or more null columns, but permits only one occurrence of each key in which no column is null.

      There is no change to existing or newly created unique indexes on nullable columns (as opposed to unique constraints on nullable columns). There is also no change to existing or newly created constraints on keys with no nullable columns.
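
      As a concrete illustration of the new semantics, the following JDBC session shows the behavior. This is a hypothetical example (table, column, and database names are invented), assuming the embedded driver and a Derby release that includes this fix:

      import java.sql.Connection;
      import java.sql.DriverManager;
      import java.sql.SQLException;
      import java.sql.Statement;

      // Minimal sketch of the new constraint semantics; not part of the patch.
      public class NullableUniqueDemo {
          public static void main(String[] args) throws SQLException {
              Connection conn =
                  DriverManager.getConnection("jdbc:derby:demoDB;create=true");
              Statement s = conn.createStatement();
              s.execute("CREATE TABLE t (a INT, b INT, CONSTRAINT u UNIQUE (a, b))");

              s.execute("INSERT INTO t VALUES (1, NULL)"); // accepted
              s.execute("INSERT INTO t VALUES (1, NULL)"); // accepted: the key contains a null
              s.execute("INSERT INTO t VALUES (1, 2)");    // accepted: first fully non-null occurrence
              try {
                  s.execute("INSERT INTO t VALUES (1, 2)"); // rejected: duplicate key with no nulls
              } catch (SQLException e) {
                  System.out.println("Expected: " + e.getSQLState()); // 23505
              }
              conn.close();
          }
      }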

      1. derby-3330.diff
        29 kB
        Anurag Shekhar
      2. derby-3330v2.diff
        38 kB
        Anurag Shekhar
      3. derby-3330v3.diff
        39 kB
        Anurag Shekhar
      4. derby-3330v4.diff
        39 kB
        Anurag Shekhar
      5. derby-3330-testcase.diff
        19 kB
        Anurag Shekhar
      6. FunctionalSpec_DERBY-3330.html
        7 kB
        Anurag Shekhar
      7. FunctionalSpec_DERBY-3330-V2.html
        7 kB
        Anurag Shekhar
      8. BTreeController.diff
        10 kB
        Anurag Shekhar
      9. UniqueConstraint_Implementation.html
        15 kB
        Anurag Shekhar
      10. derby-3330v5.diff
        84 kB
        Anurag Shekhar
      11. derby-3330v6.diff
        102 kB
        Anurag Shekhar
      12. derby-3330v7.diff
        103 kB
        Anurag Shekhar
      13. derby-3330v8.diff
        101 kB
        Anurag Shekhar
      14. UniqueConstraint_Implementation_V2.html
        12 kB
        Anurag Shekhar
      15. derby-3330v9.diff
        86 kB
        Anurag Shekhar
      16. UniqueConstraint_Implementation_V3.html
        10 kB
        Anurag Shekhar
      17. derby-3330v10.diff
        86 kB
        Anurag Shekhar
      18. derbyall_report.txt
        19 kB
        Mike Matrigali
      19. derby-3330v11.diff
        82 kB
        Anurag Shekhar
      20. db2Compatibility.diff
        2 kB
        Anurag Shekhar
      21. derby-3330v12.diff
        82 kB
        Anurag Shekhar
      22. derby-3330v13.diff
        88 kB
        Anurag Shekhar
      23. db2Compatibility-v2.diff
        3 kB
        Anurag Shekhar
      24. derby-3330-UpgradeTests.diff
        10 kB
        Anurag Shekhar
      25. derby-3330_followup_1.diff
        5 kB
        Anurag Shekhar
      26. UniqueConstraint_Implementation_V4.html
        13 kB
        Anurag Shekhar
      27. derby-3330_followup_1_modified.diff
        6 kB
        Mike Matrigali
      28. sortercomments.diff
        1 kB
        Anurag Shekhar

        Issue Links

        There are no Sub-Tasks for this issue.

          Activity

          Anurag Shekhar added a comment -

          This issue started as a part of DERBY-2212, but later it was decided that it is safer to provide this functionality only through unique constraints over nullable columns, leaving the behavior of indexes unchanged.

          Please read the comments in DERBY-2212 for background information about this issue.

          Anurag Shekhar added a comment -

          I am following Mike's approach, i.e., internally the unique constraint will
          be backed by a non-unique index, but during insert the immediate left and
          right slots are checked for a duplicate (if there is no null in the new
          key). If a duplicate key is found, the insert will be rejected.

          I will be introducing a new attribute in BTree to tell the insert routine
          that it is not a plain non-unique index but an almost-unique index. The new
          attribute in BTree will be persistent, so the on-disk format will not be
          compatible with older versions and will require a hard upgrade routine to
          migrate old indexes to the new version.

          While creating the constraint on a table with existing records, a merge
          sort is performed to sort the keys before creating the index. A new merge
          sort class will be required to sort this almost-unique index (allowing
          duplicates only if there is a null in the key).

          Anurag Shekhar added a comment -

          I am running the tests and will be updating the patch based on the test results.

          Description of patch
          Modified files:

          java/engine/org/apache/derby/impl/sql/compile/CreateIndexNode.java
          java/engine/org/apache/derby/impl/sql/compile/TableElementList.java
          java/engine/org/apache/derby/impl/sql/execute/GenericConstantActionFactory.java
          Added new methods to support the almost-unique index used for unique constraints.

          java/engine/org/apache/derby/impl/sql/execute/CreateIndexConstantAction.java
          Added a new CreateIndexConstantAction method to support a new parameter (almost unique) for indexes. This property is stored and used while executing the action.
          The execute method of this class uses the property to select AlmostUniqueIndexSortObserver and AlmostUniqueMergeSort. These two classes provide a special sorting routine which considers two rows duplicates only if all the key parts are non-null.

          java/engine/org/apache/derby/impl/store/access/sort/MergeSort.java
          Changed the scope of a few attributes to protected so that AlmostUniqueMergeSort can access them.

          java/engine/org/apache/derby/impl/store/access/sort/ExternalSortFactory.java
          Removed final to allow AlmostUniqueExternalSortFactory to extend it.
          Moved the creation of MergeSort to a protected method so that an extending
          class can return a different class.

          java/engine/org/apache/derby/impl/store/access/btree/BTree.java
          Added a new property, almostUnique, and a getter and setter for it.

          java/engine/org/apache/derby/impl/store/access/btree/BTreeController.java
          Added a new private method that compares the inserted record with the immediate left and right records to check for duplicates.

          java/engine/org/apache/derby/impl/store/access/btree/index/B2I.java
          Added code to store and retrieve the new attribute "almostUnique" on disk.

          java/engine/org/apache/derby/iapi/store/access/AccessFactoryGlobals.java
          Added a property string for AlmostUniqueExternalSortFactory.

          java/engine/org/apache/derby/modules.properties
          Added an entry for AlmostUniqueExternalSortFactory.

          New files
          java/engine/org/apache/derby/impl/sql/execute/AlmostUniqueIndexSortObserver.java
          This class implements the duplicate-checking routine that rejects non-null duplicate keys.

          java/engine/org/apache/derby/impl/store/access/sort/AlmostUniqueExternalSortFactory.java
          This class extends ExternalSortFactory and overrides the getMergeSort method to return AlmostUniqueMergeSort.

          java/engine/org/apache/derby/impl/store/access/sort/AlmostUniqueMergeSort.java
          This class extends MergeSort and overrides the compare methods to ignore the last key part (the row location) while checking for duplicate keys.
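
          The duplicate rule shared by these sort classes can be sketched as follows. This is a hypothetical simplification for illustration, not Derby's actual sorter code: two keys count as a duplicate only when they are equal in every key part and no key part is null.

          // Hypothetical sketch of the "almost unique" duplicate rule applied while
          // sorting keys during constraint creation. Key parts are modeled as plain
          // objects, with null standing in for SQL NULL.
          final class AlmostUniqueRule {
              /** True if 'candidate' duplicates 'existing' under the new rule. */
              static boolean isForbiddenDuplicate(Object[] existing, Object[] candidate) {
                  for (int i = 0; i < candidate.length; i++) {
                      // Any null key part means the key may repeat freely.
                      if (existing[i] == null || candidate[i] == null) {
                          return false;
                      }
                      if (!existing[i].equals(candidate[i])) {
                          return false; // keys differ, so not a duplicate at all
                      }
                  }
                  return true; // equal in every key part, with no nulls
              }
          }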

          Anurag Shekhar added a comment -

          I have done some cleanup in this patch. The patch also includes fixes based on the failures in the regression tests. I still have failures in the regression tests; I will post another patch after fixing them.

          This patch has the following changes since the last version:

          1. Modified IndexRowGenerator to record whether it represents an almost-unique index. This information is used by IndexChanger to defer the inserts for an update in the case of an almost-unique index (in addition to a unique index).

          2. Modified BTreeController.compareLeftAndRightSiblings to use newly introduced methods to get the records to the left and right. These two methods check for deleted records and continue to move to the previous (or, on the right side, the next) record until a valid record is found.

          I have also identified the code changes required for soft upgrade (marked with comments).

          Anurag Shekhar added a comment -

          This patch fixes all the failures in the test suite caused by this
          work except for the upgrade tests. I am still seeing failures and
          errors in the upgrade tests because this patch introduces an extra
          attribute in BTree and IndexRowGenerator.

          I am also seeing one failure in
          org.apache.derbyTesting.functionTests.tests.lang.PrimaryKeyTest.testKeyConstraintsImpliesNotNull, but I am getting this error in a fresh checkout too.

          I have made changes to nist.dml019.out in the group by tests.
          Currently a unique constraint is backed by a unique index,
          and a group by clause on a unique index internally uses a distinct
          scan, resulting in sorted results. After changing the backing
          index from a unique index to an almost-unique index, the distinct
          scan wasn't used and the results appear in arbitrary order (similar
          to a group by clause on non-unique indexes). I have modified the
          .out file to match the output of the select with the group by clause.

          This is a deviation from the current behavior, but the result of a
          group by without an order by clause need not be sorted, so a portable
          application will not be relying on this behavior of Derby. The Derby
          manuals also don't promise this behavior.

          This change in behavior doesn't mean that the nist test is failing,
          as the test is supposed to check only the number
          of rows and not their order.
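
          To make the portability point concrete, here is a hypothetical JDBC snippet (table and column names are invented): an application that depends on output order must say so with an explicit ORDER BY.

          import java.sql.Connection;
          import java.sql.DriverManager;
          import java.sql.ResultSet;
          import java.sql.SQLException;
          import java.sql.Statement;

          // GROUP BY alone does not guarantee output order; add ORDER BY when the
          // application needs deterministic results.
          public class GroupByOrderDemo {
              public static void main(String[] args) throws SQLException {
                  Connection conn =
                      DriverManager.getConnection("jdbc:derby:demoDB;create=true");
                  Statement s = conn.createStatement();
                  s.execute("CREATE TABLE sales (region VARCHAR(10), amount INT)");
                  s.execute("INSERT INTO sales VALUES ('west', 10), ('east', 20)");
                  // The row order of this result is unspecified; it may change with
                  // the access path the optimizer picks.
                  ResultSet unordered = s.executeQuery(
                      "SELECT region, SUM(amount) FROM sales GROUP BY region");
                  unordered.close();
                  // Deterministic order requires an explicit ORDER BY.
                  ResultSet ordered = s.executeQuery(
                      "SELECT region, SUM(amount) FROM sales GROUP BY region "
                      + "ORDER BY region");
                  while (ordered.next()) {
                      System.out.println(ordered.getString(1) + " " + ordered.getInt(2));
                  }
                  conn.close();
              }
          }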

          Anurag Shekhar added a comment -

          derby-3330-testcase.diff contains a test case for unique constraints on nullable columns.

          Changes between derby-3330v3.diff and derby-3330v4.diff:
          fixed one bug related to deleted keys exposed by the above test case.

          Anurag Shekhar added a comment -

          Revised functional spec from DERBY-2212.

          Mike Matrigali added a comment -

          Here are some initial comments on the derby-3330v4.diff patch.

          overall:
          o "almost unique" doesn't seem like a good name for this, and I didn't see
          good documentation in the code explaining what it means. Unfortunately I
          could not think of something less wordy than AllowMultipleNullsInUnique.

          o not sure what you are using for tab/spaces, see derby web site for following
          existing conventions in the code. may seem minor but inconsistent tab/indent
          makes stuff unreadable. Derby code that has tab indentation expects 4 spaces
          for those tabs.

          java/engine/org/apache/derby/impl/sql/compile/CreateIndexNode.java
          o no comments explaining the new parameter and what false means.
          CreateIndexConstantAction also has no comments about the parameter either.

          java/engine/org/apache/derby/impl/sql/execute/CreateIndexConstantAction.java
          o inconsistent formatting with the rest of the file for the new
          CreateIndexConstantAction routine, so it does not indent the same as the rest.

          o Why did you add a whole new routine rather than adding a single parameter
          to the existing routine and updating the other references? I think it is
          really error-prone to have 2 routines with the same name that differ only
          by having 12 or 13 params.

          o no comments about almostUnique, some doc in the CreateIndexConstantAction
          call and some comment following the existing code when property is set up
          would be appropriate.

          o This line looks unused:
          Properties sortProperty = null;
          It looks like you later use "properties", but other code seems to assume
          that it may be null sometimes.

          o the tabbing/indentation inconsistency makes the sort part almost unreadable.
          no comments for added sort logic.

          java/engine/org/apache/derby/impl/sql/compile/TableElementList.java
          o bad indentation
          o looks like you removed check for null but left comments that we would check
          for null.
          o again no added comments for all changes to this file.

          o need some explanation of what the added code is doing here. What is the
          purpose of the else part of if (constraintDN.constraintType ==
          DataDictionary.UNIQUE_CONSTRAINT)?

          java/engine/org/apache/derby/impl/sql/execute/IndexChanger.java
          o no comments in changed code

          java/engine/org/apache/derby/impl/sql/execute/GenericConstantActionFactory.java
          o bad indentation
          o no comments

          java/engine/org/apache/derby/impl/sql/catalog/DataDictionaryImpl.java
          o bad indentation
          o todo left in code, looks like needs to be resolved before checkin

          java/engine/org/apache/derby/impl/store/access/sort/ExternalSortFactory.java
          o bad indentation
          o need some documentation on what changes to mergesort are doing.

          java/engine/org/apache/derby/impl/store/access/btree/BTree.java
          o bad indentation
          o no comments for any change to file

          java/engine/org/apache/derby/impl/store/access/btree/BTreeController.java
          o a bunch of imports added that look unnecessary, i.e. Time, Timestamp, Calendar, ...
          o brace style inconsistent with rest of routine.

          o getPreviousRecord()
          o what is the expected behavior on latch waits?
          o Does not follow latch/latch deadlock avoidance protocol - must not wait on
          latch while holding latch while traveling left in tree.
          o no handling of getting and releasing latches during search.
          o see B2IRowLocking3.java!searchLeftAndLockPreviousKey for example of code
          that searches left in btree.
          o What kind of locking will the various isolation levels need for your checking
          of left and right keys? Is holding the latch while checking good enough? I
          don't know right now.
          o note the trickiest part of this: if you have to give up the latch while
          waiting, then you have to restart the entire search of the tree to find where
          to do the insert, because you no longer hold the latch.
          o remember that you may have to search left many pages to find a data row.

          o i don't think you should be searching for "undeleted"; deleted records
          still count toward the integrity of the tree. Without getting locks it
          is impossible to know whether records marked deleted are committed deleted
          or not. Imagine if you skip a marked-deleted record that would have caused
          a constraint violation, and then that xact backs out: it would then mark
          the row as valid again, and you would have 2 records that fail the constraint.

          getNextRecord()
          o what is expected behavior of latch waits?
          o what is expected behavior if routine has to visit N right leaves (ie. will
          routine hold all latches until end of transaction)?
          o as in getPreviousRecord() i think it should not special case deleted rows
          o remember you may have to search right many pages to find row in edge case.

          compareLeftAndRightSiblings()
          o likely code to deal with lost latches would go into this routine and callers.
          o no need to call runtime_mem.get_template(getRawTran()) twice. Once you
          call it you have a template. Just reuse the template.

          o there should be some comments somewhere explaining why just checking left
          and right siblings works for what you are trying to achieve.

          o i have to think about it some more but the places you added the
          if (compareLeftAndRightSiblings(rowToInsert,
          insert_slot, targetleaf))
          return ConglomerateController.ROWISDUPLICATE;
          don't seem optimal. I would like to see the code only called in the
          almost unique case, so we don't affect the behavior of existing indexes at
          all.

          java/engine/org/apache/derby/impl/store/access/btree/index/B2I.java
          o it is critical to properly document changes to the on-disk format of store
          objects; it doesn't look like any of that work was done here. I know from
          existing comments that upgrade still does not work, so maybe you are
          planning more here.

          o The model has been to add new stuff at the end rather than in the middle.

          o so far I haven't seen what is going to stop this from being created in soft
          upgrade.

          java/engine/org/apache/derby/modules.properties
          o did you consider just altering the existing sort to take an additional
          startup parameter rather than extending and creating a new module to sort?
          just would be interested to know why one approach vs. the other.

          java/engine/org/apache/derby/catalog/types/IndexDescriptorImpl.java
          o bad indent in places
          o as todo's point out, need soft upgrade support

          Mike Matrigali added a comment -

          Comments on the functional spec's hard upgrade section.
          It says:
          During hard upgrade existing backing indexes will be dropped and recreated as almost-unique indexes. It will be possible to create a unique constraint over a nullable column in an upgraded database. After hard upgrade the user will be able to create unique constraints over nullable columns and will also be able to drop not-null constraints even if the column is part of a unique constraint.

          This is not what I expected, and I believe it is unnecessary and not a good idea. Because the feature only affects constraints on
          nullable columns, which could never exist before the upgrade, there should be nothing that needs to be dropped/recreated. Once a hard
          upgrade has happened, the new backing index for unique constraints over nullable columns will be enabled. Nothing should
          "automatically" get dropped and recreated as an almost-unique index.

          Also, there are still a number of grammatical and spelling errors in the functional spec.

          Anurag Shekhar added a comment -

          Fixed typos in functional specs.

          Anurag Shekhar added a comment -

          Thanks, Mike, for looking at the patch and for the detailed comments.
          I am addressing a few points in this comment and will get back
          about the rest of the comments.

          overall:
          o "almost unique" doesn't seem like a good name for this, and I didn't see
          good documentation in the code explaining what it means. Unfortunately I
          could not think of something less wordy than AllowMultipleNullsInUnique.

          Shall I change it to UniqueWhenNotNull?

          o not sure what you are using for tab/spaces, see derby web site for following
          existing conventions in the code. may seem minor but inconsistent tab/indent
          makes stuff unreadable. Derby code that has tab indentation expects 4 spaces
          for those tabs.

          I will check the indentations and fix them.

          o i have to think about it some more but the places you added the
          if (compareLeftAndRightSiblings(rowToInsert,
          insert_slot, targetleaf))
          return ConglomerateController.ROWISDUPLICATE;
          don't seem optimal. I would like to see the code only called in the
          almost unique case, so we don't affect the behavior of existing indexes at
          all.

          I will change it to check for the new index type before calling compareLeftAndRightSiblings.

          java/engine/org/apache/derby/impl/store/access/btree/index/B2I.java
          o it is critical to properly document changes to the on-disk format of store
          objects; it doesn't look like any of that work was done here. I know from
          existing comments that upgrade still does not work, so maybe you are
          planning more here.

          o The model has been to add new stuff at the end rather than in the middle.

          I will be fixing it in my new patch.

          o so far I haven't seen what is going to stop this from being created in soft
          upgrade.

          I am adding a new class, B2I_10_3, which will be used in the case of soft upgrade.
          I will add comments about the disk format.

          java/engine/org/apache/derby/modules.properties
          o did you consider just altering the existing sort to take an additional
          startup parameter rather than extending and creating a new module to sort?
          just would be interested to know why one approach vs. the other.

          The sorting is required only while creating the constraint. For the rest of the
          execution it acts like a non-unique index. By adding a new module I felt I would
          not be touching the execution path of other functionality (like select), so it
          would retain its performance.
          But I can merge this code with the existing sorting routine if you feel that
          would be a better way of handling it.

          Anurag Shekhar added a comment -

          Thanks, Mike, for pointing out the issues related to locking and also the way
          I have handled deleted rows.

          I went through the code you asked me to look at for reference, and also the doIns code. This is what I think I should be doing:

          Step 1. Find the slot where the new row can be inserted.
          Step 2. Get the left row (without checking for the deleted bit); if there is a match, try to
          obtain a lock. If the lock is obtained without releasing the latch, no other transaction is
          working on the row and it should be a deleted row; return an error if it is not.
          If the latch was released, the tree has undergone some changes; start over with step 1.

          Step 3. Repeat step 2 for the right row.

          I am studying B2IRowLocking3.searchLeftAndLockPreviousKey and will
          post about the locking scheme and the effect of isolation levels shortly.

          Mike Matrigali added a comment -

          At this point you should somehow document the implementation that you are
          planning to support the functional spec. That way I can comment on whether
          or not I think it should work. It is hard just going from your patch with
          almost no comments. I suggested the approach, and am having a hard time
          remembering at this point - imagine what other reviewers are going through.

          With regard to this specific question, it depends on what rows you are
          going to support in the new tree. To make it easier to talk about let's
          just consider the simplest case of a single key index. Then with respect
          to "deleted" rows will you ever support duplicate entries in the tree,
          as below:

          deleted bit   key    rowlocation
          -----------   ----   -----------
          false         null   1,7
          false         null   1,8
          true          a      1,9
          false         a      1,10

          I think the algorithms and edge cases are probably easier if you keep the
          rule that, disregarding the deleted bit, there can never be a duplicate non-null
          key in the tree. This makes it easier to deal with uncommitted transactions,
          duplicates, and deleted rows. If you pick this rule then you should read and
          understand the code in BTreeController that deals with this case in the
          old indexes. This code starts with the comment:
          // If the row is there already, simply undelete it.
          // The rationale for this is, since the index does
          // not support duplicates, the only way we could ...

          Remember that getting a lock on a deleted row does not mean it is a committed
          deleted row, it may have been deleted by the current transaction.

          Probably the most straightforward approach is to follow what we do currently for
          unique indexes, but apply it to this new index, which allows duplicate nulls
          but no duplicate non-null keys.

          I don't know the exact answer on locking; it is complicated, as we have never locked
          rows to the right of a row during an insert before. This can have some negative concurrency
          aspects.

          Anurag Shekhar added a comment -

          Thanks, Mike, for taking the time to explain the nuances of locking in such a clear and detailed manner. I will shortly post a document explaining the following aspects of my patch:
          1. How insert is handled.
          2. How sorting is performed (while creating the unique constraint).
          3. How soft upgrade will work.

          I have written the routine that performs the comparison while inserting. This routine is called only if all the parts of the new key are non-null. It is called for both neighboring keys (left and right of the position identified for the insert):

          1. If there is no match, return.
          2. If there is a match, check whether the matching key is already deleted.
          3. If it is not deleted, return a match (this will result in an exception).
          4. If it is deleted, try getting a lock on it.
          5. If the lock is obtained without releasing the latch (that is, without waiting), the key was deleted within the current transaction.
          6. If the lock is obtained but the latch was released, the lock was held by some other transaction, which might have changed the tree. The insert process is restarted to locate the position for the new key.

          While fetching the left and right records I am no longer checking the deleted bit, but checking it only after finding a match. The fetched record is locked only if a match is found and it is a deleted record, and even then the lock is held only if the key was deleted within the same transaction. If it was deleted by another transaction, the page latch is released, which ultimately results in rescanning the tree for the insert position.

          I am attaching a diff for BTreeController.java (I haven't yet implemented the left locking protocol). Please let me know if I am doing it right.

          Anurag Shekhar added a comment -

          Attaching the implementation document. I will merge it with the functional spec after review.

          Anurag Shekhar added a comment -

          This patch includes the following changes since v4:
          1. Fixed indentation issues.
          2. Added comments for new methods, attributes, and some code changes.
          3. Added code for soft upgrade.
          4. Removed unnecessary imports.
          5. Modified the left and right sibling comparison (details to follow).
          6. Moved the new attribute to the end of the BTree file format.
          7. Added comments about the new file format.
          8. Added new classes to support soft upgrade (as described in the implementation spec).

          Pending task:
          Hard upgrade.

          Anurag Shekhar added a comment -

          Details about changes in java/engine/org/apache/derby/impl/store/access/btree/BTreeController.java

          The new code doesn't check for deleted records, so while looking for the left and right siblings there is no need to traverse multiple pages (at most one page to the left will be latched, if the new slot is the first slot of the page, or one page to the right, if the new slot is the last slot of the page).
          The steps for duplicate detection are as follows:

          1. Get the previous record (which may or may not be on the same page).
          2. If the two records are different, unlatch the page and return NO_MATCH.
          3. If the two match, obtain the lock on the existing record.
          4. If the lock is obtained after a wait (latch lost), the tree might have been modified; start over by locating the slot for the new record.
          5. If the locked record is deleted, return no match; else return a match.

          Then do the above again for the record to the right.
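
          A self-contained sketch of this check, under heavy simplifying assumptions (a single in-memory sorted leaf; no latching, locking, or deleted-bit handling), might look like the following. The names are hypothetical and this is not Derby's actual BTreeController code:

          import java.util.ArrayList;
          import java.util.List;

          // Simplified model of the insert-time duplicate check: before inserting a
          // fully non-null key at position 'slot', compare it with the immediate
          // left and right neighbors.
          final class AlmostUniqueLeaf {
              private final List<Integer> keys = new ArrayList<>(); // sorted; null = SQL NULL

              boolean insert(Integer key) {
                  int slot = findInsertSlot(key);
                  if (key != null) { // keys containing a null are never duplicates
                      // In the real tree the search can land next to an equal key on
                      // either side, so both neighbors are checked.
                      if (slot > 0 && key.equals(keys.get(slot - 1))) return false;
                      if (slot < keys.size() && key.equals(keys.get(slot))) return false;
                  }
                  keys.add(slot, key);
                  return true;
              }

              private int findInsertSlot(Integer key) {
                  int i = 0;
                  while (i < keys.size() && compare(keys.get(i), key) < 0) i++;
                  return i;
              }

              private static int compare(Integer a, Integer b) {
                  if (a == null) return (b == null) ? 0 : -1; // nulls sort low here
                  if (b == null) return 1;
                  return a.compareTo(b);
              }
          }

          With this model, inserting null repeatedly always succeeds, while a second insert of the same non-null key returns false, matching the constraint behavior. In the real code the neighbors may sit on adjacent pages, which is why the multi-page scan and latch handling described above are needed.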

          Anurag Shekhar added a comment -

          Impact of isolation level:

          The records involved (the new record and the duplicate) are always locked for update,
          which ensures that there is an exclusive lock on the record, irrespective of the isolation level.
          So nothing special needs to be done with respect to isolation level.

          Mike Matrigali added a comment -

          o Although unlikely, a legal state for a leaf page is to have a control row but no other rows. It can
          get this way if the deleted-space background thread is able to reclaim the committed deleted rows, but can't get the table-level lock to merge the empty leaf page. Thus, to handle this, one must be ready when scanning left and/or right to visit multiple
          pages looking for a row.

          o With respect to isolation level: what kind of locking will you be doing on the rows that you are
          checking to the left and right? If you are locking those rows, how long will you hold those locks? The
          search to the right is especially interesting, as previously we would never have locked any row
          to the right of an insert. With the previous-key locking used to implement isolation levels, the code
          currently may or may not lock the left key, depending on the isolation level.

          Anurag Shekhar added a comment -

          Thanks, Mike, for pointing out the possibility of an empty page and hence the need to scan multiple pages.
          I will reintroduce the code for this: it will keep moving left (or right) until a row is found or no more leaves are available.

          Locking of the left or right row is attempted only if a duplicate is found. In that case, the matching row:
          a. Was deleted in the same transaction - so it is already locked within the same transaction.
          b. Was deleted in some other transaction - this results in losing the latch, and a rescan is performed to identify the slot (no lock).
          c. Is a valid row - an exception is thrown and the lock is released immediately.

          Irrespective of the isolation level, only an update lock is requested. The actual locking is performed only in case c, and then only for a very short time, so I think it won't have much impact on performance.

          Anurag Shekhar added a comment -

          This patch includes hard upgrade (upgrade tests from 10.3.1.4 and 10.1.3.1 run without failures).

          Description of Hard upgrade

          Hard upgrade makes the following changes:
          1. Updates the B2I class by adding an extra attribute (uniqueWhenNotNull).
          2. Updates the format id.
          3. Updates the ConglomerateDescriptor of all updated conglomerates to
          use the new version of IndexDescriptorImpl.

          A unique index meant to be the backing index of a unique constraint is updated to
          become a uniqueWhenNotNull index (nUniqueColumn is set to nKeyFields - 1
          and uniqueWhenNotNull is set to true). For other indexes the uniqueWhenNotNull
          attribute is set to false.
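
          Conceptually, the upgrade amounts to flipping two fields on the index descriptor. The following is a hypothetical sketch with invented field names, not Derby's actual B2I code:

          // Hypothetical sketch of the upgrade rule for the backing index of a
          // unique constraint. Field names are invented for illustration.
          final class IndexUniquenessUpgrade {
              int nKeyFields;            // user key columns plus the trailing row location
              int nUniqueColumn;         // leading columns that participate in uniqueness
              boolean uniqueWhenNotNull; // new attribute added by this change

              void upgradeToUniqueWhenNotNull() {
                  // Exclude the row location from duplicate checking and enable the
                  // null-tolerant uniqueness rule.
                  nUniqueColumn = nKeyFields - 1;
                  uniqueWhenNotNull = true;
              }
          }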

          Hard upgrade is initiated from DataDictionaryImpl.upgradeConglomerate
          (a new method). This method calls another newly introduced method,
          getAllConglomerates, to get the list of conglomerates, and updates them
          by calling TransactionController.updateUniquenessOfConglomerate,
          which internally calls Conglomerate.updateUniqueness to update the
          conglomerate attributes.

          I will be posting one more patch after introducing the multi-page
          scan while checking for duplicates.

          Anurag Shekhar added a comment -

This patch includes the code to scan multiple slots/pages if the slots immediately to the left and right of the identified slot return null. It continues to scan (left in the case of the previous record and right in the case of the next record) until a valid record is found in a slot or no more pages are left to check.

Every time the scan moves to a new page, the latch on the previous page is released (unless the previous page holds the slot identified for inserting the new record).

All junit test suites are running without any failure.
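A rough sketch of the leftward half of that scan follows (all names here, including the NO_MATCH constant and the helper calls, are illustrative, not the actual BTreeController code; the rightward scan is symmetric):

// Walk left from the insert slot, hopping leaf pages until a non-deleted
// row is found or the leaves run out.
private int scanLeftForValidRow(LeafControlRow targetLeaf, int insertSlot,
                                DataValueDescriptor[] rowToInsert)
    throws StandardException
{
    LeafControlRow leaf = targetLeaf;
    int slot = insertSlot - 1;
    while (true)
    {
        if (slot < 1)                          // slot 0 is the control row
        {
            LeafControlRow prev = leaf.getLeftSibling();
            if (prev == null)
                return NO_MATCH;               // no more leaves to the left
            if (leaf != targetLeaf)
                leaf.release();                // only the target leaf stays latched
            leaf = prev;
            slot = leaf.lastSlot();
            continue;
        }
        if (!leaf.isDeleted(slot))
            return compareKeys(rowToInsert, leaf, slot); // valid neighbor found
        slot--;
    }
}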

          Anurag Shekhar added a comment -

In this patch (derby-3330v8.diff) I have modified the upgrade routine to
upgrade only the backing indexes of unique constraints. Other
indexes are left unchanged.

DataDictionaryImpl.getAllConglomerates (introduced in derby-3330v6.diff) is
no longer required, so I have removed it.

Tests are running without any errors or failures.

          Mike Matrigali added a comment -

Please update your implementation spec to include your plan for hard upgrade. Once this is done it should be easier
both to review the plan and to check whether the code does what you have said is the plan.

          Mike Matrigali added a comment -

Comment on implementation details:
While creating the unique constraint, null checking is now conditional (performed only for the older version of the data dictionary). With an older data dictionary version, the backing index created is a unique index.

In order to not cause performance degradation for existing unique constraints, I would have expected the constraint creation code to use the old unique index for non-null columns and use the new index for nullable columns. The unique
index is going to perform somewhat better, as it will have less checking to do (for instance, on insert it will not have to
check if nulls are in the key and do extra searching). Also, it will make it likely that only new applications that use the new
feature will see any new problems introduced by the new index. Existing applications will continue to use the old
code.

          Anurag Shekhar added a comment -

Updated version of the implementation details document.

Added a hard upgrade section.
Updated it to indicate that comparing with the left and right records while inserting might result in traversing multiple pages.

          Anurag Shekhar added a comment -

I have removed the hard upgrade code from derby-3330v9.diff; hard upgrade will be handled separately. For now it won't be possible to make columns nullable if they were set as not null while creating the constraint. I will be submitting another patch to allow that.
This patch (derby-3330v9.diff) contains the code to create a unique constraint over nullable columns, soft upgrade, and test cases for the unique constraint.

Description of the patch can be found in UniqueConstraint_Implementation_V2.html.

Tests are running without any failures or errors.

          Anurag Shekhar added a comment -

Updated the hard upgrade section.

          Mike Matrigali added a comment -

          do you know why the master for dml019 changed?
          Index: java/testing/org/apache/derbyTesting/functionTests/master/dml019.out
          ===================================================================
--- java/testing/org/apache/derbyTesting/functionTests/master/dml019.out
          (revision 630309)
          +++ java/testing/org/apache/derbyTesting/functionTests/master/dml019.out
          (working copy)
          @@ -82,18 +82,18 @@
          GROUP BY PNUM,EMPNUM,HOURS;
          EM&|PN&|HOURS
          --------------
          -E1 |P1 |40
          -E1 |P2 |20
          -E1 |P3 |80
          -E1 |P4 |20
          +E2 |P1 |40
          +E4 |P4 |40
          E1 |P5 |12
          +E4 |P5 |80
          E1 |P6 |12
          -E2 |P1 |40
          -E2 |P2 |80
          E3 |P2 |20
          +E1 |P4 |20
          +E1 |P1 |40
          E4 |P2 |20
          -E4 |P4 |40
          -E4 |P5 |80
          +E1 |P2 |20
          +E2 |P2 |80
          +E1 |P3 |80
ij> -- PASS:0077 If 12 rows are selected ?

-- END TEST >>> 0077 <<< END TEST
          @@ -105,18 +105,18 @@
          GROUP BY EMPNUM,PNUM,HOURS;
          PN&|EM&
          -------
          -P1 |E1
          -P2 |E1
          -P3 |E1
          -P4 |E1
          +P1 |E2
          +P4 |E4
          P5 |E1
          +P5 |E4
          P6 |E1
          -P1 |E2
          -P2 |E2
          P2 |E3
          +P4 |E1
          +P1 |E1
          P2 |E4
          -P4 |E4
          -P5 |E4
          +P2 |E1
          +P2 |E2
          +P3 |E1
ij> -- PASS:0078 If 12 rows are selected ?

-- END TEST >>> 0078 <<< END TEST

          Anurag Shekhar added a comment -

dml019 tests the GROUP BY clause on unique constraint columns. When the unique constraint
was backed by a unique index, a distinct scan was used, but after making the backing index
non-unique this is no longer the case, so the results are not ordered.

I have checked the test suite from the NIST web site, and it mandates only the
number of rows, not their sequence.

          Mike Matrigali added a comment -

I am running a set of tests on the v9 patch and will post results when they are done. I am reviewing the v9 patch currently, but will likely concentrate on the store-level changes. If there is anyone with time to review the language-level code, that would be good. I would especially like help verifying that the language catalog upgrade code looks right - I have not done that before.

          Daniel John Debrunner added a comment -

I think the upgrade handling of IndexDescriptorImpl is too complex.

IndexDescriptorImpl uses a FormatableHashtable to store some of its state, and that format can handle additional or missing keys. Thus the additional boolean required by this change can simply be added to the hash table, so that writeExternal has an additional:

          fh.putBoolean ("isUniqueWhenNotNull", isUniqueWhenNotNull);

          and the readExternal can have:

if (fh.containsKey("isUniqueWhenNotNull"))
    isUniqueWhenNotNull = fh.getBoolean("isUniqueWhenNotNull");
else
    isUniqueWhenNotNull = ?? ; // whatever is the correct value for old indexes

          Of course good to comment the real code with when & why these changes were made.
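Put together, a minimal sketch of the suggested pattern might look like the following (field and key names follow this comment; the default for old indexes is left as an assumption):

public void writeExternal(ObjectOutput out) throws IOException
{
    FormatableHashtable fh = new FormatableHashtable();
    // ... existing keys written as before ...
    fh.putBoolean("isUniqueWhenNotNull", isUniqueWhenNotNull);
    out.writeObject(fh);
}

public void readExternal(ObjectInput in)
    throws IOException, ClassNotFoundException
{
    FormatableHashtable fh = (FormatableHashtable) in.readObject();
    // ... existing keys read as before ...
    if (fh.containsKey("isUniqueWhenNotNull"))
        isUniqueWhenNotNull = fh.getBoolean("isUniqueWhenNotNull");
    else
        isUniqueWhenNotNull = false;   // assumed default for old indexes
}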

          Anurag Shekhar added a comment -

Changes in this patch (derby-3330v10.diff) since derby-3330v9.diff:

Modified java/engine/org/apache/derby/impl/sql/compile/TableElementList.java to introduce a new method that checks whether any of the columns in the constraint definition can have a null value. This method is used while creating the backing index for a unique constraint: if all columns are non-nullable, a backing unique index is created; if any of the columns are nullable, a non-unique index with uniqueWhenNotNull set to true is created.

The nist master dml019.out doesn't require any change now.

I haven't finished running tests on this patch.
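A minimal sketch of the check and the decision it enables (variable names are illustrative, not the actual TableElementList code):

// If every constraint column is declared NOT NULL, keep the old, cheaper
// unique backing index; otherwise use the new uniqueWhenNotNull index.
boolean anyNullable = false;
for (int i = 0; i < constraintColumns.length; i++)
{
    if (constraintColumns[i].getType().isNullable())
    {
        anyNullable = true;
        break;
    }
}
boolean unique            = !anyNullable;
boolean uniqueWhenNotNull = anyNullable;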

          Dibyendu Majumdar added a comment -

          Hi,

          I could be talking nonsense here so please tell me to shut up if I have got the wrong end of the issue.

          If I understand correctly, one of the requirements is to allow multiple rows to be inserted where the index is unique and all columns are null. This doesn't work currently because such rows will be considered duplicates.

          Thinking very naively, I would implement this as follows:

          I am assuming that when index rows are stored, Derby always stores an additional column containing the row location, regardless of the type of index.
          Also that the search algorithm is uniform for both unique and non-unique indexes, with the only difference being that the unique index searches do not consider the extra row location column.
The following is invalid if the above assumptions are not true.

          To get the desired behaviour, I would simply change the comparison logic as follows:

          a) If non-unique index, no change.
          b) If unique index, and all indexable columns are null, then compare the extra (last) (row location) column.
          c) The else case for unique index will remain the same as now.

If the key comparison routine always used the above logic, would it not give the desired behaviour without requiring any significant changes to the existing implementation?

As I said, I may be totally off track here, so please feel free to tell me to shut up.
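For concreteness, a sketch of the proposed comparison under those assumptions (helper names like allNull and compareRowLocation are made up for illustration; DataValueDescriptor is Derby's column value type):

int compareIndexRows(DataValueDescriptor[] a, DataValueDescriptor[] b,
                     boolean unique, int keyCols)
    throws StandardException
{
    for (int i = 0; i < keyCols; i++)
    {
        int c = a[i].compare(b[i]);
        if (c != 0)
            return c;
    }
    if (!unique)
        return compareRowLocation(a, b);   // case a: non-unique, unchanged
    if (allNull(a, keyCols) && allNull(b, keyCols))
        return compareRowLocation(a, b);   // case b: all-null keys never collide
    return 0;                              // case c: duplicate, same as now
}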

          Mike Matrigali added a comment -

From running the v9 patch on the ibm15 JVM on a Windows XP laptop I got 1 failure in the junit tests and 6 failures in the old style tests - I will attach a copy of the results for the old style test failures. Are all tests passing in your environment with this patch? I will try to run another set of tests overnight with your latest patch - but do post what environment you are running the tests on and whether they all pass or not with the v10 patch.

          junit failure:
          There was 1 failure:
1) dml019(org.apache.derbyTesting.functionTests.tests.nist.NistScripts)
junit.framework.ComparisonFailure: Output at line 85 expected:<...1 |4...> but was:<...2 8...>
at org.apache.derbyTesting.functionTests.util.CanonTestCase.compareCanon(CanonTestCase.java:100)
at org.apache.derbyTesting.functionTests.util.ScriptTestCase.runTest(ScriptTestCase.java:124)
          at org.apache.derbyTesting.junit.BaseTestCase.runBare(BaseTestCase.java:101)
          at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
          at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
          at junit.extensions.TestSetup.run(TestSetup.java:23)
          at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
          at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
          at junit.extensions.TestSetup.run(TestSetup.java:23)
          at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)
          at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
          at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
          at junit.extensions.TestSetup.run(TestSetup.java:23)
          at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
          at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
          at junit.extensions.TestSetup.run(TestSetup.java:23)
          at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)
          at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
          at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
          at junit.extensions.TestSetup.run(TestSetup.java:23)
          at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)
          at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
          at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
          at junit.extensions.TestSetup.run(TestSetup.java:23)
          at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)

          The following tests failed, will attach complete diff:
          derbyall/derbyall.fail:lang/altertable.sql
          derbyall/derbyall.fail:lang/db2Compatibility.sql
          derbyall/derbyall.fail:lang/subqueryFlattening.sql
          derbyall/derbyall.fail:tools/dblook_test.java
          derbyall/derbynetclientmats/derbynetmats.fail:derbynet/dblook_test_net.java
          derbyall/storeall/storeall.fail:store/rollForwardRecovery.sql

          Mike Matrigali added a comment -

Report of diffs on the ibm15 JVM running on an XP laptop using the version 9 patch.

          Daniel John Debrunner added a comment -

          Dibyendu wrote:

          > one of the requirements is to allow multiple rows to be inserted where the index is unique and all columns are null.

          but not the only requirement

In a unique constraint these key values are allowed:

          Smith, NULL
          Smith, NULL
          Smith, NULL

          A unique index (from existing CREATE INDEX) would only allow one row with Smith,NULL
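A small, hedged demo of that difference as it would look through JDBC once this feature is in (table and constraint names are invented; assumes an embedded Derby with the new unique constraint behavior):

import java.sql.*;

public class NullableUniqueDemo
{
    public static void main(String[] args) throws SQLException
    {
        Connection conn =
            DriverManager.getConnection("jdbc:derby:demoDB;create=true");
        Statement s = conn.createStatement();
        s.execute("CREATE TABLE emp (last_name VARCHAR(30) NOT NULL, "
                + "dept VARCHAR(10), "
                + "CONSTRAINT emp_u UNIQUE (last_name, dept))");
        s.execute("INSERT INTO emp VALUES ('Smith', NULL)"); // allowed
        s.execute("INSERT INTO emp VALUES ('Smith', NULL)"); // allowed: key has a null
        s.execute("INSERT INTO emp VALUES ('Smith', 'D1')"); // allowed
        try
        {
            // No nulls in the key this time, so this is a duplicate.
            s.execute("INSERT INTO emp VALUES ('Smith', 'D1')");
        }
        catch (SQLException dup)
        {
            System.out.println("Rejected as expected: " + dup.getSQLState());
        }
        conn.close();
    }
}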

          Anurag Shekhar added a comment -

          Description of patch (derby-3330v11.diff)

Now the backing index for a unique constraint, if all
participating columns are non-nullable, will be a unique index.
For now it's not possible to set columns as nullable after
creating the unique constraint (DERBY-3456 addresses this).

I have removed IndexDescriptorImpl_10_3 and am handling the
additional attribute the way Dan has recommended.

          I am able to run junit tests without any failure.

The derbyall harness tests have two failures:
1. lang/db2Compatibility.sql
I have attached a separate diff (db2Compatibility.diff) to update the out files.
I am not sure if we should continue to call this test db2Compatibility, as this
test is not ensuring compatibility anymore (it is able to create a unique constraint
over nullable fields).

2. tools/derbyrunjartest.java
This failure was because of a missing class file in derbytools.jar
(org.apache.derby.impl.tools.run). I am getting this problem in a
fresh workspace too, so it might be something to do with my
build environment and not the patch.

I have modified the following files in the harness tests:

          1. org/apache/derbyTesting/functionTests/tests/lang/subqueryFlattening.sql
          2. org/apache/derbyTesting/functionTests/master/subqueryFlattening.out

I have removed the additional test for a unique constraint over a nullable field.
Its creation was supposed to fail, and the output query tree was based on no
additional constraint. After allowing unique constraints over nullable fields,
an additional index was introduced, which changed the query tree. After removing
the statement creating a unique constraint over a nullable field, I am getting the
same query tree as expected.

3. org/apache/derbyTesting/functionTests/master/altertable.out

The original out file was expecting the creation of a unique constraint
over nullable fields to fail. Updated the out file to expect it to succeed.

Other failures in the harness tests were because of the change in the backing index (from
unique to non-unique) for unique constraints over not-null columns. They disappeared
after making the changes related to the backing index.

          Anurag Shekhar added a comment -

db2Compatibility.diff:
a diff file to update db2Compatibility.out.

          Anurag Shekhar added a comment -

          derby-3330v12.diff

Fixed indentation issues and added a comment in IndexDescriptorImpl.
As the changes are minor (only indentation and comments) I haven't run
all the tests, just the upgrade and lang junit suites.

          Daniel John Debrunner added a comment -

          Thanks for cleaning up the IndexDescriptor upgrade related code, makes it look much cleaner.

          A minor issue I have that if needed can be handled after any patch is applied is the naming in the code of the new style of unique index.

The common theme is "uniqueWhenNotNull". When I first looked at the code I thought this was the other way around, since for index definitions I tend to think of the DDL, not the row values. Thus:

          isUnique() is true for indexes where the columns are defined NOT NULL

          isUniqueWhenNotNull() is true for indexes where the columns are nullable (but unique when the column values do not contain NULLs).

          See how that can be confusing to code readers who see the code without having been immersed in developing the code.

Not sure of a better name; re-writing it as a positive statement (always clearer) but still referring to the values would be:
          isUniqueWithDuplicateNulls()

          or a positive statement referring to the column types:
          isUniqueWithNullableColumns()

          I think a change in name would be good for the long term readability of the code, again it can be a follow on patch, fairly easy to do with IDE refactoring.

          One more aside, it would be good to expand the definition of isUnique() and the current isUniqueWhenNotNull() in IndexDescriptor to indicate any relationship between them, e.g. is isUnique() true if isUniqueWhenNotNull() is true?

          Anurag Shekhar added a comment -

UniqueWithDuplicateNulls explains the actual behavior better than
uniqueWhenNotNull. Thanks for suggesting it.
I will replace uniqueWhenNotNull with it.
In my next patch (or a follow-up patch if there is no further issue in
my latest patch) I will replace the index name and will explain the
relation between unique and uniqueWithDuplicateNulls.

          Anurag Shekhar added a comment -

The uniqueWhenNotNull attribute (in B2I and IndexDescriptor) is ignored if isUnique is true. uniqueWhenNotNull is effective only if the index is of the non-unique type.
I will add this explanation in the comments.

          Anurag Shekhar added a comment -

Please ignore derby-3330v12.diff. It had a conflict with one of the recent patches, and the test suite didn't get updated. I will be uploading a new patch.

          Anurag Shekhar added a comment -

Changes in derby-3330v13.diff:

1. Replaced uniqueWhenNotNull with uniqueWithDuplicateNulls.
2. Added a description of the relationship between unique and uniqueWithDuplicateNulls.
3. Added an entry for NullableUniqueConstraintTest in lang._Suite (this was lost due to one of the conflicts).

The upgrade test (from 10.3.1.4) runs without failure.
lang._Suite fails with 6 failures in LangScripts. I am getting these failures in a fresh workspace too (a few hours back there were no failures in lang._Suite).

          Anurag Shekhar added a comment -

junit.all and derbyall tests are running fine with derby-3330v13.diff.

derbyall has some failures that are also visible in the tinderbox.
One additional failure I am getting is tools/derbyrunjartest.java, which fails with the
same error in my fresh workspace too.

db2Compatibility.diff is not valid anymore; it has conflicts with some recent changes. I will
create a new patch once lang/db2Compatibility.sql is clean in the tinderbox.

As of now there are no pending issues in derby-3330v13.diff.

          Anurag Shekhar added a comment -

Description of db2Compatibility-v2.diff:

Removed the test case for a unique constraint over nullable columns and updated the out files accordingly.

          Mike Matrigali added a comment -

I am looking at committing a version of the latest changes, once they pass a set of tests for me and my final review. If anyone has any reservations about this going in, let me know. Overall I think the changes now affect only the new
functionality, so even if I find some non-test-related problems I am likely to commit to make it easier to address the remaining issues.

          Mike Matrigali added a comment -

          revision 633560 checked into trunk:

          DERBY-3330
          committed on behalf of Anurag Shekhar

          Committed modified versions of patch derby-3330v13.diff and
          db2Compatibility-v2.diff. The modifications are mostly changed and/or
          added comments along with some code formatting changes to make the
          new code match the surrounding code style.

This checkin adds the functionality to
create unique constraints on a key that contains one or more nullable
columns. This constraint will allow inserts of all keys that contain
one or more columns with the null value. All keys that contain no columns
with null values are constrained to be unique in the table.

          The underlying implementation for this new type of unique constraint uses
          the existing btree non-unique index with a small modification to do
          checking at insert time to provide the described unique behavior for null
          and non-null keys. The sorter has also been modified to provide this
          support for creating the index.

          Mike Matrigali added a comment -

          Here are some comments that were still unresolved in the submitted patch, but didn't seem big enough to hold up the patch and can be addressed with a subsequent patch.

          CreateIndexConstantAction.java -
          use of properties? Some of the existing code seems to sometimes expect
a Properties object to be passed in. Your new code will overwrite the saved copy of
          the properties to pass needed info to the new sort routine. I don't
          know when a property may be passed in and lost. I think you just want
          to create a new variable for sort_properties.

          sort changes
I don't like that the implementation of the unique merge sort knows about
the row location being the last column. The intent of the sorter is to be somewhat
general purpose, so it would be better to somehow encode that info into
the startup parameters. I think the null special handling is located
in the right place, i.e. in the new sort impl.

A comment explaining the why of these 2 lines in
UniqueWithDuplicateNullMergeSort.java would be nice.

          if (i == colsToCompare - 1 && nonull)
          return 0;

          BTreeController.java
Can the second call to the following code be optimized? Is it necessary?
Maybe a comment here would help.

if (getConglomerate().isUniqueWithDuplicateNulls())
{
    int ret = compareLeftAndRightSiblings(rowToInsert, insert_slot, targetleaf);
    if (ret == MATCH_FOUND)
        return ConglomerateController.ROWISDUPLICATE;
    if (ret == RESCAN_REQUIRED)
        continue;
}

          B2I.java
          I updated the upgrade comments, especially for the change in behavior for
          the 10.3 formats. You should review these to make sure they are what
          you expect.

          IndexDescriptorImpl.java
Should change the following to match the new naming:
          sb.append ("ALMOST UNIQUE");

The following is wrong (it repeats the same test), but I didn't change it -
I wanted to give you a chance to look at it. I am not sure what it affects; if
it is a bug, it would be nice to have a test case that shows the problem.

I wasn't sure if it wanted another, different term; it probably wants to be a
test for the new attribute.

          if ((id.isUnique == this.isUnique) &&
          (id.isUnique == this.isUnique) &&

          BTreeController.java
I think there may be some edge case stuff missing from the searches. I
will look more later, but it looks like a good basis for the new
functionality, so I would rather see it get checked in with more eyes on it.
Again, now that the new searches only happen on new nullable constraints, there
is less risk to existing functionality.

          I don't see upgrade tests, are they in a different JIRA/patch?

          Anurag Shekhar added a comment -

Thanks Mike for guiding me through this issue and for the commit.
I will submit a follow-up patch to address your comments.

Upgrade tests, and also the tests related to indexes you had suggested, are part of DERBY-3456.

          Anurag Shekhar added a comment -

          description of derby-3330-UpgradeTests.diff

This patch has the following tests.

Tests for the unique constraint:
1. Create a unique constraint in a lower version of Derby.
2. Test the above constraint in soft upgrade mode.
3. Create a new unique constraint in soft upgrade mode: it shouldn't be possible to create a
unique constraint over nullable columns.
4. Test the above constraint in post-soft-upgrade mode (running under the lower version).

Tests for indexes:
This test is for indexes in different modes. Different types of indexes are created
during
1. Create (running under the older version)
2. Soft upgrade mode
3. Post soft upgrade mode
4. Hard upgrade mode

In all these phases the indexes created in previous phases are tested.

The types of indexes created in each phase are
1. Unique index over a not-null field
2. Unique index over a nullable field
3. Non-unique constraint over a not-null field
4. Non-unique constraint over a nullable field

          Anurag Shekhar added a comment -

          CreateIndexConstantAction.java -
          use of properties? Some of the existing code seems to sometimes expect
a Properties object to be passed in. Your new code will overwrite the saved copy of
          the properties to pass needed info to the new sort routine. I don't
          know when a property may be passed in and lost. I think you just want
          to create a new variable for sort_properties.

This piece of code is unnecessary. I had added it initially for some debugging
purposes and forgot to remove it. Sorry about that. I will take it off.

          sort changes
I don't like that the implementation of the unique merge sort knows about
the row location being the last column. The intent of the sorter is to be somewhat
general purpose, so it would be better to somehow encode that info into
the startup parameters. I think the null special handling is located
in the right place, i.e. in the new sort impl.

A comment explaining the why of these 2 lines in
UniqueWithDuplicateNullMergeSort.java would be nice.

          if (i == colsToCompare - 1 && nonull)
          return 0;

I will add a comment in my follow-up patch and will also check how I can make the sort
independent of information about the location.
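Roughly, the comment I have in mind for those two lines (a sketch, not the final wording):

// Rows arriving here carry the row location as the last column, and
// 'nonull' means no NULL was seen in any earlier key column. Reaching the
// last column with nonull == true means the key columns are all equal and
// non-null, so report a duplicate (0) instead of letting the row location
// make the two entries look distinct. When a NULL was seen, fall through
// and compare the row location, so equal keys containing NULL stay in the
// index as deliberately ordered duplicates.
if (i == colsToCompare - 1 && nonull)
    return 0;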

          BTreeController.java
Can the second call to the following code be optimized? Is it necessary?
Maybe a comment here would help.

if (getConglomerate().isUniqueWithDuplicateNulls())
{
    int ret = compareLeftAndRightSiblings(rowToInsert, insert_slot, targetleaf);
    if (ret == MATCH_FOUND)
        return ConglomerateController.ROWISDUPLICATE;
    if (ret == RESCAN_REQUIRED)
        continue;
}

It shouldn't return from this code directly but should set the
return value to ConglomerateController.ROWISDUPLICATE, so that the necessary cleanup of the latch
is performed.

          B2I.java
          I updated the upgrade comments, especially for the change in behavior for
          the 10.3 formats. You should review these to make sure they are what
          you expect.

It looks good. It's an exact description of what will happen during soft
and hard upgrade. Thanks a lot for adding this.

          IndexDescriptorImpl.java
Should change the following to match the new naming:
          sb.append ("ALMOST UNIQUE");

          I will change this.

The following is wrong (it repeats the same test), but I didn't change it -
I wanted to give you a chance to look at it. I am not sure what it affects; if
it is a bug, it would be nice to have a test case that shows the problem.

I wasn't sure if it wanted another, different term; it probably wants to be a
test for the new attribute.

          if ((id.isUnique == this.isUnique) &&
          (id.isUnique == this.isUnique) &&

This should be checking for uniqueWithDuplicateNulls. I will update it.
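That is, the test should presumably become:

if ((id.isUnique == this.isUnique) &&
    (id.isUniqueWithDuplicateNulls == this.isUniqueWithDuplicateNulls) &&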

          BTreeController.java
I think there may be some edge case stuff missing from the searches. I
will look more later, but it looks like a good basis for the new
functionality, so I would rather see it get checked in with more eyes on it.
Again, now that the new searches only happen on new nullable constraints, there
is less risk to existing functionality.

I will go through the code and try to identify the problem areas.

          I don't see upgrade tests, are they in a different JIRA/patch?

I have attached the tests in derby-3330-UpgradeTests.diff

I am working on the follow-up patch and will upload it ASAP.

          Anurag Shekhar added a comment -

          Issues addressed in derby-3330_followup_1.diff

          CreateIndexConstantAction.java

I have changed the name of the variable and added a comment to explain what it's meant for.
I got confused while commenting last time. This property is required if
the TransactionController has to select the non-default sorter.

          sort changes

I have updated the compare method and it doesn't need to assume a location column. But it still
has the information about nulls not being equal.
I am checking how to remove this information from the comparison. It appears to me that I can remove
this custom compare method and use the existing one, but that may need some change which might affect
other parts of the code. I will wait for the 10.4 branching and will work on it in the trunk.

          BTreeController.java

I haven't optimized it yet. But instead of returning the error code directly, I am now setting the ret value so the latch
is cleared before returning.

          IndexDescriptorImpl.java

Updated both places (using the new name in toString() and using the new attribute in the equals method).

          BTreeController.java
I am still checking it out and will submit a new follow-up if I find any problem areas.

The junit suites.All suite is running without any failure with this patch.

          Anurag Shekhar added a comment -

Updated the implementation doc:
Added a section about the shared backing index.
Added a section about making a column nullable.

          Mike Matrigali added a comment -

          I am currently looking at the new patches in this issue and will commit as appropriate after review and running tests.

          Mike Matrigali added a comment -

In derby-3330_followup_1.diff I don't think the sort changes will work correctly. In the
uniqueWithDuplicateNulls case we need the sorter to sort on the row location column as well in the
case where there are 2 rows that are otherwise duplicates by store null-comparison standards.
Imagine the case of a table with a single-column nullable index that has a million rows, all
with null. Order does not really matter to the user, but when the base row is deleted the system
will want to find exactly the matching (null, (page 5, row 7)) index entry, and if the rows are not
properly sorted on the row location then the btree search for this row will probably fail.

So what we want from the sorter, if N is the number of columns including the row location column:
1) sort on N columns
2) if any column has a null, don't do any duplicate checking
3) if no column has a null, then do duplicate checking based on the leading N-1 columns

It would be nice to have a test that verifies the sorting is correct when there are many duplicate
nulls. It is a little tricky to get a good test case, as in the normal case rows scanned from a heap
to build an index have row locations in ascending order - but we should not be counting on that.
Once the sorter goes external with multiple merge runs, I think it will shuffle the rows out of
input order based on the sort keys it was told to consider. We should verify that the existing
checked-in code handles this case correctly.
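As a concrete illustration of those three rules, a minimal compare sketch, assuming DataValueDescriptor-style columns (isNull()/compare()) with the row location as the trailing column; this is not the actual UniqueWithDuplicateNullsMergeSort code, just the rule set expressed in Java:

    import org.apache.derby.iapi.error.StandardException;
    import org.apache.derby.iapi.types.DataValueDescriptor;

    // Sketch only: order on all N columns, but report a duplicate (return 0
    // before ever comparing the row location) only when no column held a null.
    private int compareRows(DataValueDescriptor[] r1, DataValueDescriptor[] r2)
            throws StandardException {
        int n = r1.length;            // N columns, row location last
        boolean nullSeen = false;
        for (int i = 0; i < n; i++) {
            if (r1[i].isNull() || r2[i].isNull())
                nullSeen = true;
            // Rule 3: every leading key column matched and none was null,
            // so this is a duplicate key regardless of row location.
            if (i == n - 1 && !nullSeen)
                return 0;
            // Rules 1 and 2: otherwise keep ordering on every column,
            // including the row location, so btree lookups stay exact.
            int c = r1[i].compare(r2[i]);
            if (c != 0)
                return c;
        }
        return 0;                     // same key and same row location
    }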

          Show
          Mike Matrigali added a comment - In the derby-3330_followup_1.diff I don't think the sort changes will work correctly. In the uniqueWithDuplicateNulls we need the sorter to sort on the rowlocation column also in the case where there are 2 rows that otherwise are duplicate by store null comparison standards. Imagine the case of a table with a single column nullable index that has a million rows all with null. Order does not really matter for user, but when the base row is deleted the system will want to exactly find the matching (null, (page 5, row7)) index entry and if the rows are not properly sorted on the row location then the btree search for this row will probably fail. So what we want from sorter if N is the number of columns including the row location column: 1) sort on N columns 2) if any column has a null don't do any duplicate checking 3) if no column has a null then duplicate checking based on leading N-1 columns It would nice to have a test that verified the sorting is correct when there are many duplicate nulls. It is a little tricky to get a good test case as the normal case is for rows scanned from a heap to build an index to have rowlocations in ascending order - but we should not be counting on that. I will have to think about this. I don't know if once you get the sorter to go external with multiple merge runs I think it will shuffle the rows from input order based on what sort keys you told it to consider. We should verify that the existing checked in code handles this case correctly.
          Hide
          Mike Matrigali added a comment -

Looking at it more carefully, would it be possible to have no sorter changes at all and handle the
required functionality in the new UniqueWithDuplicateNullsIndexSortObserver? Basically, use the
existing sorter to sort on all columns, and then in the observer look at each row: if it has a null,
let it through; if it doesn't have a null, compare it with the next row and throw an error if there
is a duplicate violation. I know there is some existing code for doing this kind of compare - see
CardinalityCounter.java.
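A rough sketch of that observer-based alternative, under the same assumptions as the compare sketch above (the real SortObserver interface is richer; the method and field names here are hypothetical): after a plain sort on all N columns, duplicates are adjacent, so one row of look-behind suffices:

    import org.apache.derby.iapi.error.StandardException;
    import org.apache.derby.iapi.reference.SQLState;
    import org.apache.derby.iapi.types.DataValueDescriptor;

    // Sketch only: rows containing a null always pass; adjacent non-null
    // rows with equal keys are duplicate violations.
    private DataValueDescriptor[] prev;   // last non-null row seen

    void observeSortedRow(DataValueDescriptor[] row) throws StandardException {
        int keyCols = row.length - 1;     // exclude trailing row location
        boolean hasNull = false;
        for (int i = 0; i < keyCols; i++) {
            if (row[i].isNull()) {
                hasNull = true;
                break;
            }
        }
        if (!hasNull && prev != null) {
            boolean dup = true;
            for (int i = 0; i < keyCols; i++) {
                if (row[i].compare(prev[i]) != 0) {
                    dup = false;
                    break;
                }
            }
            if (dup)   // assumed message id; 23505 is the duplicate-key state
                throw StandardException.newException(
                        SQLState.LANG_DUPLICATE_KEY_CONSTRAINT);
        }
        prev = hasNull ? null : row;      // a real impl would clone the row
    }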

          Show
          Mike Matrigali added a comment - looking at it more carefully, would it be possible to not have any sorter changes and handle the required functionality in the new UniqueWithDuplicateNullsIndexSortObserver ? Basically use the existing sorter to sort on all columns and then in the observer look at each row and if it has a null let it through. If it doesn't have a null compare it with the next row and throw an error if it has a duplicate violation? I know there is some existing code for doing this kind of compare - see CardinalityCounter.java.
          Hide
          Anurag Shekhar added a comment -

Thanks, Mike, for the review.

It looks like the earlier sorter routine (before the follow-up changes) was doing the right thing.
It had the same compare routine as UniqueSort, with an additional check for non-nulls at the
beginning of the comparison loop:

if (i == colsToCompare - 1 && nonull)
    return 0;

So if the comparison has reached the last entry (the location) and no null has been found so far,
it considers the rows duplicates. If one of the parts was null (nonull will be false), it goes
ahead and compares the location too.

I will revert this change from the follow-up patch and add comments to the sorting code in my new
follow-up patch.

I will also try to come up with a good test for the scenario you explained.

          Anurag Shekhar added a comment -

          looking at it more carefully, would it be possible to not have any sorter changes and handle the required functionality in the
          new UniqueWithDuplicateNullsIndexSortObserver ?

I think it can be done. I started to do this initially, but I was getting some assertion failures at a different location (while constructing the BTree). I am sure it would have been a small issue to fix, but I felt a little nervous about changing something which might affect other functionality. I plan to work on it after the 10.4 branch is separated from the trunk.

          Mike Matrigali added a comment -

Here is the modified follow-up patch that I am looking at committing if it passes tests - tests running now. It has all your follow-up changes except the sorter changes. It also has some spelling/comment fixes.
If no other problems are found with the existing sorter implementation, I am OK leaving it the way it is for now; if new problems turn up, I think the sort-observer solution is cleaner for the future.

If you have some added comments for the sorter file, just submit them as a separate patch, and I can commit those easily without worrying about running tests.

          Anurag Shekhar added a comment -

          Added comments in UniqueWithDuplicateNullsMergeSort.java

          Mike Matrigali added a comment -

I have committed a modified version of the sortercomments.diff patch.

          Mike Matrigali added a comment -

I have committed the derby-3330_followup_1_modified.diff patch to the trunk. I believe this is the last outstanding patch for this JIRA, so I am unchecking Patch Available.

          Bryan Pendleton added a comment -

I came across this interesting discussion of the behavior of NULLs and UNIQUE constraints
in various database systems, and thought it might be useful in the context of this
discussion: http://www.sqlite.org/nulls.html

          Kristian Waagan added a comment -

Does anyone know the status of this issue and its sub-issues?
Has the work been completed, so that the Jiras can be resolved/closed (and the fix version updated as well)?

          Andras Soltesz added a comment -

Does anyone know when this and DERBY-2212 will make it into a release?

This is a very important piece of functionality, and it is really bad that Derby is not Oracle/PostgreSQL compliant here, since Derby is often used as a development database or single-user production database in applications which use Oracle/PostgreSQL for their multi-user production databases. With a behavioral difference like this, Derby cannot be used in that scenario, since such apps are usually complex enough to use this database feature.

I just bumped into it when my application became complex enough to use unique indexes on NULLable columns. I have used the Oracle-like unique index behavior for ages on Oracle and PG, and now I don't know how to support it on Derby. It can be replaced only with complex application logic.

          Kristian Waagan added a comment -

Have you tried the 10.4.2.0 release?
The functionality made it into 10.4; see the release notes, http://db.apache.org/derby/releases/release-10.4.1.3.html - "New Features".

However, I think the relevant issues in Jira are a bit out of sync with reality, which is why I raised the issue of updating them.
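For anyone finding this thread later, a minimal JDBC sketch of the 10.4 behavior (the database, table, and constraint names are made up for illustration; 23505 is the standard duplicate-key SQLState):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class NullableUniqueDemo {
        public static void main(String[] args) throws SQLException {
            // Assumes the embedded driver is on the classpath (JDBC 4 auto-load).
            Connection c = DriverManager.getConnection(
                    "jdbc:derby:demoDB;create=true");
            Statement s = c.createStatement();
            s.execute("CREATE TABLE t (a INT, b INT)");
            // 10.4+: a unique constraint over nullable columns is accepted.
            s.execute("ALTER TABLE t ADD CONSTRAINT u UNIQUE (a, b)");
            s.execute("INSERT INTO t VALUES (1, NULL)");   // ok
            s.execute("INSERT INTO t VALUES (1, NULL)");   // still ok: key has a null
            s.execute("INSERT INTO t VALUES (1, 2)");      // ok
            try {
                s.execute("INSERT INTO t VALUES (1, 2)");  // no nulls: duplicate
            } catch (SQLException e) {
                System.out.println(e.getSQLState());       // expect 23505
            }
            c.close();
        }
    }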

          Mike Matrigali added a comment -

          I committed a bunch of this work. I believe this can be resolved/closed with all the work checked into 10.4.

          Myrna van Lunteren added a comment -

          As the functionality made it into 10.4.1.3, marking as fixed in 10.4.1.3.

          ASF subversion and git services added a comment -

          Commit 1545394 from Dag H. Wanvik in branch 'code/trunk'
          [ https://svn.apache.org/r1545394 ]

          DERBY-532 Support deferrable constraints

          Patch derby-532-post-scan-4 implements basic support for deferred
          constraints for PRIMARY KEY and UNIQUE constraints. Deferrable
          constraints are not enabled by default yet; one needs to set a
          property to try the feature: "derby.constraintsTesting".

          This patch enables deferred constraints for:

          a) primary key constraints
          b) unique constraint with not nullable columns
          c) unique constraint with nullable columns

          by new logic in insertion and sorts.

          The patch includes relaxing the constraint at insertion and update
          time, as well as adding a constraint to an existing table.

          Derby treats constraints a) and b) the same, and in the code these are
          marked as "unique" when they are not deferrable (as in existing code).

          Constraint type c) is currently marked as
          "uniqueWithDuplicateNulls". Insert/update of these is implemented in
          the BTree by including the RowLocation of the base row in the set of
          keys in the index row (DERBY-3330). This makes them trivially unique,
          but there is an extra code path in BTreeController that checks
          neighbor rows for duplicates, and only allows insertion if the key
          contains a null. When adding a constraint to an existing table, these
          are handled by a specially crafted sorter
          (UniqueWithDuplicateNullsMergeSort).

          The implementation of insert/update of deferrable indexes is based on
          a similar approach, i.e. by backing with a non-unique index, and checking
          duplicates in the language layer, notably IndexChanger.

          In IndexChanger, after inserting a row, we check if it is unique by
          performing a scan of the BTree. A time-out here leads to a pessimistic
          assumption that there is a duplicate. Duplicate key values are saved
          until checking time (usually commit time), when a new scan is
          performed to validate the uniqueness property.

[This means that a) and b), if deferrable, are no longer marked "unique".]

          Deferrable indexes are not shared.

          If there are duplicates and we have deferred constraint mode (a
          dynamic session property), we save the duplicate index row in a disk
          based hash table (DeferredDuplicates#rememberDuplicate).

          For a) and b), constraints which are deferrable are marked as
          "uniqueDeferrable" and "hasDeferrableChecking". Constraints of type c)
          which are deferrable are marked "uniqueWithDuplicateNulls" and
          "hasDeferrableChecking". These marks determines the code paths
          used. Note that existing indexes and non-deferrable constraint do not
          get a new code path, which should preserve correctness and performance
          of those.

          Now, with these markers in place, deferral of checks happens in three
          places:

IndexChanger#insertAndCheckDups

CreateIndexConstantAction#executeConstantAction, plus
MergeSort#compare and UniqueWithDuplicateNullsMergeSort#compare

InsertResultSet#setUpAllSorts

The first is active for deferral under INSERT and UPDATE; the second
when adding a deferrable constraint to an existing table, where we sort
the existing rows and detect any duplicates; the last is used when
importing rows.

          At transaction commit (1), or when the constraint mode for a deferred
          constraint is changed back to immediate (2), we validate the
          constraint (DeferredDuplicates#validate) by replaying the hash table
          and scanning the index for the duplicate index rows to ascertain there
          are none, or else we have an error: transaction or statement severity
          respectively for (1) and (2).

          The constraint mode is a SQL session level variable, and inside
          routines (nested connections), we push this on the stack. This means
a change of the constraint mode inside nested connections will be popped
on routine exit. If, as part of this, a constraint changes from
deferred to immediate mode, we also validate it for correctness. If
this fails, the transaction rolls back.
We needed to do this from a newly introduced method,
          GenericLanguageConnectionContext#popNestedSessionContext. This
          pops the SQL session context.
          That hook is called from GenericPreparedStatement#executeStmt. As a
          part of this effort, we also renamed #setupNestedSessionContext to
          #pushNestedSessionContext.

          The patch also adds support for checking deferred constraints in
          xa_prepare and xa_commit (.., true), cf. specification attached to the
          JIRA issue.

          Concurrency: if a transaction gets a lock time-out when trying to
          establish if a row just inserted is a duplicate (another transaction
          may have just inserted a row with a similar index key), we use a
pessimistic assumption and add that key to the set of keys to be
checked at commit time. If a key lock can't be grabbed at that point, a
time-out is thrown. We plan to add an optimized scan to avoid waiting
for the lock at insertion time, cf. DERBY-6419.

          The "not enforced" feature is not yet implemented in this patch.

Several new test cases have been added to ConstraintCharacteristicsTest
to test these basic behaviors.
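To make the flow above concrete, a minimal sketch of using a deferred unique constraint (hedged: this uses the SQL-standard constraint-characteristics syntax and the derby.constraintsTesting gate named in the commit message; at this commit the feature was not yet enabled by default, and details may have changed since):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class DeferredUniqueSketch {
        public static void main(String[] args) throws SQLException {
            // Gate named in the commit message above.
            System.setProperty("derby.constraintsTesting", "true");
            Connection c = DriverManager.getConnection(
                    "jdbc:derby:deferredDB;create=true");
            c.setAutoCommit(false);
            Statement s = c.createStatement();
            s.execute("CREATE TABLE t (a INT, CONSTRAINT u UNIQUE (a) "
                    + "DEFERRABLE INITIALLY DEFERRED)");
            s.execute("INSERT INTO t VALUES (1), (1)");  // accepted: check deferred
            try {
                c.commit();  // the deferred validation runs at commit
            } catch (SQLException e) {
                System.out.println(e.getSQLState());     // duplicate found at commit
            }
            c.close();
        }
    }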


People

• Assignee:
  Anurag Shekhar
• Reporter:
  Anurag Shekhar