DERBY-2731

String literal constants currently take the collation of the compilation schema but the wiki page http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478 expects USER schema collation.

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 10.3.1.4
    • Fix Version/s: 10.3.1.4
    • Component/s: SQL
    • Labels:
      None

      Description

      Some time back I checked in code that sets the collation type of a string literal to that of the compilation schema. The advantage is that metadata queries work without changes, since those queries do character string literal comparisons.

      But the wiki page at http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478, in section Collation Determination, Rule 1), says that a character string literal should always take the collation of the user schema. This decision was based on the Collation feature discussion thread at http://www.nabble.com/Collation-feature-discussion-tf3418026.html#a9675967. The SQL spec leaves the behavior here implementation-defined (it says so in a convoluted way, which can be found in the Collation feature discussion). But considering the impact on the metadata queries (they would have to be changed to CAST character string literals so that they take the collation of the system schema and the comparison does not fail), should we reconsider the decision made on the wiki page?
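
To make the trade-off above concrete, here is a minimal Python sketch, not Derby code. It is a toy model under stated assumptions: system schemas use one collation, user schemas use another, metadata queries compile in a system schema, and comparing operands with different collations fails. All names (`Operand`, `literal`, `cast_to_system`, `compare_equal`, the two collation constants as used here) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Assumed collations for this toy model only.
UCS_BASIC = "UCS_BASIC"              # system schemas
TERRITORY_BASED = "TERRITORY_BASED"  # user schemas


@dataclass(frozen=True)
class Operand:
    text: str
    collation: str


class CollationMismatch(Exception):
    """Raised when two operands with different collations are compared."""


def literal(text, compilation_schema_collation, rule):
    """Build a string literal under one of the two rules being debated."""
    if rule == "compilation":
        # Current behavior: follow the schema the statement compiles in.
        return Operand(text, compilation_schema_collation)
    if rule == "user":
        # Wiki Rule 1: always take the user-schema collation.
        return Operand(text, TERRITORY_BASED)
    raise ValueError(f"unknown rule: {rule}")


def cast_to_system(operand):
    """Model of the CAST workaround: force the system-schema collation."""
    return Operand(operand.text, UCS_BASIC)


def compare_equal(left, right):
    """Equality that refuses to compare operands of different collations."""
    if left.collation != right.collation:
        raise CollationMismatch(f"{left.collation} vs {right.collation}")
    return left.text == right.text


# A metadata query compares a system-table column with a literal.
system_column = Operand("TABLE", UCS_BASIC)
under_current_rule = literal("TABLE", UCS_BASIC, rule="compilation")
under_wiki_rule = literal("TABLE", UCS_BASIC, rule="user")
```

In this model, `compare_equal(system_column, under_current_rule)` succeeds, `compare_equal(system_column, under_wiki_rule)` raises `CollationMismatch`, and wrapping the second literal in `cast_to_system` makes the comparison succeed again, which corresponds to the metadata query rewrite the description mentions.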

          Activity

          Mike Matrigali added a comment -

          I believe that character string literals should take the collation of the current compile schema; that seems to follow the spec more closely and to be consistent with other settings. So I think the current implementation is OK and the wiki spec should be updated to reflect this behavior. The docs will also need to change if this has been incorporated anywhere into the Derby docs already.

          Daniel John Debrunner added a comment -

          Looking at other databases I see

          Microsoft SQL Server - string literals take on default collation of database
          MySQL - string literals take on default collation of connection
          Postgres - only supports single collation per database (????)

          Any others? Oracle? (I couldn't find mention of collation in Oracle 10g's character set section)

          One issue with the per-schema approach is statement caching. Currently statement caching is done per schema, so there's no problem.
          However, for statements that don't depend on the current schema it would be good to cache them across schemas, e.g.

          SELECT * FROM A.T

          Especially when the default schema for a user is specific to that user.
          With string literals taking information from the current schema, now a statement like:

          SELECT * FROM A.T WHERE TYPE = 'CAR'

          will be dependent on the current schema, and thus not shareable (because the collation for 'CAR' requires a lookup of the current schema).

          In fact, thinking about it, it does look strange that such a statement is dependent on the collation of the current schema. I'm not sure that's what an application developer will be expecting when they write a statement like that (principle of least surprise).

          I'm not sure what's right here, just trying to expand the discussion so all angles have been looked at.

          In some ways it would seem useful in TYPE = 'CAR' for 'CAR' to take on the collation of TYPE, but it's clear in the SQL standard that both have implicit collation derivation (even though we can't work out what the collation type of 'CAR' is defined to be).
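
The statement caching concern raised above can be sketched with a toy cache in Python. This is not Derby's statement cache; `StatementCache`, `prepare`, and the `schema_dependent` flag are hypothetical names, and the sketch assumes the compiler can tell whether a statement depends on the current schema.

```python
class StatementCache:
    """Toy cache: schema-independent statements share one entry across schemas."""

    def __init__(self):
        self._plans = {}
        self.compilations = 0

    def prepare(self, sql, current_schema, schema_dependent):
        # A schema-dependent statement is keyed by (sql, schema);
        # a schema-independent one is keyed by the SQL text alone.
        key = (sql, current_schema if schema_dependent else None)
        if key not in self._plans:
            self.compilations += 1
            self._plans[key] = f"plan-{self.compilations}"
        return self._plans[key]


def count_compilations(literal_uses_current_schema):
    """Two users, each with their own default schema, run the same statement."""
    cache = StatementCache()
    sql = "SELECT * FROM A.T WHERE TYPE = 'CAR'"
    for schema in ("ALICE", "BOB"):
        # The table is schema-qualified, so only the literal can make
        # the statement depend on the current schema.
        cache.prepare(sql, schema, schema_dependent=literal_uses_current_schema)
    return cache.compilations
```

With literals tied to the current schema, `count_compilations(True)` compiles the statement once per schema; if literals did not depend on the current schema, `count_compilations(False)` would compile it once and share the plan.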

          Mamta A. Satoor added a comment -

          I am closing this Jira entry. If we find something concrete in the SQL spec which requires a change in the current behavior, then we can open another Jira entry.

          Myrna van Lunteren added a comment -

          As no code has gone in, I think this should not get status 'fixed'.


            People

            • Assignee:
              Mamta A. Satoor
              Reporter:
              Mamta A. Satoor