Derby
  1. Derby
  2. DERBY-13

Quoted names with embedded period mishandled in from list

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 10.0.2.0
    • Fix Version/s: 10.1.1.0
    • Component/s: SQL
    • Labels:
      None

      Description

      Opening this bug on behalf of Satheesh Bandaram

      ---------------------------------------------------------
      The compiler mishandles quoted names with embedded periods when
      checking uniqueness of table names in the from list of a
      SELECT. Consider the following SQL:
      create table "S1.T1" (id int not null primary key, d1 int);
      create schema s1;
      create table s1.t1 (id int not null primary key, d2 int);
      select * from s1.t1, "S1.T1" where s1.t1.id = "S1.T1".id;
      Derby throws an error on the final SELECT statement:
      "ERROR 42X09: The table or alias name 'S1.T1' is used more than
      once in the FROM list". However s1.t1 and "S1.T1" are different
      tables.

      1. Derby-13.diff
        12 kB
        Shreyas Kaushik
      2. Derby-13.patch
        1 kB
        Shreyas Kaushik
      3. derbylang_report.txt
        12 kB
        Shreyas Kaushik

        Activity

        Hide
        Shreyas Kaushik added a comment -

        Attached is the patch for this bug. Please review and give comments.

        Show
        Shreyas Kaushik added a comment - Attached is the patch for this bug. Please review and give comments.
        Hide
        Jan Hlavatý added a comment -

        I believe SQL92 limits table names to letters, numbers and underscore characters (with underscore and number not being first character of the identifier), so if the first create table statement creates table with actual name "S1.T1" it is probably wrong.

        Some syntax definitions from ANSI X3.135-1992:

        <table name> ::= <qualified name> | <qualified local table name>
        <qualified local table name> ::= MODULE <period> <local table name>
        <local table name> ::= <qualified identifier>
        <qualified identifier> ::= <identifier>
        <qualified name> ::= [ <schema name> <period> ] <qualified identifier>
        <identifier> ::= [ <introducer><character set specification> ] <actual identifier>
        <actual identifier> ::= <regular identifier> | <delimited identifier>
        <regular identifier> ::= <identifier body>
        <identifier body> ::= <identifier start> [

        { <underscore> | <identifier part> }

        ... ]
        <identifier part> ::= <identifier start> | <digit>

        1) An <identifier start> is one of:
        a) A <simple Latin letter>; or
        b) A character that is identified as a letter in the character repertoire identified by the <module character set specification> or by the <character set specification>; or
        c) A character that is identified as a syllable in the character repertoire identified by the <module character set specification> or by the <character set specification>; or
        d) A character that is identified as an ideograph in the character repertoire identified by the <module character set specification> or by the <character set specification>.

        Show
        Jan Hlavatý added a comment - I believe SQL92 limits table names to letters, numbers and underscore characters (with underscore and number not being first character of the identifier), so if the first create table statement creates table with actual name "S1.T1" it is probably wrong. Some syntax definitions from ANSI X3.135-1992: <table name> ::= <qualified name> | <qualified local table name> <qualified local table name> ::= MODULE <period> <local table name> <local table name> ::= <qualified identifier> <qualified identifier> ::= <identifier> <qualified name> ::= [ <schema name> <period> ] <qualified identifier> <identifier> ::= [ <introducer><character set specification> ] <actual identifier> <actual identifier> ::= <regular identifier> | <delimited identifier> <regular identifier> ::= <identifier body> <identifier body> ::= <identifier start> [ { <underscore> | <identifier part> } ... ] <identifier part> ::= <identifier start> | <digit> 1) An <identifier start> is one of: a) A <simple Latin letter>; or b) A character that is identified as a letter in the character repertoire identified by the <module character set specification> or by the <character set specification>; or c) A character that is identified as a syllable in the character repertoire identified by the <module character set specification> or by the <character set specification>; or d) A character that is identified as an ideograph in the character repertoire identified by the <module character set specification> or by the <character set specification>.
        Hide
        Jack Klebanoff added a comment -

        The SQL standard does allow just about anything inside a <delimited identifier>. A delimited identifier is a string delimited by double quotes. For instance, the SQL standard allows
        create table "S1.T1" ...
        select * from "S1.T1"

        Show
        Jack Klebanoff added a comment - The SQL standard does allow just about anything inside a <delimited identifier>. A delimited identifier is a string delimited by double quotes. For instance, the SQL standard allows create table "S1.T1" ... select * from "S1.T1"
        Hide
        Rick Post added a comment -

        I think that the proposed patch is either incorrect or incomplete.

        1. There are two 'equals' methods in Tablename.java: one takes only a table name argument and the other takes both table name and schema name arguments. Don't they both need to be modified?

        2. The proposed change to the first 'equals' method in Tablename.java returns false if the argument table name is null. But the comment block at the beginning of the method states that 'TableNames are equal if their both their schemaNames and tableNames are

        • equal, or if this node's full table name is null'. If this comment is correct then 'true' should be returned if 'this nodes full table name is null' whether the argument is null or not. If this comment is not correct then it should be corrected. Since the same comment appears before the second 'equals' method the second method will need to be modified to reflect the same business rule.
          before the
        Show
        Rick Post added a comment - I think that the proposed patch is either incorrect or incomplete. 1. There are two 'equals' methods in Tablename.java: one takes only a table name argument and the other takes both table name and schema name arguments. Don't they both need to be modified? 2. The proposed change to the first 'equals' method in Tablename.java returns false if the argument table name is null. But the comment block at the beginning of the method states that 'TableNames are equal if their both their schemaNames and tableNames are equal, or if this node's full table name is null'. If this comment is correct then 'true' should be returned if 'this nodes full table name is null' whether the argument is null or not. If this comment is not correct then it should be corrected. Since the same comment appears before the second 'equals' method the second method will need to be modified to reflect the same business rule. before the
        Hide
        Satheesh Bandaram added a comment -

        I believe the proposed patch is incorrect. The patch would incorrectly accept statements like
        select * from t1 a, t2 a where a.i = 5

        The current checking for duplicate exposed names is correct. I think the problem here is caused because Derby currently stores both S1.T1 and "S1.T1" tables as having the same exposed name, S1.T1. At the compile time, we are not able tell if it was originally S1.T1 or "S1.T1"

        One solution might be to store "delimited identifiers" along with their quotes... In this case, one table would have the exposed name S1.T1 and the other "S1.T1". But this might have impact on some other parts of the code.

        Note that following queries are currently accepted, when both should fail:
        ij> select s1.t1.* from "S1.T1";
        ID |D1
        -----------------------

        0 rows selected
        ij> select "S1.T1".* from s1.t1;
        ID |D2
        -----------------------

        0 rows selected

        Show
        Satheesh Bandaram added a comment - I believe the proposed patch is incorrect. The patch would incorrectly accept statements like select * from t1 a, t2 a where a.i = 5 The current checking for duplicate exposed names is correct. I think the problem here is caused because Derby currently stores both S1.T1 and "S1.T1" tables as having the same exposed name, S1.T1. At the compile time, we are not able tell if it was originally S1.T1 or "S1.T1" One solution might be to store "delimited identifiers" along with their quotes... In this case, one table would have the exposed name S1.T1 and the other "S1.T1". But this might have impact on some other parts of the code. Note that following queries are currently accepted, when both should fail: ij> select s1.t1.* from "S1.T1"; ID |D1 ----------------------- 0 rows selected ij> select "S1.T1".* from s1.t1; ID |D2 ----------------------- 0 rows selected
        Hide
        Jan Hlavatý added a comment -

        I think you need to store table name as a pair <schma,name>, not as a single String.
        As delimited identifiers can contain arbitrary characters, you dont have a reliable delimiter character to separate, and any escaping mechanism would be costly.

        Show
        Jan Hlavatý added a comment - I think you need to store table name as a pair <schma,name>, not as a single String. As delimited identifiers can contain arbitrary characters, you dont have a reliable delimiter character to separate, and any escaping mechanism would be costly.
        Hide
        Daniel John Debrunner added a comment -

        Rather than inventing a new way, the exposed table name should be a TableName rather than a simple String. (org.apache.derby.impl.sql.compile.TableName). Some of the query nodes already have a method TableName getExposedTableName(), all instances and uses of String getExposedName() should be changed to TableName getExposedTableName().

        The change is probably a little more involved than that, but that should be the general idea.

        Show
        Daniel John Debrunner added a comment - Rather than inventing a new way, the exposed table name should be a TableName rather than a simple String. (org.apache.derby.impl.sql.compile.TableName). Some of the query nodes already have a method TableName getExposedTableName(), all instances and uses of String getExposedName() should be changed to TableName getExposedTableName(). The change is probably a little more involved than that, but that should be the general idea.
        Hide
        Shreyas Kaushik added a comment -

        In response to Sateesh's comments:

        Should the behaviour for the two queries you mentioned in your example, be taken care of at the compile time or bind time ? I feel they should be probably under a seperate bug, shouldn't they ? As this is likely to have a much larger impact.

        For this bug it is how the from list behaves and the comparision that needs to go in here, as Dan has suggested used the getExposedTableName approach.

        Show
        Shreyas Kaushik added a comment - In response to Sateesh's comments: Should the behaviour for the two queries you mentioned in your example, be taken care of at the compile time or bind time ? I feel they should be probably under a seperate bug, shouldn't they ? As this is likely to have a much larger impact. For this bug it is how the from list behaves and the comparision that needs to go in here, as Dan has suggested used the getExposedTableName approach.
        Hide
        Satheesh Bandaram added a comment -

        I think this is a bind issue, still. For the following query:

        select "S1.T1".* from s1.t1;

        "S1.T1" is being considered same as s1.t1. This seems to happen in FromList.java:

        public ResultColumnList getResultColumnsForList(String allTableName,ResultColumnList inputRcl,Name tableName)
        ........................
        /* If allTableName is non-null, then we must check to see if it matches

        • our exposed name.
          */
          if (allTableName != null && ! allTableName.equals(getExposedName())) { return null; }

        If all exposed names are being changed to TableNames, like Dan suggested, then this comparison here should not be a string based comparison.

        Show
        Satheesh Bandaram added a comment - I think this is a bind issue, still. For the following query: select "S1.T1".* from s1.t1; "S1.T1" is being considered same as s1.t1. This seems to happen in FromList.java: public ResultColumnList getResultColumnsForList(String allTableName,ResultColumnList inputRcl,Name tableName) ........................ /* If allTableName is non-null, then we must check to see if it matches our exposed name. */ if (allTableName != null && ! allTableName.equals(getExposedName())) { return null; } If all exposed names are being changed to TableNames, like Dan suggested, then this comparison here should not be a string based comparison.
        Hide
        Jan Hlavatý added a comment -

        Maybe the table names could be converted to canonical form before being used/compared?
        After all, s1.t1 is just an easy way to write "S1"."T1", and should be threated as such.

        Show
        Jan Hlavatý added a comment - Maybe the table names could be converted to canonical form before being used/compared? After all, s1.t1 is just an easy way to write "S1"."T1", and should be threated as such.
        Hide
        Shreyas Kaushik added a comment -

        In response to Sateesh's comment:

        Here in this method the allTableName is just a String and we cannot know the schema name for that table. Only for the table in the from list can the schema be known. So it is still a compile / bind time issue.

        Show
        Shreyas Kaushik added a comment - In response to Sateesh's comment: Here in this method the allTableName is just a String and we cannot know the schema name for that table. Only for the table in the from list can the schema be known. So it is still a compile / bind time issue.
        Hide
        Amit Handa added a comment -

        I think we need to understand that we are using String for TableName and not as an object of type TableName.java. Hence associating a schema with a table name which would otherwise have been possible(TableName.setSchemaName) cannot happen, is a problem.

        Secondly irrespective of whether a user gives a schema name (prefixes or not) we need to store the schema name(after clearly identifying the schema name), for every table name and cannot assume it to be NULL, else we get in problems like this.

        So

        1. either change from String tableName to TableName tableName.

        2. build a tableName object(of type TableName object) from the String and try to get handle to schema name (using the select query String as well).

        Note : the basic objective of either approach is to get handle to a schema name as and when it is required.

        1. is complex and may involve lot of changes wheras 2. is fairly simple.
        If we can achive using 2 that will be better.

        Show
        Amit Handa added a comment - I think we need to understand that we are using String for TableName and not as an object of type TableName.java. Hence associating a schema with a table name which would otherwise have been possible(TableName.setSchemaName) cannot happen, is a problem. Secondly irrespective of whether a user gives a schema name (prefixes or not) we need to store the schema name(after clearly identifying the schema name), for every table name and cannot assume it to be NULL, else we get in problems like this. So 1. either change from String tableName to TableName tableName. 2. build a tableName object(of type TableName object) from the String and try to get handle to schema name (using the select query String as well). Note : the basic objective of either approach is to get handle to a schema name as and when it is required. 1. is complex and may involve lot of changes wheras 2. is fairly simple. If we can achive using 2 that will be better.
        Hide
        Shreyas Kaushik added a comment -

        I am proposing a fix for this issue. I am attaching the diffs of all the files that need to have changes.

        As per the discussions we have had on this issue the problem was with comparing exposed names. I have changed the comparision of all exposed names to the TableName objects.

        As Sateesh had suggested in one of his earlier mails on this issue,

        public ResultColumnList getResultColumnsForList(String allTableName,ResultColumnList inputRcl,Name tableName)

        method in the FromTable.java the "allTableName" needs to be a TableName object instead of a String object. I have made those changes as well for the comparision to happen between TableName objects.

        For the above change to happen we need to get a handle to the TableName object that is stored in the java/engine/org/apache/derby/impl/sql/compile/AllResultColumn. As of now it does not have a method to actually get a handle to the TableName object stored in this. So I added a new method that returns a handle to the TableName object stored internally.

        I also ran the tests of derbylang and am attaching the diffs of them as well. Some of the tests fail there and I don't know the exact reason for the failures.( I think they are not failures )

        Show
        Shreyas Kaushik added a comment - I am proposing a fix for this issue. I am attaching the diffs of all the files that need to have changes. As per the discussions we have had on this issue the problem was with comparing exposed names. I have changed the comparision of all exposed names to the TableName objects. As Sateesh had suggested in one of his earlier mails on this issue, public ResultColumnList getResultColumnsForList(String allTableName,ResultColumnList inputRcl,Name tableName) method in the FromTable.java the "allTableName" needs to be a TableName object instead of a String object. I have made those changes as well for the comparision to happen between TableName objects. For the above change to happen we need to get a handle to the TableName object that is stored in the java/engine/org/apache/derby/impl/sql/compile/AllResultColumn. As of now it does not have a method to actually get a handle to the TableName object stored in this. So I added a new method that returns a handle to the TableName object stored internally. I also ran the tests of derbylang and am attaching the diffs of them as well. Some of the tests fail there and I don't know the exact reason for the failures.( I think they are not failures )
        Hide
        Shreyas Kaushik added a comment -

        Latest proposed patch for Derby-13

        Show
        Shreyas Kaushik added a comment - Latest proposed patch for Derby-13
        Hide
        Shreyas Kaushik added a comment -

        Test run report with diffs of tests that are failing

        Show
        Shreyas Kaushik added a comment - Test run report with diffs of tests that are failing
        Hide
        Daniel John Debrunner added a comment -

        Committed patch from Shreyas Kaushik <Shreyas.Kaushik@Sun.COM>

        svn revision 125980

        Show
        Daniel John Debrunner added a comment - Committed patch from Shreyas Kaushik <Shreyas.Kaushik@Sun.COM> svn revision 125980
        Hide
        Shreyas Kaushik added a comment -

        Fixed, as per the earlier comment from Dan

        Show
        Shreyas Kaushik added a comment - Fixed, as per the earlier comment from Dan
        Hide
        Shreyas Kaushik added a comment -

        Patch submitted

        Show
        Shreyas Kaushik added a comment - Patch submitted

          People

          • Assignee:
            Shreyas Kaushik
            Reporter:
            Ramandeep Kaur
          • Votes:
            2 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development