Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 10.3.1.4
    • Component/s: SQL
    • Labels:
      None

      Description

      JPA is requiring databases to support this ANSI feature esp the ability to chose the trimmed character

      TRIM([ [ LEADING | TRAILING | BOTH ] [ chars ] FROM ] str)

      1. 1623-parser-guess.diff
        4 kB
        Andrew McIntyre
      2. AnsiTrimTest.java
        2 kB
        Manish Khettry
      3. compile_error.jj
        2 kB
        Andrew McIntyre
      4. sqlgrammar.jj.diff
        3 kB
        Manish Khettry
      5. 1623.patch.txt
        20 kB
        Manish Khettry
      6. 1623_take2.patch.txt
        21 kB
        Manish Khettry
      7. d1623_v3.patch
        21 kB
        A B

        Issue Links

          Activity

          Hide
          Steve Ebersole added a comment -

          FWIW, an ANSI SQL compliant TRIM function would be fabulous. But even a REPLACE function would allow JPA providers to emulate the ANSI SQL TRIM functionality. Right now in Hibernate, for example, we emulate this on SQL Server and Sybase because neither support the full TRIM definition, but each do support LTRIM, RTRIM, and REPLACE which we use to achieve this emulation.

          Show
          Steve Ebersole added a comment - FWIW, an ANSI SQL compliant TRIM function would be fabulous. But even a REPLACE function would allow JPA providers to emulate the ANSI SQL TRIM functionality. Right now in Hibernate, for example, we emulate this on SQL Server and Sybase because neither support the full TRIM definition, but each do support LTRIM, RTRIM, and REPLACE which we use to achieve this emulation.
          Hide
          Daniel John Debrunner added a comment -

          I assume the purpose of the issue is to add the feature, not just sit around and "consider" it.

          Show
          Daniel John Debrunner added a comment - I assume the purpose of the issue is to add the feature, not just sit around and "consider" it.
          Hide
          Manish Khettry added a comment -

          I'm thinking of adding an optional argument to the existing LTRIM and RTRIM functions. This will be a string literal and is basically the set of chars to trim.

          RTRIM('abcxyzxx', 'zyx') => abc.

          Also a new function TRIM with the same arguments as L/R TRIM which trims both leading and trailing characters. If anyone has any thoughts let me know.

          Show
          Manish Khettry added a comment - I'm thinking of adding an optional argument to the existing LTRIM and RTRIM functions. This will be a string literal and is basically the set of chars to trim. RTRIM('abcxyzxx', 'zyx') => abc. Also a new function TRIM with the same arguments as L/R TRIM which trims both leading and trailing characters. If anyone has any thoughts let me know.
          Hide
          A B added a comment -

          > I'm thinking of adding an optional argument to the existing LTRIM and RTRIM functions. This will
          > be a string literal and is basically the set of chars to trim.

          I'm assuming this new functionality is defined by the SQL standard, but I'm not familiar enough with said standard to find it (I couldn't even find LTRIM or RTRIM). Can you point me to the relevant sections of the relevant spec? Sorry for my ignorance here...

          Show
          A B added a comment - > I'm thinking of adding an optional argument to the existing LTRIM and RTRIM functions. This will > be a string literal and is basically the set of chars to trim. I'm assuming this new functionality is defined by the SQL standard, but I'm not familiar enough with said standard to find it (I couldn't even find LTRIM or RTRIM). Can you point me to the relevant sections of the relevant spec? Sorry for my ignorance here...
          Hide
          Manish Khettry added a comment -

          Its pretty much in the description of this bug. I found this as well http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt.

          Show
          Manish Khettry added a comment - Its pretty much in the description of this bug. I found this as well http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt .
          Hide
          Bernt M. Johnsen added a comment -

          The SQL 2003 standard defines the TRIM function like this:

          <trim function> ::= TRIM <left paren> <trim operands> <right paren>
          <trim operands> ::= [ [ <trim specification> ] [ <trim character> ] FROM ] <trim source>
          <trim source> ::= <character value expression>
          <trim specification> ::= LEADING | TRAILING | BOTH
          <trim character> ::= <character value expression>

          Note that the LTRIM/RTRIM functions are defined in the JDBC spec as scalar string functions that a JDBC driver should support as escaped function calls:

          {fn LTRIM(....)}

          etc.

          I think the most proper solution here is to implement the SQL TRIM function, and keep LTRIM/RTRIM as is, as specified in the JDBC spec (for upward compatability, the escape syntax should not be required).

          Show
          Bernt M. Johnsen added a comment - The SQL 2003 standard defines the TRIM function like this: <trim function> ::= TRIM <left paren> <trim operands> <right paren> <trim operands> ::= [ [ <trim specification> ] [ <trim character> ] FROM ] <trim source> <trim source> ::= <character value expression> <trim specification> ::= LEADING | TRAILING | BOTH <trim character> ::= <character value expression> Note that the LTRIM/RTRIM functions are defined in the JDBC spec as scalar string functions that a JDBC driver should support as escaped function calls: {fn LTRIM(....)} etc. I think the most proper solution here is to implement the SQL TRIM function, and keep LTRIM/RTRIM as is, as specified in the JDBC spec (for upward compatability, the escape syntax should not be required).
          Hide
          Bernt M. Johnsen added a comment -

          BTW: This is feature E021-09 (See http://wiki.apache.org/db-derby/SQLvsDerbyFeatures )

          Show
          Bernt M. Johnsen added a comment - BTW: This is feature E021-09 (See http://wiki.apache.org/db-derby/SQLvsDerbyFeatures )
          Hide
          A B added a comment -

          Thank you, Bernt, that's what I was looking for (esp. the part about LTRIM and RTRIM).

          Show
          A B added a comment - Thank you, Bernt, that's what I was looking for (esp. the part about LTRIM and RTRIM).
          Hide
          Manish Khettry added a comment -

          Yes, I agree the better thing to do is to implement the sql standard and leave ltrim/rtrimalone.

          I don't have ready access to the sql standard so I'm a litlle unsure about the <trim character>. I'm taking it as a string literal. Looking at this page, http://www.oreilly.com/catalog/sqlnut/chapter/ch04.html, it looks like, it is interpreted as a string that needs to be removed. i.e.

          trim(leading 'ab' from 'bac') => bac
          trim(leading 'ba' from 'bac') => c

          and not as I'd thought earlier a set of characters to be removed.

          Show
          Manish Khettry added a comment - Yes, I agree the better thing to do is to implement the sql standard and leave ltrim/rtrimalone. I don't have ready access to the sql standard so I'm a litlle unsure about the <trim character>. I'm taking it as a string literal. Looking at this page, http://www.oreilly.com/catalog/sqlnut/chapter/ch04.html , it looks like, it is interpreted as a string that needs to be removed. i.e. trim(leading 'ab' from 'bac') => bac trim(leading 'ba' from 'bac') => c and not as I'd thought earlier a set of characters to be removed.
          Hide
          Bernt M. Johnsen added a comment -

          The standard says (Ch. 6.29):

          Syntax Rules:

          11) If <trim function> is specified, then
          a) Case:
          i) If FROM is specified, then:
          1) Either <trim specification> or <trim character> or both shall be specified.
          2) If <trim specification> is not specified, then BOTH is implicit.
          3) If <trim character> is not specified, then ' ' is implicit.
          ii) Otherwise, let SRC be <trim source>. TRIM ( SRC ) is equivalent to TRIM ( BOTH ' ' FROM SRC ).
          b) Case:
          i) If the declared type of <character value expression> is fixed-length character string or variablelength
          character string, then the declared type of the <trim function> is variable-length character
          string with maximum length equal to the fixed length or maximum variable length of the <trim
          source>.
          ii) Otherwise, the declared type of the <trim function> is a character large object type with maximum
          length equal to the maximum variable length of the <trim source>.
          c) If a <trim character> is specified, then <trim character> and <trim source> shall be comparable.
          d) The declared type of the <trim function> is that of the <trim source>.

          General Rules:
          9) If <trim function> is specified, then:
          a) Let S be the value of the <trim source>.
          b) If <trim character> is specified, then let SC be the value of <trim character>; otherwise, let SC be <space>.
          c) If either S or SC is the null value, then the result of the <trim function> is the null value.
          d) If the length in characters of SC is not 1 (one), then an exception condition is raised: data exception — trim error.
          e) Case:
          i) If BOTH is specified or if no <trim specification> is specified, then the result of the <trim function> is the value of S with any leading or trailing characters equal to SC removed.
          ii) If TRAILING is specified, then the result of the <trim function> is the value of S with any trailing characters equal to SC removed.
          iii) If LEADING is specified, then the result of the <trim function> is the value of S with any leading characters equal to SC removed.

          So we can safely assume that <trim character> is one character.

          Another clarification: Both <trim character> and <trim source> are <character value expression> which means they may be literals, expressions (e.g string functions, concatenations etc) or column refernces. Thus

          create table tt (v varchar(256), t varchar(10));
          select TRIM(TRAILING t FROM v);

          is legal as long as that values of t is of length 1.

          Show
          Bernt M. Johnsen added a comment - The standard says (Ch. 6.29): Syntax Rules: 11) If <trim function> is specified, then a) Case: i) If FROM is specified, then: 1) Either <trim specification> or <trim character> or both shall be specified. 2) If <trim specification> is not specified, then BOTH is implicit. 3) If <trim character> is not specified, then ' ' is implicit. ii) Otherwise, let SRC be <trim source>. TRIM ( SRC ) is equivalent to TRIM ( BOTH ' ' FROM SRC ). b) Case: i) If the declared type of <character value expression> is fixed-length character string or variablelength character string, then the declared type of the <trim function> is variable-length character string with maximum length equal to the fixed length or maximum variable length of the <trim source>. ii) Otherwise, the declared type of the <trim function> is a character large object type with maximum length equal to the maximum variable length of the <trim source>. c) If a <trim character> is specified, then <trim character> and <trim source> shall be comparable. d) The declared type of the <trim function> is that of the <trim source>. General Rules: 9) If <trim function> is specified, then: a) Let S be the value of the <trim source>. b) If <trim character> is specified, then let SC be the value of <trim character>; otherwise, let SC be <space>. c) If either S or SC is the null value, then the result of the <trim function> is the null value. d) If the length in characters of SC is not 1 (one), then an exception condition is raised: data exception — trim error. e) Case: i) If BOTH is specified or if no <trim specification> is specified, then the result of the <trim function> is the value of S with any leading or trailing characters equal to SC removed. ii) If TRAILING is specified, then the result of the <trim function> is the value of S with any trailing characters equal to SC removed. iii) If LEADING is specified, then the result of the <trim function> is the value of S with any leading characters equal to SC removed. So we can safely assume that <trim character> is one character. Another clarification: Both <trim character> and <trim source> are <character value expression> which means they may be literals, expressions (e.g string functions, concatenations etc) or column refernces. Thus create table tt (v varchar(256), t varchar(10)); select TRIM(TRAILING t FROM v); is legal as long as that values of t is of length 1.
          Hide
          Bernt M. Johnsen added a comment -

          The SQL example above should of course be:

          create table tt (v varchar(256), t varchar(10));
          select TRIM(TRAILING t FROM v) FROM tt;

          Show
          Bernt M. Johnsen added a comment - The SQL example above should of course be: create table tt (v varchar(256), t varchar(10)); select TRIM(TRAILING t FROM v) FROM tt;
          Hide
          Manish Khettry added a comment -

          I have spent several days but I cannot seem to express the grammar for trim without warnings from JavaCC. The warning says:

          [java] Warning: Choice conflict involving two expansions at
          [java] line 6268, column 5 and line 6279, column 5 respectively.
          [java] A common prefix is: "leading" "+"
          [java] Consider using a lookahead of 3 or more for earlier expansion

          I've included a skeleton of the production rules that I'm using. I'm also not sure if additiveExpression is the right production to use in this case.

          ValueNode
          ansiTrimProduction() throws StandardException :
          {
          }
          {
          <LEADING> additiveExpression(null, 0, false)

          { return null; //TODO obviously return a ValueNode when warnings are fixed. }

          LOOKAHEAD ( <LEADING> <FROM>)
          <LEADING> <FROM> additiveExpression(null,0,false)

          { return null; }
          |
          <LEADING> additiveExpression(null,0,false) <FROM> additiveExpression(null,0,false)
          { return null; }

          }

          From reading the JavaCC documentation and this tutorial (https://javacc.dev.java.net/doc/lookahead.html) it seems the parser does not which production (the first or the third) to take when it reads a token like . Using a fixed lookahead to disambiguate between the two productions is not possible in this case. For now, I'm sort of stuck at this point. Any ponters would be appreciated.

          Show
          Manish Khettry added a comment - I have spent several days but I cannot seem to express the grammar for trim without warnings from JavaCC. The warning says: [java] Warning: Choice conflict involving two expansions at [java] line 6268, column 5 and line 6279, column 5 respectively. [java] A common prefix is: "leading" "+" [java] Consider using a lookahead of 3 or more for earlier expansion I've included a skeleton of the production rules that I'm using. I'm also not sure if additiveExpression is the right production to use in this case. ValueNode ansiTrimProduction() throws StandardException : { } { <LEADING> additiveExpression(null, 0, false) { return null; //TODO obviously return a ValueNode when warnings are fixed. } LOOKAHEAD ( <LEADING> <FROM>) <LEADING> <FROM> additiveExpression(null,0,false) { return null; } | <LEADING> additiveExpression(null,0,false) <FROM> additiveExpression(null,0,false) { return null; } } From reading the JavaCC documentation and this tutorial ( https://javacc.dev.java.net/doc/lookahead.html ) it seems the parser does not which production (the first or the third) to take when it reads a token like . Using a fixed lookahead to disambiguate between the two productions is not possible in this case. For now, I'm sort of stuck at this point. Any ponters would be appreciated.
          Hide
          Andrew McIntyre added a comment -

          Hi Manish,

          I took a stab at implementing the TRIM grammar that Bernt described. This generates a valid grammar via javacc, but only because the call to columnReference in the new characterValueExpression is commented out. Uncommented, that generates the following error:

          [java] Warning: Choice conflict involving two expansions at
          [java] line 6183, column 5 and line 6188, column 5 respectively.
          [java] A common prefix is: "ucase"
          [java] Consider using a lookahead of 2 for earlier expansion.

          so some work would be needed to disambiguate characterValueFunction and columnReference.

          And of course, execution is an entirely separate matter, but I thought this might give you some food for thought.

          Show
          Andrew McIntyre added a comment - Hi Manish, I took a stab at implementing the TRIM grammar that Bernt described. This generates a valid grammar via javacc, but only because the call to columnReference in the new characterValueExpression is commented out. Uncommented, that generates the following error: [java] Warning: Choice conflict involving two expansions at [java] line 6183, column 5 and line 6188, column 5 respectively. [java] A common prefix is: "ucase" [java] Consider using a lookahead of 2 for earlier expansion. so some work would be needed to disambiguate characterValueFunction and columnReference. And of course, execution is an entirely separate matter, but I thought this might give you some food for thought.
          Hide
          Andrew McIntyre added a comment -

          Adding LOOKAHEAD(2) before characterValueFunction in characterValueExpression does resolve the ambiguous 'ucase', reattaching my parser diff with that lookahead added.

          Show
          Andrew McIntyre added a comment - Adding LOOKAHEAD(2) before characterValueFunction in characterValueExpression does resolve the ambiguous 'ucase', reattaching my parser diff with that lookahead added.
          Hide
          Manish Khettry added a comment -

          Andrew

          Thanks for taking the time to take a look at this. I took your grammar and wrote some unit tests for it (only the parsing part). I've attached the Test. As you can see, the characterValueExpression production does not cover all the cases which can apply to trimCharacter or trimSource.

          Second, I was trying to make the tokens LEADING/TRAILING/BOTH as non reserved keyword. I'm not sure if that is going possible. For example in a query like:

          select LEADING from col.

          is LEADING a trimspec or is it a column name? The problem with making these reserved keywords is that it may break existing code.

          I will spend more time when its available to working on this-- if anyone else has thoughts on how to come up with an unambiguous grammar, please do jump in.

          Show
          Manish Khettry added a comment - Andrew Thanks for taking the time to take a look at this. I took your grammar and wrote some unit tests for it (only the parsing part). I've attached the Test. As you can see, the characterValueExpression production does not cover all the cases which can apply to trimCharacter or trimSource. Second, I was trying to make the tokens LEADING/TRAILING/BOTH as non reserved keyword. I'm not sure if that is going possible. For example in a query like: select LEADING from col. is LEADING a trimspec or is it a column name? The problem with making these reserved keywords is that it may break existing code. I will spend more time when its available to working on this-- if anyone else has thoughts on how to come up with an unambiguous grammar, please do jump in.
          Hide
          Manish Khettry added a comment -

          Unit test for parsing trim expressions.

          Show
          Manish Khettry added a comment - Unit test for parsing trim expressions.
          Hide
          Andrew McIntyre added a comment -

          Hi Manish,

          SQL99 does list LEADING, TRAILING, and BOTH as reserved words. Also, I don't think that making them non-reserved keywords is going to be possible with the current parser implementation, since if you place them in the non-reserved list, you will necessarily get a conflict between the first token (in my suggestion, from ansiTrimType()) and the expansion of the non-optional token - in my example, characterValueExpression() - which can be a columnReference and thus just an identifier. The rule for identifiers currently includes the list of non-reserved keywords, which I suspect would lead to a conflict in any implementation which checks non-reserved keywords against identifiers. While there is a possibility for impact to existing applications, I'm wondering how serious the impact would be since the SQL99 standard has the keywords in question listed as reserved.

          As for the case expression not working, it doesn't appear that SQL99 includes case expressions in <character value expression> or <character value function>, but it is very likely that I'm missing something in the spec. If such expressions in column references are allowed, and this is something other databases support, we should give it consideration.

          I'd love to hear more feedback, and I'll dig some more into the standards to see what I can find.

          Show
          Andrew McIntyre added a comment - Hi Manish, SQL99 does list LEADING, TRAILING, and BOTH as reserved words. Also, I don't think that making them non-reserved keywords is going to be possible with the current parser implementation, since if you place them in the non-reserved list, you will necessarily get a conflict between the first token (in my suggestion, from ansiTrimType()) and the expansion of the non-optional token - in my example, characterValueExpression() - which can be a columnReference and thus just an identifier. The rule for identifiers currently includes the list of non-reserved keywords, which I suspect would lead to a conflict in any implementation which checks non-reserved keywords against identifiers. While there is a possibility for impact to existing applications, I'm wondering how serious the impact would be since the SQL99 standard has the keywords in question listed as reserved. As for the case expression not working, it doesn't appear that SQL99 includes case expressions in <character value expression> or <character value function>, but it is very likely that I'm missing something in the spec. If such expressions in column references are allowed, and this is something other databases support, we should give it consideration. I'd love to hear more feedback, and I'll dig some more into the standards to see what I can find.
          Hide
          Manish Khettry added a comment -

          Andrew,

          I am going by snippets of the spec posted in this bug. I did take <character value expression> to mean any expression that yields a char/varchar/clob value. I also tried a few queries in mysql which did accept case and other expressions. Does DB2 accept them as well?

          Show
          Manish Khettry added a comment - Andrew, I am going by snippets of the spec posted in this bug. I did take <character value expression> to mean any expression that yields a char/varchar/clob value. I also tried a few queries in mysql which did accept case and other expressions. Does DB2 accept them as well?
          Hide
          Andrew McIntyre added a comment -

          Hi Manish,

          You are of course correct, any expression that returns a character data type is allowed. I just couldn't find it in the few minutes that I took last night to look in the spec. Found it today, the relevant portion is:

          <character value expression> ::=
          <concatenation>

          <character factor>

          <character factor> ::=
          <character primary> [ <collate clause> ]

          <character primary> ::=
          <value expression primary>

          <string value function>

          and valueExpressionPrimary() in the parser covers everything necessary here. I expected to be able to replace characterValueExpression() with:

          ValueNode
          characterValueExpression() throws StandardException:

          { ValueNode expr; }

          {
          LOOKAHEAD(2)
          expr = valueExpressionPrimary(false)

          { return expr; }
          |
          expr = characterValueFunction()
          { return expr; }

          }

          JavaCC does not complain about this, but generates an invalid grammar: you get a compile error because JavaCC split the parse tree for the subquery choice of valueExpressionPrimary but failed to push the inSelectClause boolean that was a part of the valueExpressionPrimary declaration into the new node. JavaCC bug? Not sure, i'll investigate later.

          Show
          Andrew McIntyre added a comment - Hi Manish, You are of course correct, any expression that returns a character data type is allowed. I just couldn't find it in the few minutes that I took last night to look in the spec. Found it today, the relevant portion is: <character value expression> ::= <concatenation> <character factor> <character factor> ::= <character primary> [ <collate clause> ] <character primary> ::= <value expression primary> <string value function> and valueExpressionPrimary() in the parser covers everything necessary here. I expected to be able to replace characterValueExpression() with: ValueNode characterValueExpression() throws StandardException: { ValueNode expr; } { LOOKAHEAD(2) expr = valueExpressionPrimary(false) { return expr; } | expr = characterValueFunction() { return expr; } } JavaCC does not complain about this, but generates an invalid grammar: you get a compile error because JavaCC split the parse tree for the subquery choice of valueExpressionPrimary but failed to push the inSelectClause boolean that was a part of the valueExpressionPrimary declaration into the new node. JavaCC bug? Not sure, i'll investigate later.
          Hide
          Andrew McIntyre added a comment -

          The compile error I mentioned in the last comment may be a JavaCC bug, attaching a simplified grammar which reproduces the compile error mentioned in the previous comment. It takes some effort to be able to subscribe / post to their mailing list, so it might be a few days before I can confirm with them this is really a JavaCC bug - or, more likely, my bad grammar - and work with them on a workaround or fix.

          Show
          Andrew McIntyre added a comment - The compile error I mentioned in the last comment may be a JavaCC bug, attaching a simplified grammar which reproduces the compile error mentioned in the previous comment. It takes some effort to be able to subscribe / post to their mailing list, so it might be a few days before I can confirm with them this is really a JavaCC bug - or, more likely, my bad grammar - and work with them on a workaround or fix.
          Hide
          Andrew McIntyre added a comment -

          Posted to the javacc users list concerning the compile error when using semantic lookahead. I'll copy any feedback I get here.

          Show
          Andrew McIntyre added a comment - Posted to the javacc users list concerning the compile error when using semantic lookahead. I'll copy any feedback I get here.
          Hide
          Manish Khettry added a comment -

          I think this grammar would work. What do you think Andrew?

          Show
          Manish Khettry added a comment - I think this grammar would work. What do you think Andrew?
          Hide
          Andrew McIntyre added a comment -

          Hi Manish,

          Sorry I didn't have time to come back to this sooner. Your new grammar looks great. It avoids the multiple levels of lookahead into valueExpressionPrimary that led to the compile error that I was seeing, and I tried throwing numerous malformed and properly formed trim statements at it and everything worked pretty much as expected, including statements with dynamic parameters and subqueries. I didn't come close to trying out all the possible paths through valueExpressionPrimary, but all the obvious ones were fine.

          I did notice that the following did not parse:

          select v from t where TRIM(v) = TRIM(c)

          where other similar statements such as:

          select v from t where TRIM(v) = c
          select v from t where v = TRIM(LEADING FROM c)

          parsed ok. I haven't looked at it any further than throwing a bunch of different statements at it, this might be something to come back and investigate at a later date.

          Show
          Andrew McIntyre added a comment - Hi Manish, Sorry I didn't have time to come back to this sooner. Your new grammar looks great. It avoids the multiple levels of lookahead into valueExpressionPrimary that led to the compile error that I was seeing, and I tried throwing numerous malformed and properly formed trim statements at it and everything worked pretty much as expected, including statements with dynamic parameters and subqueries. I didn't come close to trying out all the possible paths through valueExpressionPrimary, but all the obvious ones were fine. I did notice that the following did not parse: select v from t where TRIM(v) = TRIM(c) where other similar statements such as: select v from t where TRIM(v) = c select v from t where v = TRIM(LEADING FROM c) parsed ok. I haven't looked at it any further than throwing a bunch of different statements at it, this might be something to come back and investigate at a later date.
          Hide
          Manish Khettry added a comment -

          Thanks Andrew-- if you have those queries saved up, it would be a good starting point for writing tests. BTW, I did try this query and it came back with a NPE which is a good thing

          ij> select v from t where TRIM(v) = TRIM(c)
          ;
          ERROR XJ001: Java exception: ': java.lang.NullPointerException'.

          Show
          Manish Khettry added a comment - Thanks Andrew-- if you have those queries saved up, it would be a good starting point for writing tests. BTW, I did try this query and it came back with a NPE which is a good thing ij> select v from t where TRIM(v) = TRIM(c) ; ERROR XJ001: Java exception: ': java.lang.NullPointerException'.
          Hide
          Manish Khettry added a comment -

          I apologize for not being able to wrap this up sooner-- work has taken up more than 40 hours of time these last few weeks.

          trim,leading,trailing,both are now reserved keywords.

          Show
          Manish Khettry added a comment - I apologize for not being able to wrap this up sooner-- work has taken up more than 40 hours of time these last few weeks. trim,leading,trailing,both are now reserved keywords.
          Hide
          Manish Khettry added a comment -

          The bugfix is in the attachment, 1623.patch.txt

          Show
          Manish Khettry added a comment - The bugfix is in the attachment, 1623.patch.txt
          Hide
          A B added a comment -

          Hi Manish,

          Thank your for your work on this. I did a review of the changes and have the following comments/questions, in no particular order:

          1. Following fails with an NPE. SQL standard says the result should be NULL (General Rules 9.c as posted by Bernt above).

          create table ts (c char(1), vc varchar(10));
          insert into ts values (default, default);

          – Note "c" here is null.
          select trim(c from ' hmm ') from ts;

          – And similarly...
          select trim ((values cast (null as char(1))) from vc) from ts;

          2. Error message 22020 says that the trim string cannot be null, but the SQL standard allows it to be (per General Rules 9.c noted above). Were you intentionally planning to enforce this restriction, or is this just a typo? If it was not intentional, then I think the corresponding text can be removed the the error message.

          3. There is an extra space between "string" and "must" in the error message for 22020:

          ERROR 22020: Invalid trim string, ''. The trim string must be exactly one character. It
          cannot be a null or more than one character.

          4. The SQL 2003 standard indicates that TRIM is a reserved keyword, and one of your previous comments says that, as well. However, I don't see "TRIM" in the list of reserved keywords in sqlgrammar.jj. There is a code comment which says:

          /* NOTE - If you add a keyword, then you must add it to reservedKeyword()

          • or nonReservedKeyword() as well!
            */

          and I see that "TRIM" has been added as a keyword with your patch, but I don't see it listed in either reservedKeyword() nor in nonReservedKeyword(). I also noticed that this patch adds "TRIM" to the "miscBuiltins()" method, which means that it can be used (in some form) with the JDBC escape syntax, ex:

          values

          { fn trim(' a ') }

          ;

          Was that intentional? If so, is this function defined in the JDBC spec like LTRIM and RTRIM are, and does ANSI behavior satisfy the JDBC requirements? (sorry for my ignorance here).

          6. Is the "trim(...)" method on StringDataValue still used anywhere after these changes? It looks like you replaced it with "ansiTrim", though I could of course be overlooking something. If the old "trim" method is no longer used, would it make sense to remove the code from SQLChar?

          7. I'm not entirely sure, but I think that use of statements like the following in JUnit tests is not ideal:

          rs = prepareStatement(sql).executeQuery();

          The reason is (I think) that the PreparedStatement object is never explicitly closed. There have been situations in the past where failure to close a statement leads to problems in the JUnit suite. I can't say what the specifics are, but if possible I think it would be better to explicitly assign the prepared statement and close it when appropriate. Fortunately there are only two such cases in the new test, so this should be pretty easy to change.

          8. Do you have any plans to document this new function? It might be good to create a subtask for tracking, so that the documentation can be completed at some point (even if it's not complete for 10.3, it would be good to have a Jira so that we don't lose track of it).

          I ran some simple tests by hand and everything seems to work (with the exception of the NPE noted above). I did not, however, run either of the nightly suites. Did you run derbyall and suites.All with this patch applied, and did everything run cleanly?

          Thanks again for your time with this feature!

          Show
          A B added a comment - Hi Manish, Thank your for your work on this. I did a review of the changes and have the following comments/questions, in no particular order: 1. Following fails with an NPE. SQL standard says the result should be NULL (General Rules 9.c as posted by Bernt above). create table ts (c char(1), vc varchar(10)); insert into ts values (default, default); – Note "c" here is null. select trim(c from ' hmm ') from ts; – And similarly... select trim ((values cast (null as char(1))) from vc) from ts; 2. Error message 22020 says that the trim string cannot be null, but the SQL standard allows it to be (per General Rules 9.c noted above). Were you intentionally planning to enforce this restriction, or is this just a typo? If it was not intentional, then I think the corresponding text can be removed the the error message. 3. There is an extra space between "string" and "must" in the error message for 22020: ERROR 22020: Invalid trim string, ''. The trim string must be exactly one character. It cannot be a null or more than one character. 4. The SQL 2003 standard indicates that TRIM is a reserved keyword, and one of your previous comments says that, as well. However, I don't see "TRIM" in the list of reserved keywords in sqlgrammar.jj. There is a code comment which says: /* NOTE - If you add a keyword, then you must add it to reservedKeyword() or nonReservedKeyword() as well! */ and I see that "TRIM" has been added as a keyword with your patch, but I don't see it listed in either reservedKeyword() nor in nonReservedKeyword(). I also noticed that this patch adds "TRIM" to the "miscBuiltins()" method, which means that it can be used (in some form) with the JDBC escape syntax, ex: values { fn trim(' a ') } ; Was that intentional? If so, is this function defined in the JDBC spec like LTRIM and RTRIM are, and does ANSI behavior satisfy the JDBC requirements? (sorry for my ignorance here). 6. Is the "trim(...)" method on StringDataValue still used anywhere after these changes? It looks like you replaced it with "ansiTrim", though I could of course be overlooking something. If the old "trim" method is no longer used, would it make sense to remove the code from SQLChar? 7. I'm not entirely sure, but I think that use of statements like the following in JUnit tests is not ideal: rs = prepareStatement(sql).executeQuery(); The reason is (I think ) that the PreparedStatement object is never explicitly closed. There have been situations in the past where failure to close a statement leads to problems in the JUnit suite. I can't say what the specifics are, but if possible I think it would be better to explicitly assign the prepared statement and close it when appropriate. Fortunately there are only two such cases in the new test, so this should be pretty easy to change. 8. Do you have any plans to document this new function? It might be good to create a subtask for tracking, so that the documentation can be completed at some point (even if it's not complete for 10.3, it would be good to have a Jira so that we don't lose track of it). I ran some simple tests by hand and everything seems to work (with the exception of the NPE noted above). I did not, however, run either of the nightly suites. Did you run derbyall and suites.All with this patch applied, and did everything run cleanly? Thanks again for your time with this feature!
          Hide
          Manish Khettry added a comment -

          Thanks for reviewing the code. All your points are valid and 1 is a pretty egregious mistake on my part. As far as 4 goes-- yes, it does have to be in reservedKeyword but is there a problem if its in miscBuiltinsCore? I needed to add the token there because the ansi trim is parsed inside the characterValueFunction production which in turn is under the miscBuiltinsCore production. Infact this comment in miscBuiltinsCore says:

          miscBuiltinsCore() are the core

          • system, string and numeric functions.
          • NOTE: date functions not currently considered
          • core for purposes of the grammar since
          • they can only be escaped when they appear
          • as functions (not special registers).

          So is trim not a core function?

          Show
          Manish Khettry added a comment - Thanks for reviewing the code. All your points are valid and 1 is a pretty egregious mistake on my part. As far as 4 goes-- yes, it does have to be in reservedKeyword but is there a problem if its in miscBuiltinsCore? I needed to add the token there because the ansi trim is parsed inside the characterValueFunction production which in turn is under the miscBuiltinsCore production. Infact this comment in miscBuiltinsCore says: miscBuiltinsCore() are the core system, string and numeric functions. NOTE: date functions not currently considered core for purposes of the grammar since they can only be escaped when they appear as functions (not special registers). So is trim not a core function?
          Hide
          A B added a comment -

          > So is trim not a core function?

          Sorry, I think I was confusing myself here. The issue I was trying to point out is that the patch exposes the TRIM function through the JDBC escaped function syntax, and I was wondering if that's correct--i.e. is TRIM actually a JDBC escaped function, or is just an SQL function?

          But it looks like this is an existing problem in the Derby handling of functions, per DERBY-591. So technically that's a different issue. It'd be nice if we could somehow avoid adding "TRIM" to the list of functions that are already incorrectly exposed via JDBC escape syntax. If, however, you feel like that's the out of the scope of this issue, then perhaps you could just add a comment to DERBY-591 indicating that "TRIM" is now part of that list, as well?

          Show
          A B added a comment - > So is trim not a core function? Sorry, I think I was confusing myself here. The issue I was trying to point out is that the patch exposes the TRIM function through the JDBC escaped function syntax, and I was wondering if that's correct--i.e. is TRIM actually a JDBC escaped function, or is just an SQL function? But it looks like this is an existing problem in the Derby handling of functions, per DERBY-591 . So technically that's a different issue. It'd be nice if we could somehow avoid adding "TRIM" to the list of functions that are already incorrectly exposed via JDBC escape syntax. If, however, you feel like that's the out of the scope of this issue, then perhaps you could just add a comment to DERBY-591 indicating that "TRIM" is now part of that list, as well?
          Hide
          Manish Khettry added a comment -

          Followup patch. Addresses all the issues raised by A. B. Ran derbylang and all lang junit tests.

          Show
          Manish Khettry added a comment - Followup patch. Addresses all the issues raised by A. B. Ran derbylang and all lang junit tests.
          Hide
          A B added a comment -

          > Addresses all the issues raised by A. B.

          Thanks for the updated patch, Manish. It looks good to me, with the exception of some minor formatting glitches (lines longer than 80 chars and tab/space mixups) that I can perhaps fix before committing.

          There are still two questions that have yet to be answered:

          6. Is the "trim(...)" method on StringDataValue still used anywhere after these changes?
          8. Do you have any plans to document this new function?

          Also, if this is going into 10.3 then I think it should be listed on the "buddy testing" wiki page for 10.3, correct?

          http://wiki.apache.org/db-derby/TenThreeBuddyTesting

          Show
          A B added a comment - > Addresses all the issues raised by A. B. Thanks for the updated patch, Manish. It looks good to me, with the exception of some minor formatting glitches (lines longer than 80 chars and tab/space mixups) that I can perhaps fix before committing. There are still two questions that have yet to be answered: 6. Is the "trim(...)" method on StringDataValue still used anywhere after these changes? 8. Do you have any plans to document this new function? Also, if this is going into 10.3 then I think it should be listed on the "buddy testing" wiki page for 10.3, correct? http://wiki.apache.org/db-derby/TenThreeBuddyTesting
          Hide
          Manish Khettry added a comment -

          Sorry, I guess not all issues were addressed.

          re: space vs tabs-- the new file AnsiTrimTest is all spaces I think. Please do not put any tabs in there!
          6-- trim in SQLChar/StringDataValue is not being generated by the code generator anymore. However I do not know if there are existing triggers or query plans on disk which refer to it so I did not remove it. If upgrade is not an issue then by all means the trim function should go away or ansiTrim should just be named trim.
          8. I hadn't thought of documentation but I could take a stab at it or atleast open a subtask for it.
          If it does make it to 10.3, then it should be buddy tested.

          Show
          Manish Khettry added a comment - Sorry, I guess not all issues were addressed. re: space vs tabs-- the new file AnsiTrimTest is all spaces I think. Please do not put any tabs in there! 6-- trim in SQLChar/StringDataValue is not being generated by the code generator anymore. However I do not know if there are existing triggers or query plans on disk which refer to it so I did not remove it. If upgrade is not an issue then by all means the trim function should go away or ansiTrim should just be named trim. 8. I hadn't thought of documentation but I could take a stab at it or atleast open a subtask for it. If it does make it to 10.3, then it should be buddy tested.
          Hide
          A B added a comment -

          Thank you for answering all of my question, Manish. I committed d1623_v3.patch with svn # 546183:

          URL: http://svn.apache.org/viewvc?view=rev&rev=546183

          The _v3 patch is the same as your latest patch, but with some minor formatting changes.

          Thanks for adding this feature!

          Please note: as a regular contributor to Derby we are hoping that you will take the time to complete an ICLA, per Myrna's email here:

          http://article.gmane.org/gmane.comp.apache.db.derby.devel/43960

          The sooner you are able to do so, the better...

          Show
          A B added a comment - Thank you for answering all of my question, Manish. I committed d1623_v3.patch with svn # 546183: URL: http://svn.apache.org/viewvc?view=rev&rev=546183 The _v3 patch is the same as your latest patch, but with some minor formatting changes. Thanks for adding this feature! Please note: as a regular contributor to Derby we are hoping that you will take the time to complete an ICLA, per Myrna's email here: http://article.gmane.org/gmane.comp.apache.db.derby.devel/43960 The sooner you are able to do so, the better...
          Hide
          A B added a comment -

          Marking issue as RESOLVED as the patch has been committed and tinderbox results ran cleanly. Note that the 10.3 deadline is two days away, so if you are still planning to document this issue for 10.3 (DERBY-2794), it might be good to start sooner rather than later...

          Show
          A B added a comment - Marking issue as RESOLVED as the patch has been committed and tinderbox results ran cleanly. Note that the 10.3 deadline is two days away, so if you are still planning to document this issue for 10.3 ( DERBY-2794 ), it might be good to start sooner rather than later...

            People

            • Assignee:
              Manish Khettry
              Reporter:
              Emmanuel Bernard
            • Votes:
              2 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development