Issue Details (XML | Word | Printable)

Key: DERBY-7
Type: Bug Bug
Status: Closed Closed
Resolution: Fixed
Priority: Minor Minor
Assignee: Mamta A. Satoor
Reporter: Tulika Agrawal
Votes: 6
Watchers: 5
Operations

If you were logged in you would be able to see more operations.
Derby

Bug in NULLIF Function

Created: 28/Sep/04 12:44 AM   Updated: 17/Oct/06 05:44 PM
Component/s: SQL
Affects Version/s: 10.0.2.0
Fix Version/s: 10.1.3.1, 10.2.1.6

Time Tracking:
Not Specified

File Attachments:
  Size
Text File Licensed for inclusion in ASF works Derby-7.txt 2005-01-10 04:57 PM Amit Handa 1 kB
File Licensed for inclusion in ASF works DERBY-903-changes-for-10.1-branch.diff 2006-05-05 07:45 PM Bernt M. Johnsen 0.8 kB
Text File Licensed for inclusion in ASF works derby-nullif.patch 2005-01-19 12:57 AM Jeremy Boynes 22 kB
Text File Licensed for inclusion in ASF works Derby7NullIf061005svndiff.txt 2005-06-11 05:59 AM Mamta A. Satoor 165 kB
Text File Licensed for inclusion in ASF works derby7Nullif080205svndiff.txt 2005-08-03 06:44 AM 166 kB
Issue Links:
Reference
 

Resolution Date: 05/May/06 07:48 PM


 Description  « Hide
Reporting for Christian d'Heureuse, filed on derby-dev list.

The NULLIF built-in function of Cloudscape 10.0.1.0 beta seems to accept
only string values.

Examples:

 values nullif('a','b');
 --> OK

 values nullif(1,2);
 --> Error message: "ERROR 42X89: Types 'CHAR' and
     'INTEGER' are not type compatible. (Neither type
     is assignable to the other type.)"


 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Amit Handa added a comment - 21/Dec/04 09:20 AM

The following needs to be commented out in ConditionalNode.bindExpression() as a solution for NULLIF to accept INT
In trunk/java/engine/org/pacahe/derby/impl/sql/compile/ConditionalNode.java



/* ALL COMMENTED OUT BELOW, NO NEED FOR FOLLOWING VALIDATION BETWEEN
         * thenExpression and elseExpression


        if (! thenExpression.getTypeCompiler().
comparable(elseExpression.getTypeId(), false, getClassFactory()) &&
! cu.assignableTo(thenExpression.getTypeId().getCorrespondingJavaTypeName(),
elseExpression.getTypeId().getCorrespondingJavaTypeName()) &&
! cu.assignableTo(elseExpression.getTypeId().getCorrespondingJavaTypeName(),
thenExpression.getTypeId().getCorrespondingJavaTypeName()))
{
throw StandardException.newException(SQLState.LANG_NOT_TYPE_COMPATIBLE,
thenExpression.getTypeId().getSQLTypeName(),
elseExpression.getTypeId().getSQLTypeName()
);

}
        * ALL COMMENTED OUT ABOVE
        */


Also I have checked various cases where the following code if it is removed may cause problems.
The cases are:-

1. An int and a char e.g. NULLIF(1,'pqr'); where pqr is a generic string and not a name of column.
It returns the message ERROR 42818: Comparisons between 'INTEGER' and 'CHAR' are not supported.

2. An int and a column_name passed. If column_type and other expression are of same type, it works. If type of column_name passed and the other expression are of differrent types it fails with a proper message ERROR 42818: Comparisons between 'INTEGER' and 'CHAR' are not supported.
i.e it works between type compatible expressions.

I will submit a patch soon.

Kindly verify and post any comments to this proposed solution.

Christian d'Heureuse added a comment - 21/Dec/04 10:56 AM
After commenting out these lines in ConditionalNode.bindExpression(), what data types are returned from the following expressions?

 nullif(1,2)
 nullif(cast(null as int),2)

These expressions should return INTEGER and not CHAR!

Amit Handa added a comment - 22/Dec/04 10:12 AM
It returns for

nullif(1,2) --> returns 1

nullif(cast(null as int),2) --> returns 0

Also it returns an int and not as a char. I took nullif() as one column of
select statement and it ResultSetMetaData returns java.sql.Types.INTEGER.

Further Oracle works in same way.







Christian d'Heureuse added a comment - 22/Dec/04 01:49 PM
> nullif(cast(null as int),2) --> returns 0

Should return 2?

Christian d'Heureuse added a comment - 22/Dec/04 03:23 PM
Sorry, ignore my previous comment. I had mistaken nullif() with isnull/coalesce().

The expression
 values nullif(1,1)
should return the value
 cast(null as int)

In sqlgrammar.jj there is the following comment:
// "NULLIF(L, R)" is the same as "L=R ? CAST(NULL AS CHAR(1)) : L"

I think the problem is:
 thenExpression = CAST(NULL AS CHAR(1)) = type CHAR
 elseExpression = L = type INT
==> CHAR and INT are not compatible.

To solve this problem, the data type of L (leftExpression) could be used to build the NULL constant node, instead of CHAR(1).

Satheesh Bandaram added a comment - 22/Dec/04 07:14 PM
Good observations... The defect has been around for sometime, but I think suggested fix by commenting out code in ConditionalNode is incorrect. If I remember right, that caused other issues.

Since there has been some interest in fixing this defect, I can volunteer to look at possible fix. Christian suggestion seems to be in the right direction, but I need to do more research.

Amit Handa added a comment - 23/Dec/04 02:05 AM
I don't think so I have seen anything on issues with Nullif on the mailing lists.
Can you elaborate more on what are the issues ?


Amit Handa added a comment - 27/Dec/04 10:14 AM
Before I comment on Christian's note, we need to read what nullif definition says at link below :-
http://incubator.apache.org/derby/manuals/reference/sqlj66.html#IDX910

Quoting from the above link is the next one line

For built-in types, this means that the types must be the same or a built-in broadening conversion must exist between the types.

So if we have say a CHAR and an INT to compare in NULLIF, the CHAR('1') will have a built in broadening and Hence get converted to INT(1)


> The expression
> values nullif(1,1)
> should return the value
> cast(null as int)

I agree with Christian here.

> In sqlgrammar.jj there is the following comment:
> // "NULLIF(L, R)" is the same as "L=R ? CAST(NULL AS CHAR(1)) : L"

Also to note was the next line after that in the same file, which says

// An impl assumption here is that Cloudscape can promote CHAR to any comparable datatypes such as numeric


> I think the problem is:
> thenExpression = CAST(NULL AS CHAR(1)) = type CHAR
> elseExpression = L = type INT
> ==> CHAR and INT are not compatible.

I disagree here. Rather it is this way

thenExpression = CAST(NULL AS CHAR(1)) = type CHAR
 elseExpression = L = type INT
check type of INT and CHAR, Since not compatible,
promote CHAR to INT(built in broadening),
compare INT and INT and do the comparison
==> return INT

Note that
CAST(NULL as Char(1)) WIll return CHAR

BUT in below
NULLIF(CAST(NULL as CHAR(1),1)

the CAST(NULL as CHAR(1) still returns 1 as CHAR
but gets converted to INT
as soon as it has to be compared to an INT.



>To solve this problem, the data type of L (leftExpression) could be used to build the NULL >constant node, instead of CHAR(1).

That is what is happenning, I hope the above helps in clarifying that
else I can explain that using code from ConditionalNode.bindExpression()






Christian d'Heureuse added a comment - 27/Dec/04 03:00 PM
I agree with the comments of Amit.

The following code in ConditionalNode.bindExpression() determines the result type of NULLIF():

/*
** Set the result type of this conditional to be the dominant type
** of the result expressions.
*/
setType(thenElseList.getDominantTypeServices());

DataTypeDescriptor.getDominantType() returns the type with the higher typePrecedence value.
The typePrecedence values for CHAR and INT are defined in TypeId.java:
 CHAR_PRECEDENCE = 0
 INT_PRECEDENCE = 50

So the "dominant" type between CHAR(1) and INT is INT.
The result type of NULLIF() is set to INT.
At the end of bindExpression(), a CAST node is inserted into thenExpression, to convert the NULL constant from CHAR to INT.

---

In ConditionalNode.bindExpression() there is the following comment:

/*
** If it is comparable, then we are ok. Note that we
** could in fact allow any expressions that are convertible()
** since we are going to generate a cast node, but that might
** be confusing to users...
*/

Maybe this explains the reason, why the type compatibility check (the one that Amit suggests to comment out) was added to bindExpression(): It "might be confusing to users..."?

To prevent this "user confusion", the term "built-in broadening conversion" should be clarified.
What is a "broadening conversion"? Is the return value type of NULLIF() the "broadened type"?
But then, why is a conversion from CHAR to INT a "broadening conversion"? CHAR would be the "broader" type than INT. (All INT values can be converted to CHAR, but not all CHAR values can be converted to INT.)
Or is the return type the "narrower" type?

An better solution would be to define that the result type of NULLIF() is the "dominant" type, which is the one that comes first in the following list:
USER-defined, BLOB, LONGVARBIT, VARBIT, BIT, BOOLEAN, TIME, TIMESTAMP, DATE, DOUBLE, REAL, DECIMAL, NUMERIC, LONGINT, INT, SMALLINT, TINYINT, REF, NATIONAL_LONGVARCHAR, NATIONAL_VARCHAR, NATIONAL_CHAR, CLOB, NCLOB, LONGVARCHAR, VARCHAR, CHAR.
For DECIMAL/NUMERIC types, scale and precision are broadened to matche both input types.

Satheesh Bandaram added a comment - 27/Dec/04 09:55 PM
There was a related bug that Derby tried to address before going OPEN SOURCE.

ij> values (case when 1.0=1.0 then null else 10 end);
ERROR 42X89: Types 'CHAR' and 'INTEGER' are not type compatible. (Neither type i
s assignable to the other type.)

The above statement should have returned NULL instead. Don't remember who worked on that bug, but I might have mistaken this NULLIF issue with that one. I think issues here may be slightly different...

Christian d'Heureuse added a comment - 27/Dec/04 10:45 PM
I think this is the same problem, because NULLIF() and CASE (whenThenExpression) both are implemented using CONDITIONAL_NODE.

But for CASE, the error can be avoided by using "CAST(NULL as INT)" instead of an untyped NULL:

  values (case when 1.0=1.0 then cast(null as int) else 10 end);

For NULLIF() this is not possible.

Jeremy Boynes added a comment - 28/Dec/04 01:05 AM
According to the SQL spec (ISO/IEC 9075-2:2003)

NULLIF (V1, V2) is equivalent to the following <case specification>:
CASE WHEN V1=V2 THEN NULL ELSE V1 END

and

The declared type of a <case specification> is determined by applying Subclause 9.3, "Data types of results of aggregations", to the declared types of all <result expression>s in the <case specification>.

9.3 is long but basically says that the type of the result will be the type of V1 as the type of NULL is undefined.

As part of this V1 and V2 must be comparable. Therefore, as I read it, NULLIF('a', 1) should raise an exception as 'a' cannot be converted to a numeric but NULLIF('1', 1) should return NULL with type CHAR(1) and NULLIF(1, '1') should return NULL with type INTEGER. Of course the type of NULL is fairly meaningless but it would have an impact if we were referencing columns instead.

Amit Handa added a comment - 28/Dec/04 08:46 AM
I think we agree on one thing for nullif(l,r),
the datatype of what is returned will be of type l.

Also

 The expression
 values nullif(1,1)
 should return the value
 cast(null as int)

gives the same result as output.

Some other things to note

1. nullif('a','b');
2. nullif(1,2);

3. nullif('a', 1);
4. nullif(1, 'a');
5. nullif(CAST(NULL as Char(1)), 1)
6. nullif(1, CAST(NULL as Char(1)))

Only 1 and 2 work with the fix, rest all fail.

So kindly let me know if you guys approve the fix.
If not, kindly update what is missing which needs to be fixed or is not being fixed with the patch proposed.

Please do vote in favour or against with some justification.

Christian d'Heureuse added a comment - 28/Dec/04 05:00 PM
I approve the fix and I agree that NULLIF(L,R) should return the data type of L (and does so with the fix).

But the proposed fix also affects the CASE expression.

 CASE WHEN BooleanExpression THEN thenExpression ELSE elseExpression END

The effect of the fix is that CASE accepts values that are not "type compatible" (for thenExpression and elseExpression), e.g. CHAR and INT.
CASE returns the "dominant" type of thenExpression and elseExpression (see DataTypeDescriptor.getDominantType()). The "dominant" type is the one that comes first in the list {USER-defined, BLOB, LONGVARBIT, ... CHAR} (according to the typePrecedence constants in TypeId.java). For DECIMAL/NUMERIC types, scale and precision are broadened to matche both input types.

The documentation of NULLIF and CASE in the reference manual should be complemented.
see http://incubator.apache.org/derby/manuals/reference/sqlj66.html
NULLIF is currently not explained at all.
For CASE, the formulation "a built-in broadening conversion must exist" should be extended and the data type of the result should be documented.

Jeremy Boynes added a comment - 28/Dec/04 06:09 PM
The fix seems address the issue with non-CHAR types being used.

However, it exposes a problem with binary comparisons where it is not possible to compare CHAR and INTEGER types:

ij> values(nullif('1', 1));
ERROR 42818: Comparisons between 'CHAR' and 'INTEGER' are not supported.
ij> values (case when '1' = 1 THEN 2 ELSE 3 END);
ERROR 42818: Comparisons between 'CHAR' and 'INTEGER' are not supported.

I believe though that this is a separate issue affecting all comparisons.

Amit Handa added a comment - 10/Jan/05 04:57 PM
Submitting Patch as per discussion.

Satheesh Bandaram added a comment - 11/Jan/05 07:31 AM
Is anyone concerned that the proposed patch also accepts this:

ij> values (case when 1=2 then 5 else '6' end);
1
-----------
6

I think it shouldn't. I have a feeling we are not confirming to SQL standard (ISO/IEC 9075-2:2003) 9.3 section, 'Data types of results of aggregations' here.

Satheesh

Christian d'Heureuse added a comment - 11/Jan/05 08:26 AM
> I have a feeling we are not confirming to SQL standard

Are you referring to the syntax rule that states that all arguments "shall" be character strings, if any argument is character string?

"9075-2 9.3 Data types of results of aggregations
...
Syntax Rules
3. a) If any of the data types in DTS is character string, then all data types in DTS shall be character string, ..."

The term "shall" is defined as:

"9075-1 6.2.3.3 Terms denoting rule requirements
In the Syntax Rules, the term shall defines conditions that are required to be true of syntactically conforming SQL language. ...
The treatment of language that does not conform to the SQL Formats and Syntax Rules is implementation-dependent. ..."

The conclusion is that the expression
 CASE WHEN ... THEN intValue ELSE charValue END
is non-conforming SQL and may be processed in an implementation-dependent manner.

Jeremy Boynes added a comment - 19/Jan/05 12:57 AM
A larger patch that also deals with the use of untyped NULLs in CASE statements. This allows NULLIF to be implemented as (CASE WHEN L=R THEN NULL ELSE L END). Adds a new testcase for various CASE statements and has been tested with the derbylang suite on WinXP/Sun JVM 1.4.2.

This patch also checks that the values returned by the different branches are compatible using the same check as the COALESCE function. Not sure if this is right, but at least it will be consistent. It might be worth exploring if COALESCE can be implemented as a large CASE statement.

It also reuses the SQLState codes for a couple of errors as I am not sure of the policy for changing the associated text or adding new ones.

Mamta A. Satoor added a comment - 11/Jun/05 05:57 AM
I have been working on this bug for a little while and have a patch which I think resolves issues with NULLIF. Please find the patch Derby7NullIf061005svndiff.txt attached to this bug.

Following is a brief description of what the proposed fix involves.
For NULLIF(L,R), parser will generate CASE L=R then untyped NULL else L. Notice that the NULL does not have any type associated with it during parsing. This change went into sqlgrammar.jj
The existing code in ConditionalNode.bindExpression first binds the condition L=R. At the end of L=R condition binding, we will have determined the type of operand L. I have added cod into ConditionalNode to CAST untyped NULL to the type of L. The exisiting code for L=R binding enforces the Reference Guide Table 1. Comparisons allowed by Derby on page 111 and 112. So answers for what datatypes can be combined in NULLIF can be explained by that table.
I have added some tests to see what the return type of NULLIF would be for various combinations of :L and R datatypes. I have also added test cases where the first operand to NULLIF is a parameter.

All the tests pass except one failure below which I don't think is related to my changes
*** Start: miscerrors jdk1.4.2_07 derbyall:derbylang 2005-06-10 01:43:31 ***
35 del
< xxxxxxFILTERED-TIMESTAMPxxxxx|main,5,main |92 |0 | wombat | null |Cleanup action starting
36 del
< xxxxxxFILTERED-TIMESTAMPxxxxx|main,5,main |92 |0 | wombat | null |Failed Statement is: --
37 del
< xxxxxxFILTERED-TIMESTAMPxxxxx|main,5,main |97 |0 | wombat | null |Cleanup action starting
38 del
< xxxxxxFILTERED-TIMESTAMPxxxxx|main,5,main |97 |0 | wombat | null |Failed Statement is: create table a (one int, two int)
39 del
< xxxxxxFILTERED-TIMESTAMPxxxxx|main,5,main |100 |0 | wombat | null |Cleanup action starting
40 del
< xxxxxxFILTERED-TIMESTAMPxxxxx|main,5,main |100 |0 | wombat | null |Failed Statement is: create table a (one int)
40a35,40
> xxxxxxFILTERED-TIMESTAMPxxxxx|main,5,main |91 |0 | wombat | null |Cleanup action starting
> xxxxxxFILTERED-TIMESTAMPxxxxx|main,5,main |91 |0 | wombat | null |Failed Statement is: --
> xxxxxxFILTERED-TIMESTAMPxxxxx|main,5,main |96 |0 | wombat | null |Cleanup action starting
> xxxxxxFILTERED-TIMESTAMPxxxxx|main,5,main |96 |0 | wombat | null |Failed Statement is: create table a (one int, two int)
> xxxxxxFILTERED-TIMESTAMPxxxxx|main,5,main |99 |0 | wombat | null |Cleanup action starting
> xxxxxxFILTERED-TIMESTAMPxxxxx|main,5,main |99 |0 | wombat | null |Failed Statement is: create table a (one int)
Test Failed.
*** End: miscerrors jdk1.4.2_07 derbyall:derbylang 2005-06-10 01:43:48 ***


Output of svn stat
M java\engine\org\apache\derby\impl\sql\compile\ConditionalNode.java
M java\engine\org\apache\derby\impl\sql\compile\sqlgrammar.jj
M java\testing\org\apache\derbyTesting\functionTests\tests\lang\coalesceTests.java
M java\testing\org\apache\derbyTesting\functionTests\tests\jdbcapi\resultset.java
M java\testing\org\apache\derbyTesting\functionTests\master\DerbyNet\resultset.out
M java\testing\org\apache\derbyTesting\functionTests\master\DerbyNetClient\resultset.out
M java\testing\org\apache\derbyTesting\functionTests\master\resultset.out

Mamta A. Satoor added a comment - 03/Aug/05 06:47 AM
Adding a new patch to the bug named derby7Nullif080205svndiff.txt since the old patch has gotten out of sync.

svn stat has remain unchanged since the last patch and the changes are also the same as the last patch.

Satheesh Bandaram added a comment - 04/Aug/05 09:44 AM
I submitted this patch to trunk. I noticed a build.xml file was changed under iapi/types directory that didn't look related to this bug, so I didn't submit that change.

Mamta A. Satoor added a comment - 04/Aug/05 11:50 PM
The build.xml change didn't belong to this patch as Satheesh guessed. Thanks for catching it.

Kristian Waagan added a comment - 23/Jan/06 11:03 PM
I verified that the initial reported error no longer occurs, but the discussion for this issue is somewhat volumnious and several other (potential) problems are mentioned.
Could someone with more knowledge about this field confirm that this issue is fixed and that no more work is to be done on it?
If true, please close the issue.

Mamta A. Satoor added a comment - 27/Jan/06 04:42 PM
Prior to submitting a patch for this issue, I had gone through all the activities on this issue in JIRA and on Derby mailing list. I don't think any outstanding issues are left on it, so I am marking this as closed.

Bernt M. Johnsen added a comment - 04/May/06 05:55 PM
Merge to 10.1 branch

Bernt M. Johnsen added a comment - 04/May/06 05:56 PM
Committed revision 399657 on 10.1 branch

Bernt M. Johnsen added a comment - 04/May/06 05:57 PM
Ooops.... lost the fixed 10.2....

Myrna van Lunteren added a comment - 05/May/06 12:46 AM
Hi Berndt,

I happened to spot the following in your check-in for revision 399657 for the test jdbcapi/resultset.java:

+ } catch(SQLException ex){
+ if (ex.getSQLState().equals("22005")) {
+ if (s.getBytes (i) != null)
+ row.append(new String(s.getBytes(i)));
+ else
+ row.append(s.getBytes(i));
+ } else throw ex;
+ }

I think what has happened here is that you applied original patch of DERBY-7 to 10.1.

However, the test resultset.java has since had some work done on it, and I'm specifically concerned that you may have missed the changes I worked on for DERBY-903, which were committed with revision 378337 and 379643.

This eliminates the use of the constructor 'new String(bytes[]), which is non-portable...It constructs a string by decoding the bytes using the default platform charset. This can lead to encoding related problems. For this case, you should probably use the constructor that takes in the encoding. ie new String(byte[], String charsetname).

Can you please rework this section of resultset.java to not use an encoding-safe mechanism?
Just compare with the current 10.2 resultset.java...

Thx,
Myrna

Myrna van Lunteren added a comment - 05/May/06 04:20 AM
Oops, at least one line in my previous comment is suffering from bad editing...I meant:

"Can you please rework this section of resultset.java to use an encoding-safe mechanism? "


Bernt M. Johnsen added a comment - 05/May/06 03:31 PM
Need to apply changes from DERBY-903 in 10.1 branch

Bernt M. Johnsen added a comment - 05/May/06 07:45 PM
Changes to java/testing/org/apache/derbyTesting/functionTests/tests/jdbcapi/resultset.java as a result of DERBY-7 and DERBY-903 was applied in the oposite order on 10.1 compared to trunk.

Bernt M. Johnsen added a comment - 05/May/06 07:48 PM
Committed revision 400066.