Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.9.1-incubating
    • Component/s: None
    • Labels:
      None

      Description

      Write a test to check that NOT IN evaluates to unknown, and therefore returns all rows, if the sub-query returns a null.

      Fix if necessary. Also test composite keys.

      1. OPTIQ-373.1.patch
        2 kB
        Harish Butani

        Activity

        Show
        julianhyde Julian Hyde added a comment - Fixed in http://git-wip-us.apache.org/repos/asf/incubator-optiq/commit/793e5c4d .
        Hide
        julianhyde Julian Hyde added a comment -

        Here is a rewrite that seems to produce the same values as "IN" in all cases.

        # original query using IN
        select e.deptno,
          e.deptno IN (select deptno from v)
        from e;
        
        # rewritten to use just (outer) joins and 2-valued logic
        select e.deptno,
          case
          when ct.c = 0 then false
          when dt.i is not null then true
          when e.deptno is null or ct.ck < ct.c then null
          else false
          end
        from e
        cross join (select count(*) as c, count(deptno) as ck from v) as ct
        left join (select distinct deptno, true as i from v) as dt on e.deptno = dt.deptno;
        
        Show
        julianhyde Julian Hyde added a comment - Here is a rewrite that seems to produce the same values as "IN" in all cases. # original query using IN select e.deptno, e.deptno IN (select deptno from v) from e; # rewritten to use just ( outer ) joins and 2-valued logic select e.deptno, case when ct.c = 0 then false when dt.i is not null then true when e.deptno is null or ct.ck < ct.c then null else false end from e cross join (select count(*) as c, count(deptno) as ck from v) as ct left join (select distinct deptno, true as i from v) as dt on e.deptno = dt.deptno;
        Hide
        julianhyde Julian Hyde added a comment - - edited

        And another:

        create table e(deptno int, ename varchar(5));
        insert into e values (10, 'Alice');
        insert into e values (null, 'Bob');
        create table d(deptno int, dname varchar(10));
        insert into d values (10, 'Sales');
        insert into d values (20, 'Marketing');
        insert into d values (null, 'Null');
        select * from d where deptno not in (select deptno from e);
         deptno | dname
        --------+-------
        (0 rows)
        
        select deptno,
          deptno not in (select deptno from e)
        from d;
        
         deptno | ?column?
        --------+----------
             10 | FALSE
             20 | UNKNOWN
           NULL | UNKNOWN
        
        (3 rows)
        
        Show
        julianhyde Julian Hyde added a comment - - edited And another: create table e(deptno int , ename varchar(5)); insert into e values (10, 'Alice'); insert into e values ( null , 'Bob'); create table d(deptno int , dname varchar(10)); insert into d values (10, 'Sales'); insert into d values (20, 'Marketing'); insert into d values ( null , 'Null'); select * from d where deptno not in (select deptno from e); deptno | dname --------+------- (0 rows) select deptno, deptno not in (select deptno from e) from d; deptno | ?column? --------+---------- 10 | FALSE 20 | UNKNOWN NULL | UNKNOWN (3 rows)
        Hide
        julianhyde Julian Hyde added a comment - - edited

        The test has proved that we do have a problem. Here is another query, run on postgres, that indicates when the sub-query returns nulls and false:

        with
        t1(x) as (select * from  (values (1),(2), (case when 1 = 1 then null else 3 end)) as t(x)),
        t2(x) as (select * from  (values (1),(case when 1 = 1 then null else 3 end)) as t(x))
        select t1.x, (t1.x not in (select t2.x from t2)) from t1;
           x | ?column? 
        -----+----------
           1 | FALSE
           2 | UNKNOWN
        NULL | UNKNOWN
        (3 rows)
        
        Show
        julianhyde Julian Hyde added a comment - - edited The test has proved that we do have a problem. Here is another query, run on postgres, that indicates when the sub-query returns nulls and false: with t1(x) as (select * from (values (1),(2), ( case when 1 = 1 then null else 3 end)) as t(x)), t2(x) as (select * from (values (1),( case when 1 = 1 then null else 3 end)) as t(x)) select t1.x, (t1.x not in (select t2.x from t2)) from t1; x | ?column? -----+---------- 1 | FALSE 2 | UNKNOWN NULL | UNKNOWN (3 rows)
        Hide
        julianhyde Julian Hyde added a comment -

        Note that the SQL standard does not allow "bare" nulls because their type cannot be deduced. (It does allow 'insert into ... select null ...'.) Your case workaround works; also note that 'cast(null as <type>)' works fine.

        Show
        julianhyde Julian Hyde added a comment - Note that the SQL standard does not allow "bare" nulls because their type cannot be deduced. (It does allow 'insert into ... select null ...'.) Your case workaround works; also note that 'cast(null as <type>)' works fine.
        Hide
        rhbutani Harish Butani added a comment -

        Test case is in attached patch file.

        Show
        rhbutani Harish Butani added a comment - Test case is in attached patch file.

          People

          • Assignee:
            julianhyde Julian Hyde
            Reporter:
            julianhyde Julian Hyde
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development