Description
Impala is implementing IS NULL/IS NOT NULL Kudu predicates as part of IMPALA-4859 (see review: https://gerrit.cloudera.org/#/c/5958/). In testing, the Kudu IS NULL predicate eliminates rows with valid NULL values from the returned results.
Here is an example:
select id, float_col from functional_kudu.alltypesagg where id < 10;
+----+-------------------+
| id | float_col         |
+----+-------------------+
| 3  | 3.299999952316284 |
| 7  | 7.699999809265137 |
| 0  | NULL              |
| 6  | 6.599999904632568 |
| 8  | 8.800000190734863 |
| 9  | 9.899999618530273 |
| 0  | NULL              |
| 1  | 1.100000023841858 |
| 2  | 2.200000047683716 |
| 4  | 4.400000095367432 |
| 5  | 5.5               |
+----+-------------------+
Fetched 11 row(s) in 0.57s
Adding an IS NULL condition on float_col returns no rows, even though the two rows with id = 0 have a NULL float_col:
select id, float_col from functional_kudu.alltypesagg where id < 10 and float_col is null;
Fetched 0 row(s) in 0.25s
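For reference, the expected result of that query can be sketched outside Impala and Kudu. The snippet below is only an illustration of standard SQL IS NULL semantics: it uses Python's built-in sqlite3 as a stand-in engine (an assumption, not part of the original setup), loads the 11 rows shown above into a hypothetical in-memory table, and runs the same filter. The helper name expected_is_null_rows is made up for this sketch.

```python
import sqlite3

# Rows copied from the unfiltered query output above; None stands for NULL.
ROWS = [
    (3, 3.299999952316284), (7, 7.699999809265137), (0, None),
    (6, 6.599999904632568), (8, 8.800000190734863), (9, 9.899999618530273),
    (0, None), (1, 1.100000023841858), (2, 2.200000047683716),
    (4, 4.400000095367432), (5, 5.5),
]


def expected_is_null_rows(rows):
    """Run the IS NULL query from the report against an in-memory SQLite table.

    SQLite is only a stand-in here; it shows what a correct IS NULL filter
    should return for this data, not how Impala or Kudu evaluate it.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table alltypesagg (id integer, float_col real)")
        conn.executemany("insert into alltypesagg values (?, ?)", rows)
        cur = conn.execute(
            "select id, float_col from alltypesagg "
            "where id < 10 and float_col is null"
        )
        return cur.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Both id = 0 rows should come back, since their float_col is NULL.
    print(expected_is_null_rows(ROWS))
```

Under these assumptions the filter keeps both id = 0 rows, which is what the Kudu-backed query would be expected to return instead of 0 rows.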
This is also true for other tables, such as functional_kudu.nulltable.
select * from functional_kudu.nulltable;
a | b | c | d | e | f | g |
a | NULL | NULL | NULL | ab |
Fetched 1 row(s) in 0.49s
The following queries return no rows, even though the row above contains NULL values:
select * from functional_kudu.nulltable where c is null;
select * from functional_kudu.nulltable where d is null;
select * from functional_kudu.nulltable where e is null;
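Impala's runtime statistics indicate that the Kudu scan itself returns no rows, which suggests the rows are dropped on the Kudu side rather than filtered out by Impala. IS NOT NULL seems to work correctly.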