Description
Content of table a:
outgoing_0 |
false |
true |
null |
a has only one field: outgoing_0
Content of table b:
outgoing_1 |
false |
true |
null |
b has only one filed: outgoing_1
After running this query:
select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)
I got the following result:
outgoing_0 | outgoing_1 |
true | true |
false | false |
false | null |
null | null |
The row with "false" as outgoing_0 and "null" as outgoing_1 is unexpected. The operator <=> should match null with null.
While left "false" is matched with right "null", it is also strange to find that the "false" on the right table does not match with "null" on the left table (no row with "null" as outgoing_0 and "false" as outgoing_1)
You can easily reproduce this bug by pasting the following code fragment:
case class A( outgoing_0: Option[Boolean] ) case class B( outgoing_1: Option[Boolean] )
val a = sc.parallelize( Seq( A( Some( false ) ), A( Some( true ) ), A( None ) ) ).toDF() a.show val b = sc.parallelize( Seq( B( Some( false ) ), B( Some( true ) ), B( None ) ) ).toDF() b.show a.registerTempTable( "a" ) b.registerTempTable( "b" ) sqlContext.sql( "select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)" ).show()