[SPARK-11894] Incorrect results are returned when using null - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 1.6.0
Fix Version/s: 1.6.0
Component/s: SQL
Labels: None
Target Version/s: 1.6.0

Description

In the Dataset API, the following two Datasets end up with the same contents, even though the second one contains a null:
Seq((new java.lang.Integer(0), "1"), (new java.lang.Integer(22), "2")).toDS()
Seq((null.asInstanceOf[java.lang.Integer], "1"), (new java.lang.Integer(22), "2")).toDS()

Note: java.lang.Integer is nullable, so the null should be preserved rather than turned into 0.

This can produce incorrect results. For example:

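// Outside spark-shell these imports are needed (assuming a SQLContext named sqlContext is in scope):
import sqlContext.implicits._
import org.apache.spark.sql.functions.lit

val ds1 = Seq((null.asInstanceOf[java.lang.Integer], "1"), (new java.lang.Integer(22), "2")).toDS()
val ds2 = Seq((null.asInstanceOf[java.lang.Integer], "1"), (new java.lang.Integer(22), "2")).toDS() // toDF("key", "value").as('df2)

// lit(true) as the join condition pairs every row of ds1 with every row of ds2
val res1 = ds1.joinWith(ds2, lit(true)).collect()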

The expected result is
((null,1),(null,1))
((22,2),(null,1))
((null,1),(22,2))
((22,2),(22,2))

The actual result is
((0,1),(0,1))
((22,2),(0,1))
((0,1),(22,2))
((22,2),(22,2))
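
The difference between the two results can be modelled in plain Scala, without Spark. The sketch below is only an illustration of the symptom, not Spark's actual code path: `lossy`, `preserving`, and `cross` are hypothetical helpers, and `cross` merely stands in for `joinWith(..., lit(true))`. It assumes the null is being replaced by the primitive default 0 somewhere along the way, which this report does not confirm.

```scala
object NullVsZeroSketch {
  type Row = (java.lang.Integer, String)

  // Hypothetical model of the symptom: a null-unaware path that replaces a
  // null boxed Integer with the primitive default 0.
  def lossy(row: Row): (Int, String) =
    (if (row._1 == null) 0 else row._1.intValue(), row._2)

  // Null-preserving alternative: absence stays explicit as None.
  def preserving(row: Row): (Option[Int], String) =
    (Option(row._1).map(_.intValue()), row._2)

  // Plain cross product of two sequences, standing in for a join whose
  // condition is always true.
  def cross[A](left: Seq[A], right: Seq[A]): Seq[(A, A)] =
    for (l <- left; r <- right) yield (l, r)

  // Same input rows as ds1 and ds2 in the report.
  val rows: Seq[Row] = Seq((null, "1"), (Integer.valueOf(22), "2"))

  // Four pairs in which the null key has become 0 (the "actual" shape).
  val lossyPairs: Seq[((Int, String), (Int, String))] =
    cross(rows.map(lossy), rows.map(lossy))

  // Four pairs in which the null key survives as None (the "expected" shape).
  val preservingPairs: Seq[((Option[Int], String), (Option[Int], String))] =
    cross(rows.map(preserving), rows.map(preserving))

  def main(args: Array[String]): Unit = {
    lossyPairs.foreach(println)
    preservingPairs.foreach(println)
  }
}
```

Once the null has been collapsed to 0, the two input Datasets from the description can no longer be told apart, which is why the key must stay nullable end to end.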

Issue Links

links to

[Github] Pull Request #9904 (cloud-fan)

People

Assignee: Wenchen Fan

Reporter: Xiao Li

Votes: 0

Watchers: 5

Dates

Created: 20/Nov/15 22:36

Updated: 23/Nov/15 22:51

Resolved: 23/Nov/15 18:14