Details
- Bug
- Status: Closed
- Critical
- Resolution: Duplicate
- 2.1.0
- None
- None
- macOS Sierra
- Important
Description
Let's say we have a CSV file /tmp/1.csv:
cid,name
-100224910923912596,jack
-100224910923912595,tom
-1,rose
-2,marry
-100,rose1
-101,rose2
Use the following SQL to define a view in Spark SQL:
CREATE TEMPORARY VIEW T
(
`cid` string,
`name` string
)
USING CSV
OPTIONS (
path "/tmp/1.csv"
);
Statement 1:
select * from T where cid = -100224910923912596;
Returns:
-100224910923912596 jack
-100224910923912595 tom
Statement 2:
select * from T where cid = -100224910923912599;
It also returns:
-100224910923912596 jack
-100224910923912595 tom
Unless you quote the literal (statement 3):
select * from T where cid = '-100224910923912596';
It returns:
-100224910923912596 jack
However, I think the behaviour of statements 1 and 2 is pretty weird.
Statement 4:
select * from T where cid = -100;
Returns:
-100 rose1
This only affects the large numbers; the smaller ones seem to work correctly.
Does that look like a bug to you folks?
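One possible explanation, offered as an assumption rather than a confirmed diagnosis: if the string column and the numeric literal are both converted to a 64-bit floating-point value before the comparison, integers of this magnitude can no longer be told apart, while small ones stay exact. The Python sketch below only imitates that assumed behaviour on the rows of /tmp/1.csv; it does not call Spark, and the helper names `filter_as_double` and `filter_as_string` are hypothetical.

```python
# Rows from /tmp/1.csv; cid is kept as a string, as in the view definition.
rows = [
    ("-100224910923912596", "jack"),
    ("-100224910923912595", "tom"),
    ("-1", "rose"),
    ("-2", "marry"),
    ("-100", "rose1"),
    ("-101", "rose2"),
]


def filter_as_double(rows, literal):
    """Assumed behaviour: both sides become 64-bit floats before comparing."""
    return [name for cid, name in rows if float(cid) == float(literal)]


def filter_as_string(rows, literal):
    """Exact comparison, as when the literal is quoted in the query."""
    return [name for cid, name in rows if cid == literal]


large_exact = filter_as_double(rows, -100224910923912596)   # like statement 1
large_absent = filter_as_double(rows, -100224910923912599)  # like statement 2
quoted = filter_as_string(rows, "-100224910923912596")      # quoted literal
small = filter_as_double(rows, -100)                        # like statement 4

print(large_exact, large_absent, quoted, small)
```

Under this assumption the sketch reproduces the results reported above: both large literals match the same two rows, while the quoted literal and the small number each match exactly one row.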
Thanks.
Issue Links
- duplicates: SPARK-17913 Filter/join expressions can return incorrect results when comparing strings to longs (Resolved)