Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
Impala 2.6.0, Impala 2.7.0, Impala 2.8.0, Impala 2.9.0, Impala 2.10.0, Impala 2.11.0, Impala 3.0, Impala 2.12.0
-
ghx-label-4
Description
This can happen because we hash the binary representation of the numbers. -0/0 should be treated as equal for hash joins.
[localhost:21000] > select * from (select cast("-0" as float) c1) v1, (select cast("0" as float) c2) v2 where v1.c1 = v2.c2; Fetched 0 row(s) in 0.12s [localhost:21000] > select * from (select cast("0" as float) c1) v1, (select cast("0" as float) c2) v2 where v1.c1 = v2.c2; +----+----+ | c1 | c2 | +----+----+ | 0 | 0 | +----+----+ Fetched 1 row(s) in 0.11s [localhost:21000] > select * from (select cast("-0" as float) c1) v1, (select cast("-0" as float) c2) v2 where v1.c1 = v2.c2; +----+----+ | c1 | c2 | +----+----+ | -0 | -0 | +----+----+ Fetched 1 row(s) in 0.11s
With aggregations, we get separate groups. I could see the argument either way on whether this is the preferred behaviour for group by, since group by already handles equality of NULL differently. The behaviour here is tied to the behaviour in the join right now, so we should make sure to add a test for this case when fixing the join.
[localhost:21000] > select distinct * from (values(cast("-0" as float)), (cast("0" as float))) v; +---------------------+ | cast('-0' as float) | +---------------------+ | -0 | | 0 | +---------------------+
Workaround
Casting the floating point numbers to decimal fixes the problem.
Proposed solution
The frontend could wrap floating point expressions in the hash join or hash aggregation in a normalisation function that converts -0 to +0.
Attachments
Issue Links
- is duplicated by
-
IMPALA-1543 Positive and negative zero floats hash and compare as unequal, although they should be equal.
- Resolved
- is related to
-
IMPALA-6661 Group by float results in one group per NaN value
- Resolved