Details
Task
Status: Open
Major
Resolution: Unresolved
Impala 2.7.0
None
Description
Some of the discrepancies flagged by the random query generator are cases where the Postgres and Impala rows contain floating-point column values that differ.
The query generator uses a method to report whether the numbers are equivalent:
    def floats_are_equal(self, ref, test):
      ref = round(ref, 2)
      test = round(test, 2)
      diff = abs(ref - test)
      if ref * test == 0:
        return diff < 0.1
      result = diff / (abs(ref) + abs(test)) < 0.1
      if not result:
        LOG.debug("Floats differ, diff: %s, |reference|: %s, |test|: %s", diff, abs(ref), abs(test))
      return result
However, this fails in some cases, notably with:
    >>> q.floats_are_equal(0.035, 0.034999999999999996)
    False
    >>>
The problem in this case is the early rounding: ref becomes 0.04, and test is 0.03.
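The failure can be reproduced outside the query generator. The following is a minimal sketch: it copies the comparison logic quoted above into a standalone function (the class and the LOG call are dropped for illustration) and compares the relative difference before and after rounding.

```python
# Standalone copy of the comparison logic from the description,
# without the class wrapper or logging (simplified for illustration).
def floats_are_equal(ref, test):
    ref = round(ref, 2)
    test = round(test, 2)
    diff = abs(ref - test)
    if ref * test == 0:
        return diff < 0.1
    return diff / (abs(ref) + abs(test)) < 0.1

ref, test = 0.035, 0.034999999999999996

# Rounding to two decimals pushes the two values to opposite sides.
rounded_ref = round(ref, 2)
rounded_test = round(test, 2)

# Relative difference of the unrounded inputs, using the same formula.
relative_diff_raw = abs(ref - test) / (abs(ref) + abs(test))

# Relative difference after rounding, which is what the method compares to 0.1.
relative_diff_rounded = abs(rounded_ref - rounded_test) / (rounded_ref + rounded_test)
```

The two inputs are adjacent doubles, so their unrounded relative difference is far below the 0.1 threshold. After rounding, the difference is 0.01 against a sum of 0.07, which exceeds 0.1 and makes the method return False.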
Python 3.5 and later provide a function called math.isclose() (see https://docs.python.org/3.5/library/math.html#math.isclose). https://www.python.org/dev/peps/pep-0485/ has more information, including a code snippet that could possibly be used in Python 2.7 (the PEP is in the public domain).
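As a rough sketch of what a Python 2.7-compatible replacement might look like, the function below follows the symmetric formula described in PEP 485. The name is_close is hypothetical, and the default tolerances simply mirror those of math.isclose(); the values appropriate for the query generator are an open question and would need to be chosen separately.

```python
import math

def is_close(a, b, rel_tol=1e-9, abs_tol=0.0):
    """Closeness test modeled on PEP 485; name and defaults are illustrative."""
    if a == b:
        # Covers exact matches, including two infinities of the same sign.
        return True
    if math.isinf(a) or math.isinf(b):
        # An infinity is never close to a different value.
        return False
    # Symmetric test: scale the relative tolerance by the larger magnitude,
    # and fall back to the absolute tolerance for values near zero.
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)

same = is_close(0.035, 0.034999999999999996)
different = is_close(0.035, 0.036)
```

Because no rounding happens before the comparison, the two values from the failing example are treated as equal. Comparisons against zero need a non-zero abs_tol, since a purely relative tolerance can never be satisfied there.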
Postgres has a setting called extra_float_digits that could be modified to show a more accurate representation of the floating point number; its default is set to show the same result across all platforms. It's for this reason the value for ref in the example above is 0.035.
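The effect of the digit limit can be illustrated in Python. This sketch assumes that Postgres, at its default setting in the versions relevant here, prints double precision values with 15 significant digits and that raising extra_float_digits yields up to 17; the format strings below only mimic that behavior and do not call Postgres.

```python
# The value Impala reported in the failing example.
value = 0.034999999999999996

# 15 significant digits: assumed to mimic the default Postgres output.
default_style = "%.15g" % value

# 17 significant digits: enough to uniquely identify any IEEE 754 double.
extended_style = "%.17g" % value

# Parsing the shortened text yields a different double than the original.
round_trips_default = float(default_style) == value
round_trips_extended = float(extended_style) == value
```

With 15 digits the value prints as 0.035, which parses back to the neighboring double rather than the original one. With 17 digits the text round-trips exactly.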
Last, according to documentation, Impala, Kudu, and Postgres all implement floats using IEEE 754.