[PARQUET-1222] Specify a well-defined sorting order for float and double types - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Affects Version/s: None
Fix Version/s: format-2.10.0
Component/s: parquet-format
Labels: None

Description

Currently parquet-format specifies the sort order for floating point numbers as follows:

   *   FLOAT - signed comparison of the represented value
   *   DOUBLE - signed comparison of the represented value

The problem is that the comparison of floating point numbers is only a partial ordering, with surprising behaviour in specific corner cases. For example, according to IEEE 754, -0 is neither less than nor greater than +0 (the two compare as equal), and every ordered or equality comparison involving NaN returns false. Such an ordering is not suitable for min/max statistics. Additionally, the Java implementation already uses a different (total) ordering that handles these cases correctly, but differently from the C++ implementations, which leads to interoperability problems.
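
The corner cases above can be reproduced in any IEEE 754 environment. The following is a minimal Python sketch, not taken from any Parquet implementation; the variable names are illustrative only. It shows that the result of a naive minimum depends on where a NaN happens to sit in the input, which is why such comparisons cannot produce reliable statistics.

```python
import math

nan = float("nan")
neg_zero, pos_zero = -0.0, 0.0

# -0.0 and +0.0 compare as equal, so neither is "less than" the other ...
zero_less = neg_zero < pos_zero
zero_equal = neg_zero == pos_zero
# ... even though they are distinct values with different signs.
zero_signs = (math.copysign(1.0, neg_zero), math.copysign(1.0, pos_zero))

# Every ordered or equality comparison involving NaN evaluates to False.
nan_checks = [nan < 1.0, nan > 1.0, nan == nan, 1.0 < nan]

# Consequence: a comparison-based minimum depends on the input order.
naive_min_a = min([nan, 1.0, 2.0])  # NaN is never replaced, because x < NaN is False
naive_min_b = min([1.0, nan, 2.0])  # NaN is never selected, because NaN < x is False
```

Two writers that see the same values in a different order could therefore record different minimum values for the same column chunk.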

TypeDefinedOrder for doubles and floats should be deprecated and a new TotalFloatingPointOrder should be introduced. The new TotalFloatingPointOrder would be the default when writing doubles and floats. This ordering should be efficient and easy to implement in all programming languages.
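
The issue does not define TotalFloatingPointOrder at this point, so the following is only a sketch of one well-known way to obtain a total order over doubles: compare their bit patterns after a sign-dependent transformation, in the spirit of the IEEE 754 totalOrder predicate. The function name `total_order_key` is hypothetical, and the placement of NaNs (negative NaNs below -inf, positive NaNs above +inf) is a property of this particular sketch, not necessarily what the specification prescribes.

```python
import struct

def total_order_key(x: float) -> int:
    """Map a double to an integer whose natural order is a total order.

    Assumption: this mirrors a bit-pattern ordering; it is an illustration,
    not the ordering defined by parquet-format.
    """
    # Reinterpret the 64-bit IEEE 754 pattern as a signed 64-bit integer.
    (bits,) = struct.unpack(">q", struct.pack(">d", x))
    # For values with the sign bit set, flip the remaining 63 bits so that
    # larger magnitudes sort lower (e.g. -2.0 before -1.0 before -0.0).
    if bits < 0:
        bits ^= 0x7FFFFFFFFFFFFFFF
    return bits

# Usage: -0.0 now sorts strictly before +0.0.
ordered = sorted([1.0, 0.0, -0.0, float("-inf"), -1.0], key=total_order_key)

# A positive quiet NaN, built from an explicit bit pattern, sorts above +inf.
quiet_nan = struct.unpack(">d", bytes.fromhex("7ff8000000000000"))[0]
nan_above_inf = total_order_key(quiet_nan) > total_order_key(float("inf"))
```

Because the key is a plain integer, the same idea carries over to any language that can reinterpret a float's bits, which fits the requirement that the ordering be easy to implement everywhere.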

Issue Links

Blocked

   *   IMPALA-7304 Impala shouldn't write column indexes for float columns until PARQUET-1222 is resolved (Status: Resolved)

blocks

   *   IMPALA-6539 Implement specification-compliant floating point comparison (Status: Open)
   *   PARQUET-1223 [parquet-mr] Implement specification-compliant floating point comparison (Status: Open)
   *   PARQUET-1224 [C++] Implement specification-compliant floating point comparison (Status: Resolved)

is related to

   *   ARROW-12264 [C++][Dataset] Handle NaNs correctly in Parquet predicate push-down (Status: In Progress)
   *   PARQUET-1246 Ignore float/double statistics in case of NaN (Status: Resolved)
   *   PARQUET-1251 Clarify ambiguous min/max stats for FLOAT/DOUBLE (Status: Closed)

People

Assignee: Micah Kornfield
Reporter: Zoltan Ivanfi
Votes: 0
Watchers: 13

Dates

Created: 19/Feb/18 14:12
Updated: 23/Jun/24 03:30
Resolved: 07/Dec/22 14:10