Details
- Bug
- Status: Open
- Major
- Resolution: Unresolved
- 1.15.0
- None
- None
Description
(Not sure whether this should be classified as a bug, but I don't see a more suitable type.)
The Javadoc of TimestampColumnReader states:
/**
* Timestamp {@link ColumnReader}. We only support INT96 bytes now, julianDay(4) + nanosOfDay(8).
* See https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#timestamp
* TIMESTAMP_MILLIS and TIMESTAMP_MICROS are the deprecated ConvertedType.
*/
However, the implementation reads:
ByteBuffer buffer = readDataBuffer(12);
column.setTimestamp(
        rowId + i,
        int96ToTimestamp(utcTimestamp, buffer.getLong(), buffer.getInt()));
This implementation contradicts the Javadoc: the code reads the 8-byte long first and the 4-byte int second, so nanosOfDay(8) actually precedes julianDay(4) in the buffer.
The implementation is also confusing because it relies on the evaluation order of the argument list. The Java Language Specification guarantees that arguments are evaluated from left to right, but this does not hold in other languages. C++, for example, leaves the order unspecified, so the arguments may be evaluated in any order.
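One way to remove the dependency on argument evaluation order is to read each field into a named local before converting. The following is a minimal sketch, not the Flink implementation: the class Int96ReadOrder and the helpers int96ToEpochMillis and encode are hypothetical, the conversion is a simplified stand-in for int96ToTimestamp (no time zone handling, millisecond precision only), and the buffer is assumed to be little-endian with nanosOfDay(8) followed by julianDay(4).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper class for illustration; not part of Flink.
class Int96ReadOrder {
    // Julian day number of the Unix epoch (1970-01-01).
    static final long JULIAN_EPOCH_OFFSET_DAYS = 2_440_588L;
    static final long MILLIS_PER_DAY = 86_400_000L;
    static final long NANOS_PER_MILLI = 1_000_000L;

    // Reads the two fields in separate statements, so the byte order is
    // visible in the code and does not depend on argument evaluation order.
    static long int96ToEpochMillis(ByteBuffer buffer) {
        long nanosOfDay = buffer.getLong(); // first 8 bytes
        int julianDay = buffer.getInt();    // last 4 bytes
        return (julianDay - JULIAN_EPOCH_OFFSET_DAYS) * MILLIS_PER_DAY
                + nanosOfDay / NANOS_PER_MILLI;
    }

    // Builds a 12-byte value in the assumed layout, for demonstration only.
    static ByteBuffer encode(long nanosOfDay, int julianDay) {
        ByteBuffer b = ByteBuffer.allocate(12).order(ByteOrder.LITTLE_ENDIAN);
        b.putLong(nanosOfDay).putInt(julianDay);
        b.flip();
        return b;
    }

    public static void main(String[] args) {
        // One day after the epoch plus 1.5 ms of nanoseconds.
        ByteBuffer value = encode(1_500_000L, 2_440_589);
        System.out.println(int96ToEpochMillis(value));
    }
}
```

With the reads split into two statements, the Javadoc could state the layout as nanosOfDay(8) + julianDay(4) and match the code line by line.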
Issue Links
- is related to FLINK-25565 Write and Read Parquet INT64 Timestamp (Resolved)