FLINK-26277

Javadoc and implementation of TimestampColumnReader contradict each other

Details

    Description

      (Not sure if this should be classified as a bug, but I don't see a more suitable issue type.)

      The Javadoc of TimestampColumnReader states:

      /**
       * Timestamp {@link ColumnReader}. We only support INT96 bytes now, julianDay(4) + nanosOfDay(8).
       * See https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#timestamp
       * TIMESTAMP_MILLIS and TIMESTAMP_MICROS are the deprecated ConvertedType.
       */
      

      However, the implementation reads the value like this:

      ByteBuffer buffer = readDataBuffer(12);
      column.setTimestamp(
              rowId + i,
              int96ToTimestamp(utcTimestamp, buffer.getLong(), buffer.getInt()));
      

      This implementation contradicts the Javadoc: the code reads the 8-byte long first and the 4-byte int second, so nanosOfDay(8) actually precedes julianDay(4) in the buffer, not the other way around.
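
      To make the layout concrete, here is a minimal sketch of a 12-byte INT96 value written and read back in the order the implementation uses. The class Int96Layout, its helper names, and the conversion to epoch milliseconds are illustrative assumptions and not Flink code; the little-endian byte order is assumed from the usual Parquet INT96 convention.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of Flink: shows the nanosOfDay(8) + julianDay(4) layout.
class Int96Layout {
    // Julian day number of 1970-01-01 (the Unix epoch).
    static final int JULIAN_EPOCH_DAY = 2_440_588;
    static final long MILLIS_PER_DAY = 86_400_000L;
    static final long NANOS_PER_MILLI = 1_000_000L;

    // Writes the 8-byte nanos-of-day first, then the 4-byte Julian day.
    static ByteBuffer encode(long nanosOfDay, int julianDay) {
        ByteBuffer buffer = ByteBuffer.allocate(12).order(ByteOrder.LITTLE_ENDIAN);
        buffer.putLong(nanosOfDay);
        buffer.putInt(julianDay);
        buffer.flip();
        return buffer;
    }

    // Reads the fields back in the same order and converts them to epoch milliseconds.
    static long decodeEpochMillis(ByteBuffer buffer) {
        long nanosOfDay = buffer.getLong();
        int julianDay = buffer.getInt();
        return (julianDay - JULIAN_EPOCH_DAY) * MILLIS_PER_DAY + nanosOfDay / NANOS_PER_MILLI;
    }

    public static void main(String[] args) {
        // One day and one millisecond after the epoch.
        ByteBuffer buffer = encode(1_000_000L, JULIAN_EPOCH_DAY + 1);
        System.out.println(decodeEpochMillis(buffer)); // prints 86400001
    }
}
```

      Reading the two fields in the opposite order (int first, then long) would consume the same 12 bytes but yield unrelated values, which is why the Javadoc wording matters.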

      The implementation is also confusing because it relies on the evaluation order of the argument list. The Java Language Specification guarantees that arguments are evaluated from left to right, but this does not hold for other languages (C++, for example, leaves the order unspecified and may evaluate the arguments in any order).
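
      One possible way to remove the dependence on argument evaluation order is to read each field into a named local variable first. The sketch below is only an illustration: describe is a hypothetical stand-in for int96ToTimestamp, and the class and method names are made up for this example.

```java
import java.nio.ByteBuffer;

// Hypothetical example, not Flink code: inline reads versus explicit local variables.
class ExplicitReadOrder {
    // Stand-in for int96ToTimestamp; it only reports the arguments it received.
    static String describe(long nanosOfDay, int julianDay) {
        return "nanosOfDay=" + nanosOfDay + ", julianDay=" + julianDay;
    }

    // Style used in the issue: correct in Java, but the read order is implicit.
    static String readInline(ByteBuffer buffer) {
        return describe(buffer.getLong(), buffer.getInt());
    }

    // Same behavior, with the read order spelled out statement by statement.
    static String readExplicit(ByteBuffer buffer) {
        long nanosOfDay = buffer.getLong();
        int julianDay = buffer.getInt();
        return describe(nanosOfDay, julianDay);
    }

    // Builds a 12-byte sample buffer: 8-byte long first, then 4-byte int.
    static ByteBuffer sample() {
        ByteBuffer buffer = ByteBuffer.allocate(12);
        buffer.putLong(42L);
        buffer.putInt(2_459_000);
        buffer.flip();
        return buffer;
    }

    public static void main(String[] args) {
        System.out.println(readInline(sample()));
        System.out.println(readExplicit(sample()));
    }
}
```

      Both methods return the same string in Java; the explicit variant simply documents the byte layout in the code itself.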

People

    Assignee: Unassigned
    Reporter: Caizhi Weng (TsReaper)