Description
Kudu aims to be part of the Hadoop ecosystem, and other tools in the Hadoop ecosystem store timestamps differently than Kudu. For example:
- Parquet has TIMESTAMP_MILLIS, which is milliseconds since the Unix epoch.
- Impala internally stores a {64-bit nanoseconds since midnight, 32-bit Julian day number} pair and, when storing in Parquet, uses Parquet's INT96 type to store this.
- Hive internally uses a 32-bit seconds-since-Unix-epoch value, plus an optional nanoseconds component.
To avoid adding to the confusion, we should name our timestamp type more explicitly (e.g. UNIX_MICROTIMESTAMP or UNIXTIME_MICROS or somesuch).
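
To make the differences listed above concrete, here is a rough Python sketch that encodes one UTC instant in each representation. It is illustrative only: the function name `encodings` and the dictionary keys are hypothetical, the Kudu value is assumed to be microseconds since the Unix epoch (as the proposed names suggest), and the Impala pair is shown as a plain tuple rather than the packed INT96 bytes.

```python
from datetime import datetime, timezone

# Julian day number of 1970-01-01, used to convert epoch days to Julian days.
UNIX_EPOCH_JULIAN_DAY = 2440588
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def encodings(dt):
    """Return one UTC instant in each representation discussed above.

    Hypothetical helper for illustration; not part of any of these projects.
    """
    delta = dt - EPOCH
    # Integer arithmetic avoids floating-point rounding of the microseconds.
    micros = (delta.days * 86400 + delta.seconds) * 1_000_000 + delta.microseconds
    secs, sub_micros = divmod(micros, 1_000_000)
    days, secs_of_day = divmod(secs, 86400)
    return {
        # Assumed Kudu representation: microseconds since the Unix epoch.
        "unix_micros": micros,
        # Parquet TIMESTAMP_MILLIS: milliseconds since the Unix epoch.
        "parquet_timestamp_millis": micros // 1000,
        # Impala: (nanoseconds since midnight, Julian day number).
        "impala_nanos_and_julian_day": (
            secs_of_day * 1_000_000_000 + sub_micros * 1000,
            days + UNIX_EPOCH_JULIAN_DAY,
        ),
        # Hive: (seconds since the Unix epoch, nanoseconds component).
        "hive_seconds_and_nanos": (secs, sub_micros * 1000),
    }


if __name__ == "__main__":
    instant = datetime(2016, 1, 1, 0, 0, 0, 500000, tzinfo=timezone.utc)
    for name, value in encodings(instant).items():
        print(name, value)
```

The same instant yields four different numbers with four different units, which is why a bare type name like TIMESTAMP says little about what is actually stored.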