SPARK-31885

Incorrect filtering of old millis timestamp in parquet

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0, 3.1.0
    • Fix Version/s: 3.0.0
    • Component/s: SQL
    • Labels: None

    Description

      Welcome to
            ____              __
           / __/__  ___ _____/ /__
          _\ \/ _ \/ _ `/ __/  '_/
         /___/ .__/\_,_/_/ /_/\_\   version 3.1.0-SNAPSHOT
            /_/
      
      Using Scala version 2.12.10 (OpenJDK 64-Bit Server VM, Java 1.8.0_242)
      Type in expressions to have them evaluated.
      Type :help for more information.
      
      scala> spark.conf.set("spark.sql.parquet.outputTimestampType", "TIMESTAMP_MILLIS")
      scala> spark.conf.set("spark.sql.legacy.parquet.datetimeRebaseModeInWrite", "CORRECTED")
      scala> Seq(java.sql.Timestamp.valueOf("1000-06-14 08:28:53.123")).toDF("ts").write.mode("overwrite").parquet("/Users/maximgekk/tmp/ts_millis_old_filter")
      
      scala> spark.read.parquet("/Users/maximgekk/tmp/ts_millis_old_filter").show(false)
      +-----------------------+
      |ts                     |
      +-----------------------+
      |1000-06-14 08:28:53.123|
      +-----------------------+
      
      
      scala> spark.read.parquet("/Users/maximgekk/tmp/ts_millis_old_filter").filter($"ts" === "1000-06-14 08:28:53.123")
      res6: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [ts: timestamp]
      
      scala> spark.read.parquet("/Users/maximgekk/tmp/ts_millis_old_filter").filter($"ts" === "1000-06-14 08:28:53.123").show(false)
      +---+
      |ts |
      +---+
      +---+
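
The filter above returns no rows even though its literal matches the row that was just written and read back. The report does not state a cause. One candidate worth ruling out, which is an assumption here and not something the report confirms, is a calendar mismatch when the filter value is converted to milliseconds: `java.sql.Timestamp` interprets pre-1582 dates in the hybrid Julian/Gregorian calendar, while `java.time` uses the Proleptic Gregorian calendar. The sketch below compares the two conversions for the timestamp from the report. The object `OldTimestampMillis` and the helpers `hybridMillis` and `prolepticMillis` are hypothetical names used only for illustration; they are not Spark APIs.

```scala
import java.sql.Timestamp
import java.time.{LocalDateTime, ZoneOffset}
import java.util.TimeZone

object OldTimestampMillis {
  // Epoch millis as java.sql.Timestamp computes them. Timestamp.valueOf reads
  // the fields in the JVM default time zone and, for dates before the 1582
  // cutover, in the Julian calendar.
  def hybridMillis(text: String): Long =
    Timestamp.valueOf(text).getTime

  // Epoch millis for the same wall-clock fields read in the Proleptic
  // Gregorian calendar at UTC, as java.time does.
  def prolepticMillis(text: String): Long =
    LocalDateTime.parse(text.replace(' ', 'T')).toInstant(ZoneOffset.UTC).toEpochMilli

  def main(args: Array[String]): Unit = {
    // Pin the default zone to UTC so both helpers use the same offset and
    // any remaining difference comes from the calendar alone.
    TimeZone.setDefault(TimeZone.getTimeZone("UTC"))

    val ts = "1000-06-14 08:28:53.123"
    val hybrid = hybridMillis(ts)
    val proleptic = prolepticMillis(ts)

    println(s"hybrid calendar millis:      $hybrid")
    println(s"proleptic Gregorian millis:  $proleptic")
    println(s"difference in millis:        ${hybrid - proleptic}")
  }
}
```

If the two values differ for the old timestamp but agree for a modern one, a pushed-down comparison built from one conversion would not match a value stored using the other. Whether that is what happens in this issue is not established by the report.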
      

          People

            apachespark Apache Spark
            maxgekk Max Gekk