SPARK-24322: Upgrade Apache ORC to 1.4.4

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 2.3.1, 2.4.0
    • Component/s: Build

    Description

      ORC 1.4.4 includes nine fixes. One of them, ORC-306, fixes a `Timestamp` bug that occurs when the `native` ORC vectorized reader reads the `time` and `nanos` sub-vectors of an ORC timestamp column vector. ORC-306 fixes this according to the original definition, and the linked PR includes the updated interpretation of ORC column vectors. Note that the `hive` ORC reader and the ORC MR reader are not affected. In the Spark 2.3.0 session below, a timestamp written as 12:34:56.000789 is read back as 12:34:55.000789, one second early.

      scala> spark.version
      res0: String = 2.3.0
      scala> spark.sql("set spark.sql.orc.impl=native")
      scala> Seq(java.sql.Timestamp.valueOf("1900-05-05 12:34:56.000789")).toDF().write.orc("/tmp/orc")
      scala> spark.read.orc("/tmp/orc").show(false)
      +--------------------------+
      |value                     |
      +--------------------------+
      |1900-05-05 12:34:55.000789|
      +--------------------------+
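
      The one-second shift above is typical of pre-1970 timestamps whose whole-second and sub-second parts are recombined with one adjustment too many. The sketch below only illustrates that class of error; it is not the ORC or Spark code. The object `TimestampSplitSketch` and both helper functions are hypothetical names introduced for illustration, and the "extra adjustment" is an assumed failure mode, not a description of what ORC-306 changed.

```scala
import java.sql.Timestamp

object TimestampSplitSketch {
  // Recombine a millisecond value and a nanosecond-of-second value into
  // microseconds since the epoch. The milliseconds are floored to whole
  // seconds, which is already correct for negative (pre-1970) values.
  def toMicros(timeMillis: Long, nanos: Int): Long =
    java.lang.Math.floorDiv(timeMillis, 1000L) * 1000000L + nanos / 1000

  // Hypothetical faulty variant: it assumes the millisecond value was
  // truncated toward zero and subtracts one more second for pre-epoch
  // values that carry a fractional part.
  def toMicrosWithExtraAdjustment(timeMillis: Long, nanos: Int): Long = {
    val adjusted =
      if (timeMillis < 0 && nanos != 0) timeMillis - 1000L else timeMillis
    java.lang.Math.floorDiv(adjusted, 1000L) * 1000000L + nanos / 1000
  }

  def main(args: Array[String]): Unit = {
    // Two seconds before the epoch, plus 789 microseconds.
    val ts = new Timestamp(-2000L)
    ts.setNanos(789000)

    val correct = toMicros(ts.getTime, ts.getNanos)
    val shifted = toMicrosWithExtraAdjustment(ts.getTime, ts.getNanos)

    println(s"correct micros: $correct")             // -1999211
    println(s"shifted micros: $shifted")             // -2999211
    println(s"difference:     ${correct - shifted}") // 1000000, i.e. one second
  }
}
```

      The fractional part (000789) survives in both variants and only the seconds field moves, which matches the shape of the symptom in the session above.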
      

      This issue updates Apache Spark to use ORC 1.4.4.
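
      Since the description states that the `hive` ORC reader and the ORC MR reader are not affected, a session on a release without the upgrade could in principle avoid the vectorized path. The following is a minimal sketch for `spark-shell`, assuming that disabling `spark.sql.orc.enableVectorizedReader` makes the `native` implementation fall back to the ORC MR reader, and that the build includes Hive support for the second option. It reuses the `/tmp/orc` path from the session above.

```scala
// Run inside spark-shell, where `spark` is the predefined SparkSession.

// Option 1 (assumption: this routes reads through the non-vectorized path):
// keep the native ORC implementation but turn off its vectorized reader.
spark.conf.set("spark.sql.orc.impl", "native")
spark.conf.set("spark.sql.orc.enableVectorizedReader", "false")
spark.read.orc("/tmp/orc").show(false)

// Option 2: select the Hive-based ORC implementation instead.
spark.conf.set("spark.sql.orc.impl", "hive")
spark.read.orc("/tmp/orc").show(false)
```

      Either setting trades the performance of the vectorized reader for the unaffected code path, so it is a stopgap rather than a substitute for the upgrade.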

      FULL LIST

      ID TITLE
      ORC-281 Fix compiler warnings from clang 5.0
      ORC-301 `extractFileTail` should open a file in `try` statement
      ORC-304 Fix TestRecordReaderImpl to not fail with new storage-api
      ORC-306 Fix incorrect workaround for bug in java.sql.Timestamp
      ORC-324 Add support for ARM and PPC arch
      ORC-330 Remove unnecessary Hive artifacts from root pom
      ORC-332 Add syntax version to orc_proto.proto
      ORC-336 Remove avro and parquet dependency management entries
      ORC-360 Implement error checking on subtype fields in Java

            People

              Assignee: Dongjoon Hyun
              Reporter: Dongjoon Hyun