Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-24576

Upgrade Apache ORC to 1.5.2

Attach filesAttach ScreenshotVotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 2.4.0
    • 2.4.0
    • Build
    • None

    Description

      This issue aims to upgrade Apache ORC library from 1.4.4 to 1.5.1 in order to bring the following benefits into Apache Spark.

      • ORC-91 Support for variable length blocks in HDFS (The current space wasted in ORC to padding is known to be 5%.)
      • ORC-344 Support for using Decimal64ColumnVector

      In addition to that, Apache Hive 3.1.0 and 3.2.0 will use ORC 1.5.1 (HIVE-19669) and 1.5.2 (HIVE-19792) respectively. This will improve the compatibility between Apache Spark and Apache Hive.

      Attachments

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            dongjoon Dongjoon Hyun
            dongjoon Dongjoon Hyun
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack

                Issue deployment