SPARK-21786

The 'spark.sql.parquet.compression.codec' configuration doesn't take effect on tables with partition field(s)

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.3.0
    • Component/s: SQL
    • Labels:
      None

      Description

      Since Hive 1.1, Hive has allowed users to set the Parquet compression codec via the table-level property parquet.compression. See the JIRA: https://issues.apache.org/jira/browse/HIVE-7858 . We already support orc.compression for ORC, so for external users it is more straightforward if we support both. See the Stack Overflow question: https://stackoverflow.com/questions/36941122/spark-sql-ignores-parquet-compression-propertie-specified-in-tblproperties
      On the Spark side, our table-level compression option, compression, was added by #11464 in Spark 2.0.
      We need to support both table-level options. Users might also use the session-level conf spark.sql.parquet.compression.codec. The priority rule is as follows:
      If compression codec settings are also found through Hive or Parquet table properties, the precedence is compression, then parquet.compression, then spark.sql.parquet.compression.codec. Acceptable values include: none, uncompressed, snappy, gzip, lzo.
      After the change, the rule for Parquet is consistent with the rule for ORC.
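
      The precedence rule above can be illustrated with a small standalone sketch. This is not Spark's actual code: the function name resolve_parquet_codec and the plain-dict representation of table options are hypothetical, and lowercasing the result is an assumption made for readability.

```python
from typing import Mapping


def resolve_parquet_codec(table_options: Mapping[str, str], session_codec: str) -> str:
    """Hypothetical helper: pick a codec name using the precedence
    compression > parquet.compression > spark.sql.parquet.compression.codec."""
    # Table-level options are checked in priority order.
    for key in ("compression", "parquet.compression"):
        value = table_options.get(key)
        if value is not None:
            return value.lower()  # lowercasing is an assumption of this sketch
    # Fall back to the session-level setting.
    return session_codec.lower()


# Only parquet.compression is set, so it overrides the session value.
print(resolve_parquet_codec({"parquet.compression": "GZIP"}, "snappy"))  # gzip
# compression takes priority over parquet.compression.
print(resolve_parquet_codec({"compression": "none", "parquet.compression": "gzip"}, "snappy"))  # none
# No table-level option: the session-level value is used.
print(resolve_parquet_codec({}, "snappy"))  # snappy
```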

      Changes:
      1. Also read 'compressionCodecClassName' from parquet.compression. The precedence order is compression, parquet.compression, spark.sql.parquet.compression.codec, just like what we do in OrcOptions.

      2. Change spark.sql.parquet.compression.codec to support "none". ParquetOptions already treats "none" as equivalent to "uncompressed", but the session-level conf could not be set to "none".
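
      As a rough illustration of the second change, the sketch below validates a codec name against the acceptable values listed in the description and treats "none" as an alias for "uncompressed". The names ACCEPTED_CODECS and normalize_codec are hypothetical, and the case-insensitive comparison is an assumption; this is not the ParquetOptions implementation.

```python
# Acceptable values as listed in the issue description.
ACCEPTED_CODECS = {"none", "uncompressed", "snappy", "gzip", "lzo"}


def normalize_codec(name: str) -> str:
    """Hypothetical helper: validate a codec name and map "none" to "uncompressed"."""
    codec = name.lower()  # case-insensitive matching is an assumption of this sketch
    if codec not in ACCEPTED_CODECS:
        raise ValueError(
            f"Codec [{name}] is not available. "
            f"Available codecs are {', '.join(sorted(ACCEPTED_CODECS))}."
        )
    # "none" and "uncompressed" mean the same thing.
    return "uncompressed" if codec == "none" else codec


print(normalize_codec("none"))          # uncompressed
print(normalize_codec("uncompressed"))  # uncompressed
print(normalize_codec("Snappy"))        # snappy
```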

            People

            • Assignee:
              Jinhua Fu
              Reporter:
              Jinhua Fu
            • Votes:
              0
              Watchers:
              5