  1. Spark
  2. SPARK-35196

DataFrameWriter.text support zstd compression

Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.1.1
    • Fix Version/s: None
    • Component/s: PySpark
    • Labels: None

    Description

      The documentation at http://spark.apache.org/docs/3.1.1/api/python/reference/api/pyspark.sql.DataFrameWriter.text.html states that only the following compression codecs are supported: `none`, `bzip2`, `gzip`, `lz4`, `snappy` and `deflate`.

      However, the RDD API supports zstd compression if users pass 'org.apache.hadoop.io.compress.ZStandardCodec' as the compression codec class to the saveAsTextFile method.
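
      For illustration, the RDD route described above could be wrapped as in the sketch below. `save_text_zstd` is a hypothetical helper name, not part of PySpark; the sketch assumes a DataFrame with a single string column (the same shape `DataFrameWriter.text` expects) and relies only on `RDD.saveAsTextFile(path, compressionCodecClass=...)`.

```python
# Hadoop codec class named in this issue.
ZSTD_CODEC = "org.apache.hadoop.io.compress.ZStandardCodec"


def save_text_zstd(df, path):
    """Write a single-column DataFrame as zstd-compressed text via the RDD API.

    Hypothetical helper for illustration only; it is not a PySpark function.
    """
    # Mirror DataFrameWriter.text, which expects exactly one column.
    if len(df.columns) != 1:
        raise ValueError("expected exactly one column, got %d" % len(df.columns))
    # Unwrap each Row to its single value so plain lines are written.
    lines = df.rdd.map(lambda row: row[0])
    # Pass the codec class name through to Hadoop's text output format.
    lines.saveAsTextFile(path, compressionCodecClass=ZSTD_CODEC)
```

      A call might look like `save_text_zstd(spark.read.text("input.txt"), "output_zstd")`, where both paths are placeholders. Whether the write succeeds presumably depends on the Hadoop build on the cluster providing zstd support.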

      Please also expose zstd as a compression option in DataFrameWriter.text.

          People

            Assignee: Unassigned
            Reporter: Leonard Lausen (lausen)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated: