[SPARK-35196] DataFrameWriter.text support zstd compression - ASF JIRA

Details

Type: Task
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 3.1.1
Fix Version/s: None
Component/s: PySpark
Labels: None

Description

The documentation for DataFrameWriter.text (http://spark.apache.org/docs/3.1.1/api/python/reference/api/pyspark.sql.DataFrameWriter.text.html) states that only the following compression codecs are supported: `none`, `bzip2`, `gzip`, `lz4`, `snappy` and `deflate`.

However, the RDD API supports zstd compression if users pass the 'org.apache.hadoop.io.compress.ZStandardCodec' codec class to the saveAsTextFile method.

Please also expose zstd in DataFrameWriter.text.
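
To make the gap concrete, here is a minimal PySpark sketch of the two code paths described above. The helper names (`save_text_zstd_via_rdd`, `save_text_via_writer`) and the single string column named `value` are assumptions for illustration, not part of Spark. The RDD path also assumes a Hadoop build in which the zstd codec is actually available.

```python
# Codec class name quoted in this issue; RDD.saveAsTextFile accepts it
# through its compressionCodecClass parameter.
ZSTD_CODEC = "org.apache.hadoop.io.compress.ZStandardCodec"

# Short codec names listed in the Spark 3.1.1 DataFrameWriter.text docs.
DOCUMENTED_TEXT_CODECS = ("none", "bzip2", "gzip", "lz4", "snappy", "deflate")


def save_text_zstd_via_rdd(df, path, column="value"):
    """Workaround: drop to the RDD API and pass the codec class name.

    `column` is a hypothetical default; it names the string column to write.
    """
    lines = df.rdd.map(lambda row: row[column])
    lines.saveAsTextFile(path, compressionCodecClass=ZSTD_CODEC)


def save_text_via_writer(df, path, compression="gzip"):
    """DataFrameWriter path, using one of the documented short codec names."""
    df.write.text(path, compression=compression)
```

With a real session, `save_text_zstd_via_rdd(spark.createDataFrame([("a",), ("b",)], ["value"]), "/tmp/out_zstd")` would exercise the workaround. The request in this issue is for the second helper to accept zstd as well.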

People

Assignee: Unassigned

Reporter: Leonard Lausen

Votes: 0

Watchers: 2

Dates

Created: 22/Apr/21 19:57

Updated: 12/Dec/22 18:11