There are 4 different compression codecs available for ParquetOutputFormat.
In Spark SQL the codec was set as a hard-coded value in
so we need to add a new config property in SQLConf to allow users to change the compression codec. I used a short-name syntax similar to the one described in
BTW, which codec should we use as the default? It was set to GZIP (https://github.com/apache/spark/pull/195/files#diff-4), but I think we should change it to SNAPPY, since SNAPPY is already the default codec for shuffling in spark-core (SPARK-2469), and parquet-mr supports the Snappy codec natively.
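
To make the proposal concrete, here is a minimal, language-neutral sketch of the short-name lookup. It is not the code in this patch. The property key `CODEC_KEY`, the function `resolve_parquet_codec`, and the plain dict standing in for SQLConf are all hypothetical names used for illustration, and the four short names assume the codecs in question are the ones parquet-mr lists in `CompressionCodecName` (UNCOMPRESSED, SNAPPY, GZIP, LZO).

```python
# Hypothetical property key; the real name is whatever this patch adds to SQLConf.
CODEC_KEY = "spark.sql.parquet.compression.codec"

# Assumed short names, mapped to parquet-mr CompressionCodecName constants.
SHORT_NAMES = {
    "uncompressed": "UNCOMPRESSED",
    "snappy": "SNAPPY",
    "gzip": "GZIP",
    "lzo": "LZO",
}

# GZIP mirrors the value that is hard-coded today; switching the default
# to Snappy would be a one-line change here.
DEFAULT_SHORT_NAME = "gzip"


def resolve_parquet_codec(conf):
    """Return the Parquet codec name selected by `conf` (a plain dict here)."""
    short_name = conf.get(CODEC_KEY, DEFAULT_SHORT_NAME).strip().lower()
    if short_name not in SHORT_NAMES:
        valid = ", ".join(sorted(SHORT_NAMES))
        raise ValueError(
            "Unknown Parquet codec %r; expected one of: %s" % (short_name, valid)
        )
    return SHORT_NAMES[short_name]
```

With this shape, an unset property falls back to the default, matching is case-insensitive, and an unrecognized name fails loudly instead of silently writing with a different codec.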