Details
- Type: Umbrella
- Status: Resolved
- Priority: Blocker
- Resolution: Done
- 3.2.0
- None
- None
Description
Several parts of the pandas API on Spark need improvement; this umbrella issue tracks them.
Issue Links
- is part of: SPARK-34849 SPIP: Support pandas API layer on PySpark (Resolved)
| # | Sub-task | Status | Assignee |
| --- | --- | --- | --- |
| 1 | Mapping the `mode` argument to pandas in DataFrame.to_csv | Resolved | Haejoon Lee |
| 2 | Deprecate the `num_files` argument | Resolved | Haejoon Lee |
| 3 | Always enable the `pandas_metadata` in DataFrame.parquet | Resolved | Unassigned |
| 4 | Add `index_col` argument for ps.sql. | Resolved | Haejoon Lee |
| 5 | Deprecate ps.broadcast API | Resolved | Haejoon Lee |
| 6 | Deprecate DataFrame.to_spark_io | Resolved | Kevin Su |
| 7 | Throw an error if `version` and `timestamp` are used together in DataFrame.to_delta. | Resolved | Yikun Jiang |
| 8 | Remove some APIs from documentation. | Resolved | Haejoon Lee |
| 9 | Install mlflow/sklearn in Github Actions CI | Resolved | Haejoon Lee |
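
As a rough illustration of the kind of check sub-task 7 describes, the sketch below rejects a call that sets both `version` and `timestamp`. It is plain Python and is not the actual pandas-on-Spark change: the helper name `check_time_travel_options`, the use of `ValueError`, and the error message are all assumptions, since the sub-task title only says that "an error" should be thrown.

```python
from typing import Optional


def check_time_travel_options(
    version: Optional[str] = None,
    timestamp: Optional[str] = None,
) -> None:
    """Hypothetical helper: reject calls that set both options."""
    # Assumption: ValueError is used here; the sub-task does not name the error type.
    if version is not None and timestamp is not None:
        raise ValueError("version and timestamp cannot be used together.")


# Setting only one of the two options passes the check.
check_time_travel_options(version="0")

# Setting both raises the error, which is captured here for inspection.
try:
    check_time_travel_options(version="0", timestamp="2021-01-01")
except ValueError as exc:
    message = str(exc)
```

Checking `is not None` rather than truthiness means a value such as `"0"` or an empty string still counts as "set", which seems the safer reading for mutually exclusive options.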