Details
Umbrella
Status: Resolved
Major
Resolution: Resolved
3.4.0
None
None
Description
pandas API on Spark aims to make pandas code work on Spark clusters without any changes, so full API coverage has been one of our major goals. Currently, most pandas functions are implemented, but some of them have incomplete parameter support.
Some common parameters were missing (now resolved). They fall into these categories:
* How to handle NAs
* Filter data types
* Filter data types
* Control result length
* Reindex result
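
To make the four categories above concrete, here is a minimal sketch using plain pandas. The specific parameters shown (`skipna`, `numeric_only`, `keep`, `ignore_index`) are assumptions chosen as typical examples of each category, not a list taken from this ticket. The goal stated above implies the same calls should behave the same way when the frame is created through `pyspark.pandas` instead.

```python
import pandas as pd

# Small frame with a missing value and a non-numeric column.
df = pd.DataFrame({"a": [1.0, None, 3.0], "b": ["x", "y", "x"]})

# How to handle NAs: `skipna` decides whether missing values are ignored.
total_skip = df["a"].sum(skipna=True)    # NaN is ignored
total_keep = df["a"].sum(skipna=False)   # NaN propagates into the result

# Filter data types: `numeric_only` restricts the reduction to numeric columns.
means = df.mean(numeric_only=True)       # only column "a" is aggregated

# Control result length: `keep` changes how many duplicate rows survive.
first_kept = df["b"].drop_duplicates(keep="first")   # one row per distinct value
none_kept = df["b"].drop_duplicates(keep=False)      # every duplicated value dropped

# Reindex result: `ignore_index` relabels the output as 0..n-1.
sorted_df = df.sort_values("a", ascending=False, ignore_index=True)
```

If such a parameter is not supported by the pandas-on-Spark counterpart of a function, the same line would have to be rewritten or worked around, which is what this umbrella aims to avoid.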
Some missing parameters still remain to be implemented (see the doc below).
See the design and the current status at https://docs.google.com/document/d/1H6RXL6oc-v8qLJbwKl6OEqBjRuMZaXcTYmrZb9yNm5I/edit?usp=sharing.