  1. Spark
  2. SPARK-42883

Implement Pandas API Missing Parameters


Details

    • Type: Umbrella
    • Status: Resolved
    • Priority: Major
    • Resolution: Resolved
    • Affects Version/s: 3.4.0
    • Fix Version/s: None
    • Component/s: Pandas API on Spark
    • Labels: None

    Description

      pandas API on Spark aims to make pandas code work on Spark clusters without any changes, so full API coverage has been one of our major goals. Currently, most pandas functions are implemented, but some of them have incomplete parameter support.

      Some common parameters were missing (resolved):
       * How to handle NAs
       * Filter data types
       * Control result length
       * Reindex result
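
      As a rough illustration of the four categories above, here is a minimal sketch in plain pandas. The issue does not name the individual parameters, so the ones picked here (`skipna`, `numeric_only`, `keep`, `ignore_index`) are assumptions chosen as representative examples. Whether each is supported by pandas API on Spark in a given release is not stated here.

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, None, 3.0], "b": ["x", "y", "z"], "c": [3, 1, 2]})

# How to handle NAs: skipna decides whether missing values are ignored.
total_skip = df["a"].sum(skipna=True)   # NaN is ignored
total_keep = df["a"].sum(skipna=False)  # NaN propagates into the result

# Filter data types: numeric_only restricts the reduction to numeric columns,
# so the string column "b" is left out.
means = df.mean(numeric_only=True)

# Control result length: keep="all" retains every tie, so the result can be
# longer than the requested n.
s = pd.Series([3, 3, 2, 3])
top_first = s.nlargest(1)
top_all = s.nlargest(1, keep="all")

# Reindex result: ignore_index=True relabels the result 0..n-1 instead of
# carrying over the original row labels.
sorted_kept = df.sort_values("c")
sorted_reset = df.sort_values("c", ignore_index=True)
```

      Since pandas API on Spark is meant to run pandas code unchanged, the same calls would be written against `pyspark.pandas` once the corresponding parameters are implemented there.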

      Some missing parameters remain to be implemented (see the doc linked below).

      See the design and the current status at https://docs.google.com/document/d/1H6RXL6oc-v8qLJbwKl6OEqBjRuMZaXcTYmrZb9yNm5I/edit?usp=sharing.


          People

            Assignee: Xinrong Meng (XinrongM)
            Reporter: Xinrong Meng (XinrongM)
