SPARK-47997

Pandas-on-Spark incompletely implements DataFrame.drop


Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.4.3
    • Fix Version/s: None
    • Component/s: Pandas API on Spark
    • Labels: None

    Description

      In pandas v1.0+, `DataFrame.drop` supports the `errors` keyword argument:

      https://pandas.pydata.org/pandas-docs/version/1.0/reference/api/pandas.DataFrame.drop.html

       

      Pandas-on-Spark does not implement it. The omission is especially glaring because PySpark's own `drop` is a no-op on absent columns, which already matches `errors='ignore'`; extra work therefore had to be done to implement the raise behaviour, while the simpler ignore behaviour is not available.
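      Until the kwarg is supported, a workaround along these lines may help. This is only a minimal sketch: `drop_columns` is a hypothetical helper name, not part of pandas or PySpark, and it assumes nothing more than a frame exposing `.columns` and `.drop(columns=[...])`.

```python
from typing import Any, Iterable


def drop_columns(df: Any, labels: Any, errors: str = "raise") -> Any:
    """Hypothetical helper emulating the pandas `errors` kwarg of `drop`.

    Assumes `df` exposes `.columns` (iterable of hashable labels) and
    `.drop(columns=[...])`; nothing else about the frame type is assumed.
    """
    if errors not in ("raise", "ignore"):
        raise ValueError(f"errors must be 'raise' or 'ignore', got {errors!r}")

    # Wrap a single label (a string, a tuple, or any non-iterable) in a list.
    if isinstance(labels, (str, tuple)) or not isinstance(labels, Iterable):
        labels = [labels]
    else:
        labels = list(labels)

    existing = set(df.columns)
    missing = [label for label in labels if label not in existing]
    if missing and errors == "raise":
        raise KeyError(f"{missing} not found in columns")

    present = [label for label in labels if label in existing]
    if not present:
        # Nothing to drop: return the frame as-is rather than calling
        # drop() with an empty list (note: this is not a copy).
        return df
    return df.drop(columns=present)
```

      With a pandas-on-Spark frame `psdf`, the intended call would be `drop_columns(psdf, ["a", "not_there"], errors="ignore")`, which should drop `a` and skip the absent label.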

          People

            Assignee: Unassigned
            Reporter: Philip Kahn (tigerhawkvok)