SPARK-42367 (sub-task of SPARK-41279, Feature parity: DataFrame API in Spark Connect)

DataFrame.drop should handle duplicated columns properly

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.4.0
    • Component/s: Connect, PySpark
    • Labels: None

    Description

      >>> df.join(df2, df.name == df2.name, 'inner').show()
      +---+----+------+----+
      |age|name|height|name|
      +---+----+------+----+
      | 16| Bob|    85| Bob|
      | 14| Tom|    80| Tom|
      +---+----+------+----+
      
      >>> df.join(df2, df.name == df2.name, 'inner').drop('name').show()
      +---+------+
      |age|height|
      +---+------+
      | 16|    85|
      | 14|    80|
      +---+------+
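
The expected behavior shown above can be modeled without a Spark session. What follows is a minimal pure-Python sketch, not Spark's implementation: `drop_by_name` is a hypothetical helper that treats a table as a list of column names plus row tuples and removes every column whose name matches, so both `name` columns produced by the join are dropped together.

```python
from typing import List, Sequence, Tuple

Row = Tuple[object, ...]


def drop_by_name(
    columns: Sequence[str], rows: Sequence[Row], name: str
) -> Tuple[List[str], List[Row]]:
    """Hypothetical helper: remove every column called `name`, duplicates included."""
    # Positions to keep: every column whose name differs from the dropped one.
    keep = [i for i, col in enumerate(columns) if col != name]
    new_columns = [columns[i] for i in keep]
    new_rows = [tuple(row[i] for i in keep) for row in rows]
    return new_columns, new_rows


# Joined result from the description: two columns are both named "name".
joined_columns = ["age", "name", "height", "name"]
joined_rows = [(16, "Bob", 85, "Bob"), (14, "Tom", 80, "Tom")]

result_columns, result_rows = drop_by_name(joined_columns, joined_rows, "name")
print(result_columns)  # ['age', 'height']
print(result_rows)     # [(16, 85), (14, 80)]
```

In this model a name that matches no column leaves the table unchanged, which is an assumption of the sketch rather than something the issue states.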
      
      

          People

            Assignee: Ruifeng Zheng (podongfeng)
            Reporter: Ruifeng Zheng (podongfeng)