SPARK-8038

PySpark SQL when function is broken on Column


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.4.0
    • Fix Version/s: 1.4.0
    • Component/s: PySpark, SQL
    • Labels:
      None
    • Environment:

      Spark 1.4.0 RC3

      Description

      In [1]: df = sqlCtx.createDataFrame([(1, "1"), (2, "2"), (1, "2"), (1, "2")], ["key", "value"])
      
      
      In [2]: from pyspark.sql import functions as F
      
      In [8]: df.select(df.key, F.when(df.key > 1, 0).when(df.key == 0, 2).otherwise(1)).show()
      
      +---+---------------------------------+
      |key|CASE WHEN (key = 0) THEN 2 ELSE 1|
      +---+---------------------------------+
      |  1|                                1|
      |  2|                                1|
      |  1|                                1|
      |  1|                                1|
      +---+---------------------------------+
      

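      The first condition (key > 1) is missing from the generated expression, so the row with key 2 gets 1 instead of 0. In Scala, the same kind of chained call gives the expected expression and behaviour: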

      scala> val df = sqlContext.createDataFrame(List((1, "1"), (2, "2"), (1, "2"), (1, "2"))).toDF("key", "value")
      
      scala> import org.apache.spark.sql.functions._
      
      scala> df.select(df("key"), when(df("key") > 1, 0).when(df("key") === 2, 2).otherwise(1)).show()
      
      +---+-------------------------------------------------------+
      |key|CASE WHEN (key > 1) THEN 0 WHEN (key = 2) THEN 2 ELSE 1|
      +---+-------------------------------------------------------+
      |  1|                                                      1|
      |  2|                                                      0|
      |  1|                                                      1|
      |  1|                                                      1|
      +---+-------------------------------------------------------+
      

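      The problem comes from the "column.py" file: the *when* method defined on the Column class appears to discard the conditions already chained on the column instead of appending to them, so only the last WHEN branch survives. A fix is coming.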
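
To make the suspected failure mode concrete, here is a minimal pure-Python sketch. It is not the PySpark source: `CaseExpr`, `when_buggy`, `when_fixed` and the module-level `when` are hypothetical names invented for illustration, and the model only assumes what the output above shows, namely that a chained `when` starts a fresh CASE expression instead of extending the existing one.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

Predicate = Callable[[int], bool]
Branch = Tuple[str, Predicate, int]  # (SQL text of the condition, predicate, result)


@dataclass(frozen=True)
class CaseExpr:
    """Hypothetical stand-in for a CASE WHEN column expression."""
    branches: Tuple[Branch, ...] = ()
    default: Optional[int] = None

    def when_buggy(self, label: str, pred: Predicate, value: int) -> "CaseExpr":
        # Assumed buggy behaviour: ignore self and start a new CASE expression,
        # which silently drops every branch chained so far.
        return CaseExpr(((label, pred, value),))

    def when_fixed(self, label: str, pred: Predicate, value: int) -> "CaseExpr":
        # Expected behaviour: append the new branch to the existing ones.
        return CaseExpr(self.branches + ((label, pred, value),))

    def otherwise(self, value: int) -> "CaseExpr":
        return CaseExpr(self.branches, value)

    def sql(self) -> str:
        parts = ["CASE"]
        for label, _, value in self.branches:
            parts.append(f"WHEN ({label}) THEN {value}")
        if self.default is not None:
            parts.append(f"ELSE {self.default}")
        return " ".join(parts)

    def evaluate(self, key: int) -> Optional[int]:
        # The first matching branch wins, as in SQL CASE WHEN.
        for _, pred, value in self.branches:
            if pred(key):
                return value
        return self.default


def when(label: str, pred: Predicate, value: int) -> CaseExpr:
    """Hypothetical analogue of the module-level when() that starts a chain."""
    return CaseExpr(((label, pred, value),))


keys = [1, 2, 1, 1]  # the "key" column from the report

buggy = (when("key > 1", lambda k: k > 1, 0)
         .when_buggy("key = 0", lambda k: k == 0, 2)
         .otherwise(1))
fixed = (when("key > 1", lambda k: k > 1, 0)
         .when_fixed("key = 0", lambda k: k == 0, 2)
         .otherwise(1))

print(buggy.sql(), [buggy.evaluate(k) for k in keys])
print(fixed.sql(), [fixed.evaluate(k) for k in keys])
```

Under these assumptions, the buggy chain renders the same expression as the PySpark output in the report and maps every key to 1, while the fixed chain keeps both branches and maps key 2 to 0.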

    People

    • Assignee: Olivier Girardot (ogirardot)
    • Reporter: Olivier Girardot (ogirardot)
