Spark / SPARK-28048

pyspark.sql.functions.explode drops a row whose list column is empty when applied to that column

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 2.1.1
    • Fix Version/s: None
    • Component/s: PySpark
    • Labels: None

    Description

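      from pyspark.sql import Row, SparkSession
      from pyspark.sql.functions import explode
      spark = SparkSession.builder.getOrCreate()  # reuses the shell's session if one exists
      eDF = spark.createDataFrame([
          Row(a=1, intlist=[1, 2], mapfield={"a": "b"}),
          Row(a=2, intlist=[], mapfield={"a": "b"}),
      ])
      # explode emits one row per list element, so a=2 (empty list) produces none
      rows = eDF.withColumn('another', explode(eDF.intlist)).collect()
      rows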
      

      The row with `a=2`, whose `intlist` is empty, is missing from the output.
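
      For illustration, the difference between dropping and keeping rows with an empty list can be modelled in plain Python. This is only a sketch of the row semantics, not Spark's implementation; the helper names `explode_rows` and `explode_outer_rows` are hypothetical and exist only for this example.

```python
# Plain-Python model of the row semantics; this is not Spark's implementation.
rows = [
    {"a": 1, "intlist": [1, 2]},
    {"a": 2, "intlist": []},
]


def explode_rows(rows, column, new_column):
    """Model of explode: one output row per element, so an empty list yields no rows."""
    return [
        {**row, new_column: item}
        for row in rows
        for item in row[column]
    ]


def explode_outer_rows(rows, column, new_column):
    """Model of an outer explode: an empty (or None) list yields one row holding None."""
    out = []
    for row in rows:
        items = row[column] or [None]
        out.extend({**row, new_column: item} for item in items)
    return out


inner = explode_rows(rows, "intlist", "another")
outer = explode_outer_rows(rows, "intlist", "another")
print([r["a"] for r in inner])  # the a=2 row is absent
print([r["a"] for r in outer])  # the a=2 row is kept, with another=None
```

      Assuming Spark 2.3 or later (not the 2.1.1 release this report was filed against), PySpark should offer the second behaviour as `pyspark.sql.functions.explode_outer`, e.g. `eDF.withColumn('another', explode_outer(eDF.intlist))`.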

          People

            Assignee: Unassigned
            Reporter: Ma Xinmin (MaxInMin)