Spark / SPARK-23754

StopIteration exception in Python UDF results in partial result

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 2.3.1, 2.4.0
    • Component/s: PySpark
    • Labels: None

    Description

      Reproduce:

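      from pyspark.sql.functions import udf

      # Assumes an active SparkSession named `spark` (e.g. the pyspark shell)
      df = spark.range(0, 1000)

      def foo(x):
          # Raising StopIteration inside the UDF triggers the bug
          raise StopIteration()

      # Apply the UDF to the id column
      df.withColumn('v', udf(foo)(df['id'])).show()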
      
      # Results
      # +---+---+
      # | id|  v|
      # +---+---+
      # +---+---+

      I think the task should fail in this case instead of silently returning a partial (here, empty) result.
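
A plain-Python sketch of one likely mechanism, offered as an assumption rather than a confirmed diagnosis: when a function applied through an iterator pipeline raises StopIteration, the consumer reads it as normal end of input and silently truncates the output. The names `foo` (a variant that fails partway through) and `fail_on_stop_iteration` are hypothetical and are not taken from the Spark code base.

```python
def foo(x):
    # Hypothetical stand-in for the UDF: fails partway through the input.
    if x == 2:
        raise StopIteration()
    return x * 10


def fail_on_stop_iteration(f):
    """Hypothetical wrapper: re-raise StopIteration from f as RuntimeError."""
    def wrapper(*args, **kwargs):
        try:
            return f(*args, **kwargs)
        except StopIteration as exc:
            raise RuntimeError("StopIteration raised inside user function") from exc
    return wrapper


# map() lets the StopIteration escape, and list() treats it as end of input,
# so the remaining elements are dropped without any error.
partial = list(map(foo, range(5)))
print(partial)  # [0, 10]

# With the wrapper, the same input fails loudly instead of being truncated.
try:
    list(map(fail_on_stop_iteration(foo), range(5)))
    failed = False
except RuntimeError:
    failed = True
print(failed)  # True
```

Wrapping user functions this way would turn the silent truncation into a task failure, which is the behaviour requested above.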

          People

            Assignee: Emilio Dorigatti (edorigatti)
            Reporter: Li Jin (icexelloss)
            Votes: 0
            Watchers: 4
