SPARK-21022

RDD.foreach swallows exceptions


Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Invalid
    • Affects Version/s: 2.1.1
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels: None

    Description

      An `RDD.foreach` or `RDD.foreachPartition` call appears to swallow exceptions thrown inside its closure, but not exceptions thrown earlier in the call chain (for example, inside a preceding `map`). An example:

       package examples
      
       import org.apache.spark._
      
       object Shpark {
         def main(args: Array[String]): Unit = {
           implicit val sc: SparkContext = new SparkContext(
             new SparkConf().setMaster("local[*]").setAppName("blahfoobar")
           )
      
           /* DOESN'T THROW
           sc.parallelize(0 until 10000000)
             .foreachPartition { _.map { i =>
               println("BEFORE THROW")
               throw new Exception("Testing exception handling")
               println(i)
             }}
            */

           /* DOESN'T THROW, nor does anything print.
            * Commenting out the exception runs the prints.
            * (i.e. `foreach` is sufficient to "run" an RDD)
           sc.parallelize(0 until 100000)
             .foreach({ i =>
               println("BEFORE THROW")
               throw new Exception("Testing exception handling")
               println(i)
             })
            */
      
           /* Throws! */
           sc.parallelize(0 until 100000)
             .map({ i =>
               println("BEFORE THROW")
               throw new Exception("Testing exception handling")
               i
             })
             .foreach(i => println(i))
      
           println("JOB DONE!")
      
           System.in.read
      
           sc.stop()
         }
       }
      

      When an exception is swallowed, the job does not appear to fail and the driver exits normally. When one is thrown, as in the last example, it propagates to the driver and can be caught there with try/catch.
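
      The driver-side try/catch mentioned above can be sketched as follows. This is a minimal sketch, not code from the report: the object name `CatchOnDriver`, the helper `jobFailsOnDriver`, and the app name are hypothetical, and it assumes that a failed task surfaces on the driver as an `org.apache.spark.SparkException`, as it does for the `map` case described here.

```scala
import org.apache.spark.{SparkConf, SparkContext, SparkException}

object CatchOnDriver {
  // Hypothetical helper: runs the failing map-then-foreach job and
  // reports whether the failure was observed on the driver.
  def jobFailsOnDriver(sc: SparkContext): Boolean =
    try {
      sc.parallelize(0 until 100)
        .map { i =>
          // Always true; the guard only keeps the closure's result type Int.
          if (i >= 0) throw new Exception("Testing exception handling")
          i
        }
        .foreach(i => println(i))
      false // reached only if the job completed without error
    } catch {
      // Assumption: the task failure reaches the driver wrapped in a SparkException.
      case e: SparkException =>
        println(s"Job failed: ${e.getMessage}")
        true
    }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[*]").setAppName("catch-on-driver")
    )
    try println(s"caught on driver: ${jobFailsOnDriver(sc)}")
    finally sc.stop() // always release the context, even if the job fails
  }
}
```

      Wrapping the action (here `foreach`) rather than the transformation matters, because the `map` closure only runs once an action triggers the job.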

      The expected behaviour is for an exception thrown inside `foreach` or `foreachPartition` to propagate to the driver and fail the job, as it does when thrown inside `map`.
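
      One detail worth checking in the first commented-out snippet: it calls `_.map` on the partition iterator, and Scala's `Iterator.map` is lazy, so its closure does not run until the resulting iterator is consumed. The sketch below shows that laziness in plain Scala, without Spark. It is offered as a possible explanation for that snippet only; the report does not confirm it, and it does not by itself account for the plain `foreach` case. The object and method names are hypothetical.

```scala
object LazyIteratorDemo {
  // Builds a mapped iterator but never consumes it.
  // Returns how many times the closure body actually ran.
  def mapWithoutConsuming(): Int = {
    var calls = 0
    val mapped: Iterator[Int] = Iterator(1, 2, 3).map { i =>
      calls += 1
      if (i > 0) throw new Exception("Testing exception handling")
      i
    }
    calls // `mapped` was never iterated, so the closure never ran
  }

  // Same closure, but the mapped iterator is consumed with foreach.
  // Returns true if the exception surfaced.
  def mapThenConsume(): Boolean =
    try {
      Iterator(1, 2, 3).map { i =>
        if (i > 0) throw new Exception("Testing exception handling")
        i
      }.foreach(_ => ())
      false
    } catch {
      case _: Exception => true
    }

  def main(args: Array[String]): Unit = {
    println(s"closure calls without consuming: ${mapWithoutConsuming()}") // prints 0
    println(s"exception seen when consumed: ${mapThenConsume()}")         // prints true
  }
}
```

      If this is what happens in the first snippet, replacing the inner `_.map` with `_.foreach` would force the closure to run on each element.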

          People

            Assignee: zsxwing Shixiong Zhu
            Reporter: fosskers Colin Woodbury
