Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-13601

Invoke task failure callbacks before calling outputstream.close()

    XMLWordPrintableJSON

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.6.2, 2.0.0
    • Component/s: Spark Core
    • Labels:
      None

      Description

      We need to submit another PR against Spark to call the task failure callbacks before Spark calls the close function on various output streams.
      For example, we need to intercept an exception and call TaskContext.markTaskFailed before calling close in the following code (in PairRDDFunctions.scala):

            Utils.tryWithSafeFinally {
              while (iter.hasNext) {
                val record = iter.next()
                writer.write(record._1.asInstanceOf[AnyRef], record._2.asInstanceOf[AnyRef])
      
                // Update bytes written metric every few records
                maybeUpdateOutputMetrics(outputMetricsAndBytesWrittenCallback, recordsWritten)
                recordsWritten += 1
              }
            } {
              writer.close()
            }
      

      Changes to Spark should include unit tests to make sure this always work in the future.

        Attachments

          Activity

            People

            • Assignee:
              davies Davies Liu
              Reporter:
              davies Davies Liu
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: