Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-14468

Always enable OutputCommitCoordinator

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 1.4.2, 1.5.2, 1.6.2, 2.0.0
    • Spark Core
    • None

    Description

      The OutputCommitCoordinator was originally introduced in SPARK-4879 because speculation causes the output of some partitions to be deleted. However, as we can see in SPARK-10063, speculation is not the only case where this can happen.

      More specifically, when we retry a stage we're not guaranteed to kill the tasks that are still running (we don't even interrupt their threads), so we may end up with multiple concurrent task attempts for the same task. This leads to problems like SPARK-8029, but this fix alone is necessary but not sufficient.

      In general, when we run into situations like these, we need the OutputCommitCoordinator because we don't control what the underlying file system does. Enabling this doesn't induce heavy performance costs so there's little reason why we shouldn't always enable it to ensure correctness.

      Attachments

        Activity

          People

            andrewor14 Andrew Or
            andrewor14 Andrew Or
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: