  1. Spark
  2. SPARK-12755

Spark may attempt to rebuild application UI before finishing writing the event logs in possible race condition

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.5.2
    • Fix Version/s: 1.5.3, 1.6.1, 2.0.0
    • Component/s: Spark Core
    • Labels:
      None

      Description

      As reported in SPARK-6950, it appears that sometimes the standalone master attempts to build an application's historical UI before closing the app's event log. This is still an issue for us in 1.5.2+, and I believe I've found the underlying cause.

      When stopping a SparkContext, the stop method stops the DAG scheduler:

      https://github.com/apache/spark/blob/a76cf51ed91d99c88f301ec85f3cda1288bcf346/core/src/main/scala/org/apache/spark/SparkContext.scala#L1722-L1727

      and then stops the event logger:

      https://github.com/apache/spark/blob/a76cf51ed91d99c88f301ec85f3cda1288bcf346/core/src/main/scala/org/apache/spark/SparkContext.scala#L1734-L1736

      Though the chain of events is difficult to follow, one consequence of stopping the DAG scheduler is that the master's rebuildSparkUI method is called. This method looks for the application's event log, and its behavior depends on whether the log file still carries the .inprogress suffix. In particular, a warning is logged if the suffix is present:

      https://github.com/apache/spark/blob/a76cf51ed91d99c88f301ec85f3cda1288bcf346/core/src/main/scala/org/apache/spark/deploy/master/Master.scala#L935
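To illustrate the suffix-dependent behavior described above, here is a minimal Python sketch. It is not Spark's code: the helper name locate_event_log, the logger name, and the flat "one file per application" layout are assumptions made for illustration. Only the .inprogress suffix and the logged warning come from the report.

```python
import logging
import os
import tempfile

IN_PROGRESS = ".inprogress"  # suffix named in the report

logger = logging.getLogger("master_sketch")  # hypothetical logger name


def locate_event_log(log_dir, app_id):
    """Return (path, state) for an application's event log.

    Hypothetical helper, not Spark's API. state is one of
    "in_progress", "complete", or "missing".
    """
    final_path = os.path.join(log_dir, app_id)
    in_progress_path = final_path + IN_PROGRESS
    if os.path.exists(in_progress_path):
        # Mirrors the reported behavior: warn when the suffix is present.
        logger.warning("event log for %s is still in progress", app_id)
        return in_progress_path, "in_progress"
    if os.path.exists(final_path):
        return final_path, "complete"
    return None, "missing"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as log_dir:
        # Simulate an application whose log has not been finalized yet.
        open(os.path.join(log_dir, "app-1" + IN_PROGRESS), "w").close()
        print(locate_event_log(log_dir, "app-1")[1])
```

The point of the sketch is only that the outcome depends on when the check runs relative to the rename of the log file.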

      After calling the stop method on the DAG scheduler, the SparkContext stops the event logger:

      https://github.com/apache/spark/blob/a76cf51ed91d99c88f301ec85f3cda1288bcf346/core/src/main/scala/org/apache/spark/SparkContext.scala#L1734-L1736

      This renames the event log, dropping the .inprogress file suffix.

      As such, a race condition exists: the master may attempt to process the application's event log before the SparkContext has finalized it.
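The ordering problem can be reproduced in miniature. The following Python sketch is a deterministic stand-in, not Spark code: MasterSketch, stop_context, and the logger_first flag are hypothetical names, and the master's asynchronous reaction is modeled as a direct call so that the unlucky interleaving always happens. The reversed ordering is shown only as one sequence that would close the window under these assumptions, not as a statement of how the issue was actually fixed.

```python
import os
import tempfile

IN_PROGRESS = ".inprogress"  # suffix named in the report


class MasterSketch:
    """Records what a master would observe when asked to rebuild the UI."""

    def __init__(self, log_dir):
        self.log_dir = log_dir
        self.saw_in_progress = None

    def rebuild_ui(self, app_id):
        path = os.path.join(self.log_dir, app_id + IN_PROGRESS)
        self.saw_in_progress = os.path.exists(path)


def stop_event_logger(log_dir, app_id):
    """Finalize the log by renaming it, dropping the suffix."""
    final_path = os.path.join(log_dir, app_id)
    os.rename(final_path + IN_PROGRESS, final_path)


def stop_context(master, log_dir, app_id, logger_first):
    """Simulate the two shutdown steps in either order."""
    if logger_first:
        stop_event_logger(log_dir, app_id)
        master.rebuild_ui(app_id)  # stands in for the DAG scheduler stop
    else:
        master.rebuild_ui(app_id)  # master looks before the rename
        stop_event_logger(log_dir, app_id)


def run(logger_first):
    with tempfile.TemporaryDirectory() as log_dir:
        open(os.path.join(log_dir, "app-1" + IN_PROGRESS), "w").close()
        master = MasterSketch(log_dir)
        stop_context(master, log_dir, "app-1", logger_first)
        return master.saw_in_progress


if __name__ == "__main__":
    print("scheduler first, master saw .inprogress:", run(logger_first=False))
    print("logger first, master saw .inprogress:", run(logger_first=True))
```

In a real deployment the master reacts asynchronously, so the first ordering would only sometimes observe the unfinished log, which matches the intermittent behavior described in the report.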

              People

              • Assignee:
                Michael Allman
                Reporter:
                Michael Allman
              • Votes:
                0
                Watchers:
                4
