As reported in SPARK-6950, it appears that sometimes the standalone master attempts to build an application's historical UI before closing the app's event log. This is still an issue for us in 1.5.2+, and I believe I've found the underlying cause.
When a SparkContext is stopped, its stop method first stops the DAG scheduler and only afterwards stops the event logger.
Though the chain of events is difficult to follow, one consequence of stopping the DAG scheduler is that the master's rebuildSparkUI method is called. This method looks for the application's event logs, and its behavior depends on whether the log file name carries the .inprogress suffix. In particular, it logs a warning if the suffix is present.
After calling the stop method on the DAG scheduler, the SparkContext stops the event logger. This renames the event log, dropping the .inprogress file suffix.
As such, a race condition exists: the master may attempt to process the application's event log before the SparkContext has finalized it, that is, while the file still carries the .inprogress suffix.
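The ordering problem can be illustrated with a small standalone model. This is not Spark code: the function names (find_event_log, rebuild_ui, stop_event_logger, simulate_stop) and the application id are hypothetical, and the only detail taken from the report is that finalizing the log means renaming it to drop the .inprogress suffix. The sketch simply shows that whichever step runs first determines what state the rebuild step sees.

```python
import os
import tempfile

IN_PROGRESS = ".inprogress"  # suffix named in the report


def find_event_log(log_dir, app_id):
    """Hypothetical helper: return the app's log path, finalized or not."""
    for name in (app_id, app_id + IN_PROGRESS):
        path = os.path.join(log_dir, name)
        if os.path.exists(path):
            return path
    return None


def rebuild_ui(log_dir, app_id):
    """Stand-in for the master's rebuild step: report the state it observed."""
    path = find_event_log(log_dir, app_id)
    if path is None:
        return "missing"
    if path.endswith(IN_PROGRESS):
        return "in-progress"  # the case in which the warning would be logged
    return "finalized"


def stop_event_logger(log_dir, app_id):
    """Stand-in for stopping the event logger: rename to drop the suffix."""
    src = os.path.join(log_dir, app_id + IN_PROGRESS)
    os.rename(src, os.path.join(log_dir, app_id))


def simulate_stop(rebuild_first):
    """Run the two steps in the given order and return what the rebuild saw."""
    with tempfile.TemporaryDirectory() as log_dir:
        app_id = "app-0001"  # made-up application id
        # Create an empty, not-yet-finalized event log.
        open(os.path.join(log_dir, app_id + IN_PROGRESS), "w").close()
        if rebuild_first:
            seen = rebuild_ui(log_dir, app_id)
            stop_event_logger(log_dir, app_id)
        else:
            stop_event_logger(log_dir, app_id)
            seen = rebuild_ui(log_dir, app_id)
        return seen
```

Calling simulate_stop(True) models the order described above (rebuild triggered before the logger is stopped) and returns "in-progress", while simulate_stop(False) models the intended order and returns "finalized". In the real system the two steps run in different processes, so the outcome depends on timing rather than on a flag.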