Details
- Type: Bug
- Status: Resolved
- Priority: P2
- Resolution: Fixed
- 2.33.0
- None
Description
After upgrading from Beam 2.24.0 to 2.33.0, Spark jobs run on YARN show empty data on the History Server after they complete.
The problem seems to be a race condition: the event log is initialized twice, with two listeners logging to the same file:
22/02/22 10:51:11 INFO EventLoggingListener: Logging events to hdfs:/user/spark/jobhistory/application_1553109013416_12079975_1
...
22/02/22 10:51:41 INFO EventLoggingListener: Logging events to hdfs:/user/spark/jobhistory/application_1553109013416_12079975_1
At the end, it fails with:
22/02/22 11:17:57 INFO SchedulerExtensionServices: Stopping SchedulerExtensionServices (serviceOption=None, services=List(), started=false)
22/02/22 11:17:57 ERROR Utils: Uncaught exception in thread Driver
java.io.IOException: Target log file already exists (hdfs:/user/spark/jobhistory/application_1553109013416_12079975_1)
	at org.apache.spark.scheduler.EventLoggingListener.stop(EventLoggingListener.scala:255)
	at org.apache.spark.SparkContext.$anonfun$stop$13(SparkContext.scala:1960)
	at org.apache.spark.SparkContext.$anonfun$stop$13$adapted(SparkContext.scala:1960)
	at scala.Option.foreach(Option.scala:274)
	at org.apache.spark.SparkContext.$anonfun$stop$12(SparkContext.scala:1960)
	at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1340)
	at org.apache.spark.SparkContext.stop(SparkContext.scala:1960)
	at org.apache.spark.api.java.JavaSparkContext.stop(JavaSparkContext.scala:654)
	at org.apache.beam.runners.spark.translation.SparkContextFactory.stopSparkContext(SparkContextFactory.java:73)
	at org.apache.beam.runners.spark.SparkPipelineResult$BatchMode.stop(SparkPipelineResult.java:133)
	at org.apache.beam.runners.spark.SparkPipelineResult.offerNewState(SparkPipelineResult.java:234)
	at org.apache.beam.runners.spark.SparkPipelineResult.waitUntilFinish(SparkPipelineResult.java:99)
	at org.apache.beam.runners.spark.SparkPipelineResult.waitUntilFinish(SparkPipelineResult.java:92)
	at com.sizmek.dp.dsp.pipeline.driver.PipelineDriver.$anonfun$main$1(PipelineDriver.scala:41)
	at scala.util.Try$.apply(Try.scala:213)
	at com.sizmek.dp.dsp.pipeline.driver.PipelineDriver.main(PipelineDriver.scala:34)
	at com.sizmek.dp.dsp.pipeline.driver.PipelineDriver.main$(PipelineDriver.scala:17)
	at com.zetaglobal.dp.dsp.jobs.dealstats.DealStatsDriver$.main(DealStatsDriver.scala:18)
	at com.zetaglobal.dp.dsp.jobs.dealstats.DealStatsDriver.main(DealStatsDriver.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:684)
This leaves a nearly empty event log file in HDFS and empty job details in the History Server.
The problem seems to have been introduced by this change:
https://github.com/apache/beam/pull/14409
Why does Beam run an event listener concurrently with what Spark already does internally? When I roll back this change in SparkRunner, the problem disappears for me.
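The failure mode above can be sketched without any Spark dependencies. This is a hypothetical simplification, not Beam or Spark code: Spark's EventLoggingListener writes events to an ".inprogress" file and renames it to the final log file on stop(); if two loggers are started for the same application, the second stop() finds the target already published and throws, as in the stack trace.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical stand-in for Spark's EventLoggingListener rename-on-stop behavior.
class FakeEventLogger {
    private final Path inProgress; // "<appId>.inprogress", written while the job runs
    private final Path target;     // final "<appId>" file read by the History Server

    FakeEventLogger(Path dir, String appId) {
        this.inProgress = dir.resolve(appId + ".inprogress");
        this.target = dir.resolve(appId);
    }

    void start() throws IOException {
        // Both loggers write to the same in-progress path for the same appId.
        Files.write(inProgress, "events".getBytes());
    }

    void stop() throws IOException {
        // Mirrors the check that produced "Target log file already exists (...)".
        if (Files.exists(target)) {
            throw new IOException("Target log file already exists (" + target + ")");
        }
        Files.move(inProgress, target, StandardCopyOption.ATOMIC_MOVE);
    }
}

public class EventLogRace {
    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("jobhistory");
        // Two loggers for one application, as in the duplicated INFO lines above.
        FakeEventLogger sparkLogger = new FakeEventLogger(dir, "application_123_1");
        FakeEventLogger beamLogger = new FakeEventLogger(dir, "application_123_1");
        sparkLogger.start();
        beamLogger.start();   // second start clobbers the same in-progress file
        sparkLogger.stop();   // first stop publishes the (incomplete) log
        try {
            beamLogger.stop(); // second stop fails: target already exists
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The first stop() wins the rename and publishes whatever partial content is in the in-progress file, which would explain the nearly empty log the History Server ends up reading.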
I am running the native SparkRunner with Spark 2.4.4.
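For context, a typical submission that enables event logging for the History Server looks like this (the jar, main class, and HDFS path are placeholders, not taken from this report; the two `spark.eventLog.*` options are standard Spark configuration):

```shell
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.eventLog.enabled=true \
  --conf spark.eventLog.dir=hdfs:///user/spark/jobhistory \
  --class com.example.PipelineDriver \
  pipeline.jar --runner=SparkRunner
```

With event logging enabled like this, Spark's own EventLoggingListener is already writing the application log that any additional listener would collide with.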
Attachments
Issue Links
- is caused by BEAM-11213: Beam metrics should be displayed in Spark UI (Resolved)
- links to