Details
- Type: Bug
- Status: Resolved
- Priority: Minor
- Resolution: Incomplete
- Affects Version/s: 2.2.0
- Fix Version/s: None
- Component/s: None
- Environment: linux, YARN cluster mode
Description
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from pyspark import SparkContext
from pyspark.sql import SparkSession

if __name__ == '__main__':
    spark = SparkSession.builder.appName('sparktest').getOrCreate()
    # Other code is omitted below
The code is simple, but occasionally throws the following exception:
19/04/15 21:30:00 ERROR yarn.ApplicationMaster: Uncaught exception:
java.util.concurrent.TimeoutException: Futures timed out after [100000 milliseconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:201)
at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:400)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:253)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:771)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:69)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:68)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1743)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:68)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:769)
at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
I know that increasing spark.yarn.am.waitTime gives the SparkContext more time to initialize (its default of 100s matches the 100000 ms timeout above).
Why does SparkContext initialization sometimes take so long?
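For reference, one way to apply the workaround mentioned above is to raise spark.yarn.am.waitTime at submit time. This is a sketch, not a fix for the underlying slowness; the script name and the 300s value are illustrative placeholders:

```shell
# Give the YARN ApplicationMaster longer to wait for SparkContext
# initialization (cluster mode only; default is 100s).
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.yarn.am.waitTime=300s \
  sparktest.py
```

Note that this property only affects YARN cluster mode, which matches the environment above; it has no effect in client mode.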