Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Won't Fix
Affects Version/s: 1.4.1
None
None
Description
With Spark 1.4.1 in YARN client mode, my application works the first time the cluster is built. However, if I stop and start the cluster using spark-ec2, the same command fails. The end of the Spark logs shows that the client keeps retrying the connection to the master node:
INFO Client: Retrying connect to server: ec2-54-174-232-129.compute-1.amazonaws.com/172.31.36.29:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
I restarted YARN and DFS manually after restarting the cluster. However, I was unable to restart Tachyon, and ./bin/tachyon runTests fails, which might be the cause.
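
One way to narrow this down is to pull the target address out of the retry message and check whether anything is accepting connections there. Port 8032 is the default client port of the YARN ResourceManager. The following is only a minimal diagnostic sketch, not part of Spark or Hadoop; the helper names parse_retry_target and can_connect are hypothetical, and the pattern assumes the log line has the same shape as the one quoted above.

```python
import re
import socket

# Matches the "host/ip:port" target in the Hadoop IPC client retry message.
# Assumption: the message format is the same as the line quoted in this report.
RETRY_RE = re.compile(
    r"Retrying connect to server: (?P<host>[^/\s]+)/(?P<ip>[\d.]+):(?P<port>\d+)"
)


def parse_retry_target(line):
    """Return (hostname, ip, port) from a retry log line, or None if absent."""
    match = RETRY_RE.search(line)
    if match is None:
        return None
    return match.group("host"), match.group("ip"), int(match.group("port"))


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Running can_connect against both the hostname and the IP address returned by parse_retry_target, from the machine that submits the job, would show whether the address in the client configuration still points at a listening ResourceManager after the cluster restart.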