Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Incomplete
- Affects Version/s: 1.6.1
- Fix Version/s: None
- Environment: Ubuntu 14.04, ZooKeeper 3.4.6 with 3-node quorum
Description
Shutting down a single ZooKeeper node caused the Spark master to exit. The master should have connected to a second ZooKeeper node.
log output
16/05/25 18:21:28 INFO master.Master: Launching executor app-20160525182128-0006/1 on worker worker-20160524013212-10.16.28.76-59138
16/05/25 18:21:28 INFO master.Master: Launching executor app-20160525182128-0006/2 on worker worker-20160524013204-10.16.21.217-47129
16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x154dfc0426b0054, likely server has closed socket, closing socket connection and attempting reconnect
16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x254c701f28d0053, likely server has closed socket, closing socket connection and attempting reconnect
16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
16/05/26 00:16:01 INFO master.ZooKeeperLeaderElectionAgent: We have lost leadership
16/05/26 00:16:01 ERROR master.Master: Leadership has been revoked -- master shutting down.
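In the log above, the leadership loss follows immediately after the SUSPENDED state change, within the same second. To pull that sequence out of a longer master log, a small filter may help. This is a minimal sketch, assuming a POSIX shell with grep; the function name zk_timeline and the log file name in the example call are hypothetical.

```shell
# Keep only the ZooKeeper connection and leadership events from a
# Spark master log read on stdin. The patterns match the logger names
# and the shutdown message seen in the log output of this report.
zk_timeline() {
  grep -E 'zookeeper\.ClientCnxn|state\.ConnectionStateManager|ZooKeeperLeaderElectionAgent|Leadership has been revoked'
}

# Example call (the file name is an assumption, not taken from the report):
# zk_timeline < /var/log/spark/spark-master.out
```

Applied to the excerpt in this report, the filter would drop the two "Launching executor" lines and keep the six entries from 00:16:01.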
spark-env.sh:
export SPARK_LOCAL_DIRS=/ephemeral/spark/local
export SPARK_WORKER_DIR=/ephemeral/spark/work
export SPARK_LOG_DIR=/var/log/spark
export HADOOP_CONF_DIR=/home/ubuntu/hadoop-2.6.3/etc/hadoop
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=gn5456-zookeeper-01:2181,gn5456-zookeeper-02:2181,gn5456-zookeeper-03:2181"
export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true"
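When reproducing this, it can be useful to confirm which of the servers listed in spark.deploy.zookeeper.url are still answering after one node is stopped. The following is a rough sketch, not part of the original report. It assumes nc (netcat) is installed on the master host and that ZooKeeper's four-letter commands are reachable on the client port; the function names are hypothetical.

```shell
# Split a comma-separated "host:port,host:port" list into one entry per line.
split_zk_url() {
  printf '%s\n' "$1" | tr ',' '\n'
}

# Send the "ruok" four-letter command to one server.
# A server that is running normally replies "imok".
probe_zk() {
  zk_host="${1%%:*}"
  zk_port="${1##*:}"
  zk_reply="$(printf 'ruok' | nc -w 2 "$zk_host" "$zk_port" 2>/dev/null)"
  [ "$zk_reply" = "imok" ]
}

# Print one status line per server in the ensemble.
check_ensemble() {
  split_zk_url "$1" | while IFS= read -r zk_entry; do
    if probe_zk "$zk_entry"; then
      echo "$zk_entry ok"
    else
      echo "$zk_entry unreachable"
    fi
  done
}

# Example call, using the URL from spark-env.sh:
# check_ensemble "gn5456-zookeeper-01:2181,gn5456-zookeeper-02:2181,gn5456-zookeeper-03:2181"
```

With one node shut down, two of the three entries would be expected to report ok, which is the situation in which the master should have been able to reconnect.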
Issue Links
- is cloned by: SPARK-49546 Zookeeper node interruption causes Active spark master to exit (Open)