Details
Bug
Status: Resolved
Critical
Resolution: Fixed
Slider 0.91
None
RHEL-6 (64 Bit)
Description
PROBLEM:
The customer created a Slider app, passing the ZooKeeper quorum with the command below:
slider create test --template appConfig.json --resources resources.json --zkhosts sandy234new1.hwxblr.com:2181,sandy234new3.hwxblr.com:2181,sandy234new2.hwxblr.com:2181
The application log below shows that only the first ZooKeeper host is picked up:
2016-09-15 15:44:29,052 [main] INFO appmaster.SliderAppMaster - Loading slider-server.xml at file:/hadoop/yarn/local/usercache/root/appcache/application_1473930641993_0005/container_e04_1473930641993_0005_01_000001/confdir/slider-server.xml
2016-09-15 15:44:29,077 [main] INFO appmaster.SliderAppMaster - AM configuration:
hadoop.registry.zk.quorum=sandy234new1.hwxblr.com:2181
hadoop.registry.zk.root=/registry
yarn.resourcemanager.scheduler.address=0.0.0.0:8030
BUSINESS IMPACT: Because Slider picks up only the first ZooKeeper host, it throws exceptions when that host goes down, which impacts the AM.
STEPS TO REPRODUCE:
Launch an HBase app using steps 1 and 2.
1) slider create test --template appConfig.json --resources resources.json --zkhosts sandy234new1.hwxblr.com:2181,sandy234new3.hwxblr.com:2181,sandy234new2.hwxblr.com:2181
This launches an application in the RM.
In the RM UI, go to the application -> logs.
The first lines will be as below:
2016-09-15 15:44:29,052 [main] INFO appmaster.SliderAppMaster - Loading slider-server.xml at file:/hadoop/yarn/local/usercache/root/appcache/application_1473930641993_0005/container_e04_1473930641993_0005_01_000001/confdir/slider-server.xml
2016-09-15 15:44:29,077 [main] INFO appmaster.SliderAppMaster - AM configuration:
hadoop.registry.zk.quorum=sandy234new1.hwxblr.com:2181
hadoop.registry.zk.root=/registry
yarn.resourcemanager.scheduler.address=0.0.0.0:8030
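To confirm the symptom without reading the log by eye, the value passed to --zkhosts can be compared with the hadoop.registry.zk.quorum value that the AM logs. The following is a minimal sketch in Python, not part of Slider. The helper names (parse_quorum, logged_quorum, missing_hosts) are hypothetical, and it assumes the AM log text has already been copied out of the RM UI into a string.

```python
import re


def parse_quorum(value):
    """Split a comma-separated ZooKeeper quorum string into a sorted host:port list."""
    return sorted(h.strip() for h in value.split(",") if h.strip())


def logged_quorum(log_text):
    """Return the hadoop.registry.zk.quorum value found in the AM log, or None."""
    match = re.search(r"hadoop\.registry\.zk\.quorum=(\S+)", log_text)
    return match.group(1) if match else None


def missing_hosts(zkhosts_arg, log_text):
    """List the hosts passed via --zkhosts that do not appear in the logged quorum."""
    value = logged_quorum(log_text)
    logged = parse_quorum(value) if value else []
    return [h for h in parse_quorum(zkhosts_arg) if h not in logged]


# Values taken from this report; the log string is shortened to the relevant part.
zkhosts = (
    "sandy234new1.hwxblr.com:2181,"
    "sandy234new3.hwxblr.com:2181,"
    "sandy234new2.hwxblr.com:2181"
)
log = (
    "AM configuration: "
    "hadoop.registry.zk.quorum=sandy234new1.hwxblr.com:2181 "
    "hadoop.registry.zk.root=/registry"
)

# Any host printed here was passed on the command line but dropped by the AM.
print(missing_hosts(zkhosts, log))
```

With the log from this report, the check lists the second and third hosts as missing. An empty list would indicate that the full quorum reached the AM.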