Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 1.6.1
- None
Description
Steps to reproduce (STR):
- Install a 3-node cluster with the HDP 1.3 stack (HDFS, MapReduce, Nagios, Ganglia, ZooKeeper), with slave components installed on all 3 hosts.
- Enable security without any Kerberos setup in place.
- When the security wizard fails as expected, disable security.
- After security has been disabled successfully, the following API call returns an incorrect started_count for the DataNode. It reports 0, although the DataNode is actually running on all hosts:
http://server:8080/api/v1/clusters/c1/components/?ServiceComponentInfo/category.in(SLAVE,CLIENT)&fields=ServiceComponentInfo/service_name,ServiceComponentInfo/installed_count,ServiceComponentInfo/started_count,ServiceComponentInfo/total_count&minimal_response=true
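To check the value returned by this query, the counts for one component can be pulled out of the parsed JSON response. The sketch below is only an illustration. It assumes the response has an `items` list whose entries carry a `ServiceComponentInfo` object that includes `component_name`. The helper name `component_counts` and the sample payload are hypothetical and are not taken from a real server.

```python
import json


def component_counts(payload, component_name):
    """Return the count fields for one component, or None if it is absent.

    Assumes each entry in payload["items"] has a "ServiceComponentInfo"
    object with a "component_name" key (an assumption about the response).
    """
    for item in payload.get("items", []):
        info = item.get("ServiceComponentInfo", {})
        if info.get("component_name") == component_name:
            return {
                key: info.get(key)
                for key in ("installed_count", "started_count", "total_count")
            }
    return None


# Hypothetical response body mirroring the symptom described in this report:
# three DataNodes exist, but none is counted as started.
sample = json.loads("""
{
  "items": [
    {"ServiceComponentInfo": {"component_name": "DATANODE",
                              "service_name": "HDFS",
                              "installed_count": 3,
                              "started_count": 0,
                              "total_count": 3}},
    {"ServiceComponentInfo": {"component_name": "TASKTRACKER",
                              "service_name": "MAPREDUCE",
                              "installed_count": 0,
                              "started_count": 3,
                              "total_count": 3}}
  ]
}
""")

print(component_counts(sample, "DATANODE"))
```

Comparing `started_count` with the number of DataNode processes actually running on the hosts makes the mismatch visible.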
Reason:
During the incorrect Kerberos setup, the DN processes fail to start but leave behind a stale pid file owned by root. The next DN start command does start the DN process, but the process cannot overwrite the pid file, so the server considers the DN stopped. If we start the DN once more, the command fails soon after start (because of the lock file in the data dir, which is held by the already running DN). The agent reports to the server that the DN is not running, so the server displays information that is correct from its own point of view.
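The condition described above can be checked on a host by inspecting the pid file directly. The following is a minimal sketch, assuming a POSIX system and a pid file that holds a single decimal pid. The function name `inspect_pid_file` and the example path are hypothetical, and this is not how the agent itself performs the check.

```python
import os


def inspect_pid_file(path):
    """Report what a pid file says and whether the current user may replace it."""
    if not os.path.exists(path):
        return {"exists": False, "pid": None, "alive": False, "writable": False}

    with open(path) as handle:
        text = handle.read().strip()
    pid = int(text) if text.isdigit() else None

    alive = False
    if pid is not None and pid > 0:
        try:
            os.kill(pid, 0)  # signal 0 checks for existence without signalling
            alive = True
        except ProcessLookupError:
            alive = False  # no such process: the recorded pid is stale
        except PermissionError:
            alive = True  # the process exists but belongs to another user

    return {
        "exists": True,
        "pid": pid,
        "alive": alive,
        # False when the file is owned by another user (for example root)
        # and the current user has no write permission on it.
        "writable": os.access(path, os.W_OK),
    }


if __name__ == "__main__":
    # Hypothetical location; the real path depends on the installation.
    print(inspect_pid_file("/var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid"))
```

A result with `alive` false and `writable` false, obtained while running as the DN service user, would match the stale root-owned pid file described in this report.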