Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- 2.0.0, 2.1.0
- None
- Environment: ubuntu 12.04
Description
We are testing the deployment of an HDP 2.2 cluster using Ambari 2.0.0-rc2 running on Ubuntu 12.04. I’ve been able to set up a cluster running HDFS, MapReduce2, YARN, ZooKeeper, Knox, Ranger, and Ambari Metrics. When I shut down the whole cluster using Actions -> Stop All in Ambari, reboot the hosts, and then try to restart the cluster, I see the error below when restarting the Knox gateway. The directory /var/run/knox is indeed missing on the master host.
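To confirm which directories disappeared across the reboot, each path passed to the failing chown command can be checked directly on the host. The following is only a minimal sketch: the helper name `missing_paths` is hypothetical and not part of Ambari, and the path list is copied from the chown command in the startup log. On Ubuntu 12.04, /var/run is normally backed by tmpfs, which would explain why its contents do not survive a reboot.

```python
import os

# Paths taken from the chown command in the Knox Gateway startup log.
KNOX_PATHS = [
    "/var/lib/knox/data",
    "/var/log/knox",
    "/var/run/knox",
    "/etc/knox/conf",
]


def missing_paths(paths):
    """Return the paths that do not exist on this host, preserving order."""
    return [p for p in paths if not os.path.exists(p)]


if __name__ == "__main__":
    # Hypothetical diagnostic: print every path the chown would fail on.
    for path in missing_paths(KNOX_PATHS):
        print("missing: %s" % path)
```

Run on the affected master host, this should report only /var/run/knox if the other directories are intact.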
Knox Gateway startup log:
2015-04-01 16:17:12,075 - Error while executing command 'start':
Traceback (most recent call last):
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 214, in execute
method(env)
File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
return fn(*args, **kwargs)
File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox_gateway.py", line 80, in start
self.configure(env)
File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox_gateway.py", line 64, in configure
knox()
File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
return fn(*args, **kwargs)
File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox.py", line 99, in knox
sudo = True,
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
self.env.run()
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
self.run_action(resource, action)
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
provider_action()
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 274, in action_run
raise ex
Fail: Execution of 'chown -R knox:knox /var/lib/knox/data /var/log/knox /var/log/knox /var/run/knox /etc/knox/conf' returned 1. chown: cannot access `/var/run/knox': No such file or directory
stdout: /var/lib/ambari-agent/data/output-107.txt
2015-04-01 16:17:06,744 - u"Group['hadoop']" {'ignore_failures': False}
2015-04-01 16:17:06,744 - Modifying group hadoop
2015-04-01 16:17:06,797 - u"Group['users']"
2015-04-01 16:17:06,797 - Modifying group users
2015-04-01 16:17:06,839 - u"Group['knox']"
2015-04-01 16:17:06,839 - Modifying group knox
2015-04-01 16:17:06,886 - u"Group['ranger']"
2015-04-01 16:17:06,886 - Modifying group ranger
2015-04-01 16:17:06,930 - u"User['mapred']"
2015-04-01 16:17:06,930 - Modifying user mapred
2015-04-01 16:17:06,976 - u"User['root']"
2015-04-01 16:17:06,977 - Modifying user root
2015-04-01 16:17:07,019 - u"User['ambari-qa']"
2015-04-01 16:17:07,020 - Modifying user ambari-qa
2015-04-01 16:17:07,066 - u"User['zookeeper']"
2015-04-01 16:17:07,066 - Modifying user zookeeper
2015-04-01 16:17:07,109 - u"User['rangerlogger']"
2015-04-01 16:17:07,110 - Modifying user rangerlogger
2015-04-01 16:17:07,152 - u"User['hdfs']"
2015-04-01 16:17:07,152 - Modifying user hdfs
2015-04-01 16:17:07,195 - u"User['knox']"
2015-04-01 16:17:07,195 - Modifying user knox
2015-04-01 16:17:07,238 - u"User['ranger']"
2015-04-01 16:17:07,238 - Modifying user ranger
2015-04-01 16:17:07,282 - u"User['yarn']"
2015-04-01 16:17:07,283 - Modifying user yarn
2015-04-01 16:17:07,326 - u"User['ams']"
2015-04-01 16:17:07,327 - Modifying user ams
2015-04-01 16:17:07,370 - u"User['rangeradmin']"
2015-04-01 16:17:07,370 - Modifying user rangeradmin
2015-04-01 16:17:07,413 - u"File['/var/lib/ambari-agent/data/tmp/changeUid.sh']"
2015-04-01 16:17:07,686 - u"Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa']" {'not_if': '(test $(id -u ambari-qa) -gt 1000) || (false)'}
2015-04-01 16:17:07,728 - Skipping u"Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa']" due to not_if
2015-04-01 16:17:07,728 - u"Group['hdfs']"
2015-04-01 16:17:07,728 - Modifying group hdfs
2015-04-01 16:17:07,774 - u"User['hdfs']"
2015-04-01 16:17:07,775 - Modifying user hdfs
2015-04-01 16:17:07,818 - u"Directory['/etc/hadoop']"
2015-04-01 16:17:07,974 - u"Directory['/etc/hadoop/conf.empty']" {'owner': 'root', 'group': 'hadoop', 'recursive': True}
2015-04-01 16:17:08,110 - u"Link['/etc/hadoop/conf']" {'not_if': 'ls /etc/hadoop/conf', 'to': '/etc/hadoop/conf.empty'}
2015-04-01 16:17:08,153 - Skipping u"Link['/etc/hadoop/conf']" due to not_if
2015-04-01 16:17:08,160 - u"File['/etc/hadoop/conf/hadoop-env.sh']"
2015-04-01 16:17:08,396 - u"Execute['('setenforce', '0')']" {'sudo': True, 'only_if': 'test -f /selinux/enforce'}
2015-04-01 16:17:08,448 - Skipping u"Execute['('setenforce', '0')']" due to only_if
2015-04-01 16:17:08,448 - u"Directory['/var/log/hadoop']"
2015-04-01 16:17:08,843 - u"Directory['/var/run/hadoop']" {'owner': 'root', 'group': 'root', 'recursive': True, 'cd_access': 'a'}
2015-04-01 16:17:08,886 - Creating directory u"Directory['/var/run/hadoop']"
2015-04-01 16:17:09,066 - Changing group for /var/run/hadoop from 1000 to root
2015-04-01 16:17:09,364 - u"Directory['/tmp/hadoop-hdfs']"
2015-04-01 16:17:09,407 - Creating directory u"Directory['/tmp/hadoop-hdfs']"
2015-04-01 16:17:09,587 - Changing owner for /tmp/hadoop-hdfs from 0 to hdfs
2015-04-01 16:17:09,820 - u"File['/etc/hadoop/conf/commons-logging.properties']"
2015-04-01 16:17:10,049 - u"File['/etc/hadoop/conf/health_check']" {'content': Template('health_check-v2.j2'), 'owner': 'hdfs'}
2015-04-01 16:17:10,272 - u"File['/etc/hadoop/conf/log4j.properties']" {'content': '...', 'owner': 'hdfs', 'group': 'hadoop', 'mode': 0644}
2015-04-01 16:17:10,506 - u"File['/etc/hadoop/conf/hadoop-metrics2.properties']" {'content': Template('hadoop-metrics2.properties.j2'), 'owner': 'hdfs'}
2015-04-01 16:17:10,732 - u"File['/etc/hadoop/conf/task-log4j.properties']" {'content': StaticFile('task-log4j.properties'), 'mode': 0755}
2015-04-01 16:17:11,085 - u"Directory['/etc/knox/conf']" {'owner': 'knox', 'group': 'knox', 'recursive': True}
2015-04-01 16:17:11,231 - u"XmlConfig['gateway-site.xml']" {'owner': 'knox', 'group': 'knox', 'conf_dir': '/etc/knox/conf', 'configuration_attributes': {}, 'configurations': ...}
2015-04-01 16:17:11,239 - Generating config: /etc/knox/conf/gateway-site.xml
2015-04-01 16:17:11,239 - u"File['/etc/knox/conf/gateway-site.xml']"
2015-04-01 16:17:11,422 - Writing u"File['/etc/knox/conf/gateway-site.xml']" because contents don't match
2015-04-01 16:17:11,561 - u"File['/etc/knox/conf/gateway-log4j.properties']"
2015-04-01 16:17:11,790 - u"File['/etc/knox/conf/topologies/default.xml']" {'content': InlineTemplate(...), 'owner': 'knox', 'group': 'knox'}
2015-04-01 16:17:12,014 - u"Execute['('chown', '-R', u'knox:knox', '/var/lib/knox/data', '/var/log/knox', '/var/log/knox', u'/var/run/knox', '/etc/knox/conf')']" {'sudo': True}
2015-04-01 16:17:12,075 - Error while executing command 'start':
Traceback (most recent call last):
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 214, in execute
method(env)
File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
return fn(*args, **kwargs)
File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox_gateway.py", line 80, in start
self.configure(env)
File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox_gateway.py", line 64, in configure
knox()
File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
return fn(*args, **kwargs)
File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox.py", line 99, in knox
sudo = True,
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
self.env.run()
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
self.run_action(resource, action)
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
provider_action()
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 274, in action_run
raise ex
Fail: Execution of 'chown -R knox:knox /var/lib/knox/data /var/log/knox /var/log/knox /var/run/knox /etc/knox/conf' returned 1. chown: cannot access `/var/run/knox': No such file or directory
2015-04-01 16:17:12,119 - Command: /usr/bin/hdp-select status knox-server > /tmp/tmp7GgVe1
Output: knox-server - 2.2.0.0-2041
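The log shows the chown step running against a pid directory that no longer exists after the reboot. One possible way to make that step tolerant of a wiped /var/run is to create any missing directory before changing ownership. The sketch below uses only the Python standard library rather than Ambari's resource_management framework; the helper name `ensure_dirs` and the default mode are assumptions for illustration, and this is not presented as the actual fix that was committed.

```python
import os


def ensure_dirs(paths, mode=0o755):
    """Create any missing directories (including parents).

    Returns the list of directories that had to be created, so the
    caller can log what was restored. Hypothetical helper, not an
    Ambari API.
    """
    created = []
    for path in paths:
        if not os.path.isdir(path):
            os.makedirs(path, mode)
            created.append(path)
    return created
```

Called with the pid directory (here /var/run/knox) before the chown command runs, this would recreate the directory on each start. Ownership would still need to be set to knox:knox afterwards, which requires root privileges.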
Attachments
Issue Links
- is related to
  - AMBARI-10413 Knox gateway fails to restart on Ubuntu 12.04 after system restart using custom pid dir because /usr/hdp/current/knox-server/pids does not point to custom pid dir (Resolved)
  - AMBARI-10417 Flume fails to restart on ubuntu 12.04 after system restart because /var/run/flume is deleted (Resolved)