Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
We have many places similar to this one (especially in the Python RU code):
    try:
        dfsadmin_base_command = get_dfsadmin_base_command(hdfs_binary)
        command = dfsadmin_base_command + ' -report -live'
        return_code, hdfs_output = shell.call(command, user=params.hdfs_user)
    except:
        raise Fail('Unable to determine if the DataNode has started after upgrade.')
Here the valuable information is masked: the original exception is swallowed
and replaced by a new one that only says "something went wrong, sorry". This
makes issues very hard to debug, especially on the users' side, where it is
often difficult to obtain any further information.
The solution would be to make the Fail exception class print the causing
exception, similar to what Python 3 does by default.
Example:
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/service_check.py", line 150, in service_check
    Execute("hive mkdir /a")
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 273, in action_run
    tries=self.resource.tries, try_sleep=self.resource.try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 71, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 93, in checked_call
    tries=tries, try_sleep=try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 141, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 294, in _call
    raise Fail(err_msg)
Fail: Execution of 'hive mkdir /a' returned 127. /bin/bash: hive: command not found

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/service_check.py", line 164, in <module>
    ServiceCheck().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 280, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/service_check.py", line 152, in service_check
    raise Fail(format("Something went wrong"))
resource_management.core.exceptions.Fail: Something went wrong
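One possible way to get output along these lines is sketched below. It is not the actual fix applied in the codebase, only a minimal illustration of the idea: the Fail class here is a hypothetical stand-in for resource_management.core.exceptions.Fail, and the cause_text attribute and the exact wording of the joining message are assumptions. The constructor looks at sys.exc_info() to see whether another exception is currently being handled and, if so, keeps its formatted traceback so that it is shown together with the new message.

```python
import sys
import traceback


class Fail(Exception):
    """Hypothetical sketch: a Fail that remembers the exception it replaces."""

    def __init__(self, message=""):
        super(Fail, self).__init__(message)
        # sys.exc_info() describes the exception currently being handled,
        # or is (None, None, None) when no exception is being handled.
        exc_type, exc_value, exc_tb = sys.exc_info()
        if exc_type is None:
            self.cause_text = None
        else:
            # Format eagerly so that no reference to the traceback is kept.
            self.cause_text = "".join(
                traceback.format_exception(exc_type, exc_value, exc_tb))

    def __str__(self):
        message = super(Fail, self).__str__()
        if self.cause_text is None:
            return message
        # Show the original failure first, then the new message.
        return (self.cause_text
                + "\nThe above exception was the cause of the following exception:\n\n"
                + message)
```

With such a class, existing call sites that do `raise Fail('...')` inside an `except` block would not need to change in order to report the original error. Two caveats apply to this sketch: on Python 2, sys.exc_info() may still report an exception that was already handled earlier in the same function, so a stale cause could be attached; and on Python 3 the interpreter already prints chained exceptions for uncaught errors, so the cause would appear twice.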