  1. Ambari
  2. AMBARI-18162

Tracebacks of exceptions should always be available

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0.0
    • Component/s: None
    • Labels: None

    Description

      We have many places similar to this (especially in the Python RU code):

      try:
        dfsadmin_base_command = get_dfsadmin_base_command(hdfs_binary)
        command = dfsadmin_base_command + ' -report -live'
        return_code, hdfs_output = shell.call(command, user=params.hdfs_user)
      except:
        raise Fail('Unable to determine if the DataNode has started after upgrade.')

      Here the valuable information is masked: the original exception is replaced
      by a new one that says little more than "something went wrong, sorry". This
      makes issues very hard to debug, especially on users' systems, where it is
      often difficult to obtain any further information.

      The solution would be to make the Fail exception class print the causing
      exception, similar to what Python 3 does by default.
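
A minimal sketch of how such a Fail class could behave is shown below. It is an illustration only and is not taken from the attached patch: the `cause_text` attribute, the wording of the separator line, and the `check_datanode` helper are hypothetical. The idea is that the constructor asks `sys.exc_info()` whether an exception is currently being handled and, if so, keeps its formatted traceback so that it is printed together with the new message.

```python
import sys
import traceback


class Fail(Exception):
    """Hypothetical sketch of a Fail class that remembers its cause."""

    def __init__(self, message=""):
        super(Fail, self).__init__(message)
        # sys.exc_info() returns (None, None, None) when no exception
        # is being handled at the moment Fail is constructed.
        exc_type, exc_value, exc_tb = sys.exc_info()
        if exc_type is None:
            self.cause_text = ""
        else:
            # Store the formatted text rather than the traceback object,
            # so the exception does not hold a reference to the frames.
            self.cause_text = "".join(
                traceback.format_exception(exc_type, exc_value, exc_tb))

    def __str__(self):
        message = super(Fail, self).__str__()
        if not self.cause_text:
            return message
        return (self.cause_text
                + "\nThe above exception was the cause of the following exception:\n\n"
                + message)


def check_datanode():
    # Hypothetical call site: the original error is still replaced by Fail,
    # but its traceback now travels with the new exception.
    try:
        raise ValueError("hive: command not found")
    except Exception:
        raise Fail("Unable to determine if the DataNode has started after upgrade.")
```

With this approach the existing `raise Fail(...)` call sites would not need to change, because the cause is picked up inside the constructor. A Fail raised outside of any `except` block prints only its own message.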

      Example:

      Traceback (most recent call last):
        File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/service_check.py", line 150, in service_check
          Execute("hive mkdir /a")
        File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__
          self.env.run()
        File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
          self.run_action(resource, action)
        File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
          provider_action()
        File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 273, in action_run
          tries=self.resource.tries, try_sleep=self.resource.try_sleep)
        File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 71, in inner
          result = function(command, **kwargs)
        File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 93, in checked_call
          tries=tries, try_sleep=try_sleep)
        File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 141, in _call_wrapper
          result = _call(command, **kwargs_copy)
        File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 294, in _call
          raise Fail(err_msg)
      Fail: Execution of 'hive mkdir /a' returned 127. /bin/bash: hive: command not found

      The above exception was the direct cause of the following exception:

      Traceback (most recent call last):
        File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/service_check.py", line 164, in <module>
          ServiceCheck().execute()
        File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 280, in execute
          method(env)
        File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/service_check.py", line 152, in service_check
          raise Fail(format("Something went wrong"))
      resource_management.core.exceptions.Fail: Something went wrong

      Attachments

        1. AMBARI-18162.patch
          2 kB
          Andrew Onischuk

            People

              Assignee: aonishuk Andrew Onischuk
              Reporter: aonishuk Andrew Onischuk