Ambari / AMBARI-24455

Spark service check failure after UI Deploy - non Secure with Ranger/KMS

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Invalid
    • Affects Version/s: 2.7.1
    • Fix Version/s: 2.7.1
    • Component/s: ambari-server
    • Labels:
      None

      Description

      The Spark service check fails during UI deploy:

      stderr: 
      Traceback (most recent call last):
        File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/SPARK2/package/scripts/service_check.py", line 78, in service_check
          Execute(cmd, user=params.smoke_user, path=[beeline_cmd], timeout=CHECK_COMMAND_TIMEOUT_DEFAULT)
        File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
          self.env.run()
        File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
          self.run_action(resource, action)
        File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
          provider_action()
        File "/usr/lib/ambari-agent/lib/resource_management/core/providers/system.py", line 263, in action_run
          returns=self.resource.returns)
        File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 72, in inner
          result = function(command, **kwargs)
        File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 102, in checked_call
          tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy, returns=returns)
        File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper
          result = _call(command, **kwargs_copy)
        File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 314, in _call
          raise ExecutionFailed(err_msg, code, out, err)
      ExecutionFailed: Execution of '! /usr/hdp/current/spark2-client/bin/beeline -u 'jdbc:hive2://ctr-e138-1518143905142-432870-01-000004.hwx.site:10016/default' transportMode=binary  -e '' 2>&1| awk '{print}'|grep -i -e 'Connection refused' -e 'Invalid URL' -e 'Error: Could not open'' returned 1. ######## Hortonworks #############
      This is MOTD message, added for testing in qe infra
      Error: Could not open client transport with JDBC Uri: jdbc:hive2://ctr-e138-1518143905142-432870-01-000004.hwx.site:10016/default: java.net.ConnectException: Connection refused (Connection refused) (state=08S01,code=0)
      Error: Could not open client transport with JDBC Uri: jdbc:hive2://ctr-e138-1518143905142-432870-01-000004.hwx.site:10016/default: java.net.ConnectException: Connection refused (Connection refused) (state=08S01,code=0)
      
      The above exception was the cause of the following exception:
      
      Traceback (most recent call last):
        File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/SPARK2/package/scripts/service_check.py", line 88, in <module>
          SparkServiceCheck().execute()
        File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 353, in execute
          method(env)
        File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/SPARK2/package/scripts/service_check.py", line 85, in service_check
          raise Fail("Connection to all Spark thrift servers servers failed")
      resource_management.core.exceptions.Fail: Connection to all Spark thrift servers servers failed
      stdout:
      2018-08-10 05:22:29,576 - Using hadoop conf dir: /usr/hdp/3.0.1.0-73/hadoop/conf
      2018-08-10 05:22:29,600 - Execute['curl -s -o /dev/null -w'%{http_code}' --negotiate -u: -k http://ctr-e138-1518143905142-432870-01-000004.hwx.site:18081 | grep 200'] {'logoutput': True, 'tries': 5, 'user': 'ambari-qa', 'try_sleep': 3}
      ######## Hortonworks #############
      This is MOTD message, added for testing in qe infra
      200
      2018-08-10 05:22:29,794 - Execute['curl -s -o /dev/null -w'%{http_code}' --negotiate -u: -k http://ctr-e138-1518143905142-432870-01-000004.hwx.site:8999/sessions | grep 200'] {'logoutput': True, 'tries': 3, 'user': 'ambari-qa', 'try_sleep': 1}
      ######## Hortonworks #############
      This is MOTD message, added for testing in qe infra
      200
      2018-08-10 05:22:29,843 - Execute['! /usr/hdp/current/spark2-client/bin/beeline -u 'jdbc:hive2://ctr-e138-1518143905142-432870-01-000004.hwx.site:10016/default' transportMode=binary  -e '' 2>&1| awk '{print}'|grep -i -e 'Connection refused' -e 'Invalid URL' -e 'Error: Could not open''] {'path': [u'/usr/hdp/current/spark2-client/bin/beeline'], 'user': 'ambari-qa', 'timeout': 60.0}
      
      Command failed after 1 tries
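
The failing command is a negated pipeline: beeline's output is passed through a case-insensitive grep for three error strings, and the leading `!` inverts grep's exit status, so the check passes only when none of the strings appear. Here grep matched "Connection refused", so the command returned 1. A minimal Python sketch of that pass/fail decision follows; the helper name `thrift_check_passes` is hypothetical and not part of Ambari.

```python
# Error markers copied from the grep expressions in the failing command.
ERROR_MARKERS = ("Connection refused", "Invalid URL", "Error: Could not open")


def thrift_check_passes(beeline_output: str) -> bool:
    """Return True when no error marker appears in the output.

    Matching is case-insensitive, mirroring `grep -i`. This is an
    illustrative helper, not the code used by service_check.py.
    """
    lowered = beeline_output.lower()
    return not any(marker.lower() in lowered for marker in ERROR_MARKERS)
```

Feeding it the "Error: Could not open client transport ..." line from the stderr above would yield `False`, which corresponds to the non-zero exit code seen in the log.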
      

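      I checked the command schedule and observed that the Spark Thrift Server start is scheduled to run after the Spark2 service check. The check therefore tries to connect to port 10016 before the Thrift Server is listening, which would explain the "Connection refused" error above.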
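
One way to confirm this ordering on an affected host might be to probe the Thrift Server port just before the service check runs. The sketch below assumes a hypothetical helper, `port_is_listening`, which is not part of Ambari; the port 10016 would come from the JDBC URL in the log.

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Illustrative helper only; a refused connection, a timeout, or a DNS
    failure all surface as OSError and are reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If `port_is_listening(thrift_host, 10016)` returns `False` at the moment the service check is dispatched, that would be consistent with the Thrift Server not having been started yet.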

            People

            • Assignee: Dmytro Grinenko (hapylestat)
            • Reporter: Srikanth Janardhan (sjanardhan)
            • Votes: 0
            • Watchers: 2
