Ambari / AMBARI-6978

Uncaught exception in ambari-agent: the agent may die on a connection error

Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7.0
    • Fix Version/s: 1.7.0
    • Component/s: ambari-agent, test
    • Labels: None

    Description

      I ran into this on two agent hosts after upgrading ambari-server, resetting the database, and restarting ambari-server a few times. Most likely the agent hit a connection exception during registration that was not caught. As a result, agent registration failed, and I had to log in to each agent host and start the agent manually.

      The expected behavior for an agent is to ignore any server-side connection problems and stay alive.

      INFO 2014-08-19 19:02:03,899 NetUtil.py:48 - Connecting to https://vm-0.vm:8440/connection_info
      WARNING 2014-08-19 19:02:03,900 NetUtil.py:71 - Failed to connect to https://vm-0.vm:8440/connection_info due to [Errno 111] Connection refused  
      DEBUG 2014-08-19 19:02:03,900 security.py:47 - Server two-way SSL authentication required: False
      INFO 2014-08-19 19:02:03,900 security.py:93 - SSL Connect being called.. connecting to the server
      DEBUG 2014-08-19 19:02:03,901 security.py:134 - Error in sending/receving data from the server Traceback (most recent call last):
        File "/usr/lib/python2.6/site-packages/ambari_agent/security.py", line 128, in request
          req.get_data(), req.headers)
        File "/usr/lib64/python2.6/httplib.py", line 920, in request
          self._send_request(method, url, body, headers)
        File "/usr/lib64/python2.6/httplib.py", line 951, in _send_request
          self.endheaders()
        File "/usr/lib64/python2.6/httplib.py", line 908, in endheaders
          self._send_output()
        File "/usr/lib64/python2.6/httplib.py", line 780, in _send_output
          self.send(msg)
        File "/usr/lib64/python2.6/httplib.py", line 739, in send
          self.connect()
        File "/usr/lib/python2.6/site-packages/ambari_agent/security.py", line 53, in connect
          sock = self.create_connection()
        File "/usr/lib/python2.6/site-packages/ambari_agent/security.py", line 94, in create_connection
          sock = socket.create_connection((self.host, self.port), 60)
        File "/usr/lib64/python2.6/socket.py", line 567, in create_connection
          raise error, msg
      error: [Errno 111] Connection refused
      
      INFO 2014-08-19 19:02:03,901 security.py:135 - Encountered communication error. Details: error(111, 'Connection refused')
      ERROR 2014-08-19 19:02:03,901 Controller.py:115 - Request to https://vm-0.vm:8441/agent/v1/register/vm-2.vm failed due to Error occured during connecting to the server: [Errno 111] Connection refused
      INFO 2014-08-19 19:02:03,907 main.py:55 - signal received, exiting.
      INFO 2014-08-19 19:02:03,907 ProcessHelper.py:39 - Removing pid file
      INFO 2014-08-19 19:02:03,907 ProcessHelper.py:46 - Removing temp files
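
The expected behavior described above (survive server-side connection problems and keep trying) could be sketched roughly as follows. This is only an illustration, not the actual fix committed to Ambari: `register_with_retry`, its parameters, and the `register` callable are hypothetical names, and the sketch assumes that a failed registration request surfaces as `socket.error` (as in the traceback) or `IOError`.

```python
import socket
import time


def register_with_retry(register, max_attempts=None, delay_seconds=1.0,
                        sleep=time.sleep):
    """Call register() until it succeeds, surviving connection errors.

    `register` is a hypothetical callable that performs one registration
    request and raises socket.error (for example, Errno 111) on failure.
    max_attempts=None retries forever, which suits a long-running agent.
    Returns the result of register(), or None if attempts run out.
    """
    attempt = 0
    while max_attempts is None or attempt < max_attempts:
        attempt += 1
        try:
            return register()
        except (socket.error, IOError) as err:
            # Report the failure and retry instead of letting the
            # exception propagate and terminate the process.
            print("Registration attempt %d failed: %s" % (attempt, err))
            sleep(delay_seconds)
    return None
```

With a wrapper like this, a refused connection during registration would be reported and retried after `delay_seconds` rather than ending the agent process.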
      

            People

              Assignee: Dmitry Lysnichenko (dmitriusan)
              Reporter: Dmitry Lysnichenko (dmitriusan)