Ambari / AMBARI-18922

Agent Auto Restart Doesn't Release Ping Port


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.5.0
    • Fix Version/s: 2.5.0
    • Component/s: ambari-agent
    • Labels: None

    Description

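      Agent auto-restart fails: after the agent exits with code 77 to restart, the new process cannot open ping port 8670 because the port is still held by another process. Agent log: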

      INFO 2016-11-10 17:56:58,319 security.py:148 - Encountered communication error. Details: error(104, 'Connection reset by peer')
      ERROR 2016-11-10 17:56:58,320 Controller.py:425 - Connection to 192.168.64.1 was lost (details=Request to https://192.168.64.1:8441/agent/v1/heartbeat/c6401.ambari.apache.org failed due to Error occured during connecting to the server: [Errno 104] Connection reset by peer)
      INFO 2016-11-10 17:57:33,233 Controller.py:285 - Heartbeat (response id = 1157) with server is running...
      INFO 2016-11-10 17:57:33,233 NetUtil.py:62 - Connecting to https://192.168.64.1:8440/connection_info
      INFO 2016-11-10 17:57:33,300 security.py:100 - SSL Connect being called.. connecting to the server
      INFO 2016-11-10 17:57:33,366 security.py:61 - SSL connection established. Two-way SSL authentication is turned off on the server.
      ERROR 2016-11-10 17:57:33,368 Controller.py:349 - Error in responseId sequence - restarting
      INFO 2016-11-10 17:57:33,369 ExitHelper.py:53 - Performing cleanup before exiting...
      INFO 2016-11-10 17:57:33,369 threadpool.py:112 - Shutting down thread pool
      INFO 2016-11-10 17:57:33,409 scheduler.py:607 - Scheduler has been shut down
      INFO 2016-11-10 17:57:33,409 threadpool.py:52 - Started thread pool with 3 core threads and 20 maximum threads
      INFO 2016-11-10 17:57:33,410 AlertSchedulerHandler.py:166 - [AlertScheduler] Stopped the alert scheduler.
      INFO 2016-11-10 17:57:33,410 threadpool.py:112 - Shutting down thread pool
      INFO 2016-11-10 17:57:33,410 ExitHelper.py:67 - Cleanup finished, exiting with code:77
      INFO 2016-11-10 17:57:33,544 main.py:96 - loglevel=logging.INFO
      INFO 2016-11-10 17:57:33,544 main.py:96 - loglevel=logging.INFO
      INFO 2016-11-10 17:57:33,544 main.py:96 - loglevel=logging.INFO
      INFO 2016-11-10 17:57:33,545 DataCleaner.py:39 - Data cleanup thread started
      INFO 2016-11-10 17:57:33,547 DataCleaner.py:120 - Data cleanup started
      INFO 2016-11-10 17:57:33,548 DataCleaner.py:122 - Data cleanup finished
      ERROR 2016-11-10 17:57:33,573 main.py:377 - Failed to start ping port listener of: Could not open port 8670 because port already used by another process:
      UID        PID  PPID  C STIME TTY          TIME CMD
      root      4750     1  0 17:34 pts/0    00:00:00 /usr/bin/python /usr/lib/python2
      
      INFO 2016-11-10 17:57:33,574 PingPortListener.py:61 - Ping port listener killed
      INFO 2016-11-10 17:57:33,574 ExitHelper.py:53 - Performing cleanup before exiting...
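
The error at main.py:377 is what one would expect if a listening socket on port 8670 is still open when the restarted agent tries to bind it. The following is a minimal sketch of that symptom, not Ambari code: `open_listener` and `can_bind` are hypothetical helpers, the port is an arbitrary free one rather than 8670, and the snippet assumes Python 3 (the agent in the log runs Python 2).

```python
import errno
import socket


def open_listener(port, host="127.0.0.1"):
    """Bind and listen on host:port; return the listening socket."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
    except OSError:
        sock.close()
        raise
    return sock


def can_bind(port, host="127.0.0.1"):
    """Return True if a new listener can be opened on host:port."""
    try:
        probe = open_listener(port, host)
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return False  # another socket still holds the port
        raise
    probe.close()
    return True


# Hypothetical stand-in for the old agent's ping port listener.
# Port 0 asks the OS for any free port.
old_listener = open_listener(0)
port = old_listener.getsockname()[1]

held_while_open = can_bind(port)   # old listener is still open
old_listener.close()               # explicit release during cleanup
free_after_close = can_bind(port)  # the port can be bound again

print(held_while_open, free_after_close)
```

Under these assumptions, a second bind is refused while the first listener is open and succeeds once it has been closed, which suggests closing the ping port listener as part of the cleanup that runs before the exit with code 77.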
      

          People

            Assignee: Dmytro Sen (dsen)
            Reporter: Dmytro Sen (dsen)