Ambari / AMBARI-21547

Race condition while calculating correlation_id

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0.0
    • Component/s: None
    • Labels: None

    Description

      INFO 2017-07-21 11:35:35,167 security.py:118 - Event to server at /heartbeat (correlation_id=146): {'id': 129}
      INFO 2017-07-21 11:35:35,168 transport.py:267 - Sending frame: 'SEND', headers={'content-length': 11, 'destination': '/heartbeat', 'correlationId': 146}

      INFO 2017-07-21 11:35:35,169 ActionQueue.py:313 - Quit retrying for command with taskId = 6382. Status: COMPLETED, retryAble: False, retryDuration (sec): -1, last delay (sec): 1
      INFO 2017-07-21 11:35:35,169 ActionQueue.py:328 - Command with taskId = 6382 completed successfully!
      INFO 2017-07-21 11:35:35,170 security.py:118 - Event to server at /reports/commands_status (correlation_id=147): {'clusters': {'2': [{'status': 'COMPLETED', 'stderr': 'None', 'stdout': "Before Any Hook\nBefore Install Hook\n2017-07-21 11:35:22,915 - Execute['cp -n /home/test_perf-0450/var/lib/ambari-agent/cache/stacks/PERF/1.0/hooks/before-INSTALL/scripts/conf-select.py /usr/bin/conf-select'] {'user': 'root'}\n2017-07-21 11:35:23,835 - Execute['chmod a+x /usr/bin/conf-select'] {'user': 'root'}\n2017-07-21 11:35:24,747 - Execute['cp -n /home/test_perf-0450/var/lib/ambari-agent/cache/stacks/PERF/1.0/hooks/before-INSTALL/scripts/distro-select.py /usr/bin/distro-select'] {'user': 'root'}\n2017-07-21 11:35:25,710 - Execute['chmod a+x /usr/bin/distro-select'] {'user': 'root'}\nInstall\nHost: test_perf-0450\nComponent: FAKEJOURNALNODE\nPid File: /var/run/test_perf-0450/FAKEJOURNALNODE.pid\nAfter Install Hook\n\nCommand completed successfully!\n", 'roleCommand': 'INSTALL', 'structuredOut': '{}', 'clusterId': '2', 'serviceName': 'FAKEHDFS', 'role': 'FAKEJOURNALNODE', 'actionId': '3-0', 'taskId': 6382, 'exitCode': 0}]}}
      INFO 2017-07-21 11:35:35,170 transport.py:267 - Sending frame: 'SEND', headers={'content-length': 1014, 'destination': '/reports/commands_status', 'correlationId': 147}

      INFO 2017-07-21 11:35:35,171 ActionQueue.py:224 - Executing command with id = 3-0, taskId = 6383 for role = FAKENFS_GATEWAY of cluster_id 2.
      INFO 2017-07-21 11:35:35,171 security.py:118 - Event to server at /reports/commands_status (correlation_id=148): {'clusters': {'2': [{'status': 'IN_PROGRESS', 'tmperr': '/home/test_perf-0450/var/lib/ambari-agent/data/errors-6383.txt', 'tmpout': '/home/test_perf-0450/var/lib/ambari-agent/data/output-6383.txt', 'roleCommand': 'INSTALL', 'structuredOut': '/home/test_perf-0450/var/lib/ambari-agent/data/structured-out-6383.json', 'clusterId': '2', 'serviceName': 'FAKEHDFS', 'role': 'FAKENFS_GATEWAY', 'actionId': '3-0', 'taskId': 6383}]}}
      INFO 2017-07-21 11:35:35,172 transport.py:267 - Sending frame: 'SEND', headers={'content-length': 425, 'destination': '/reports/commands_status', 'correlationId': 148}

      INFO 2017-07-21 11:35:35,172 ActionQueue.py:265 - Command execution metadata - taskId = 6383, retry enabled = False, max retry duration (sec) = 0, log_output = True
      INFO 2017-07-21 11:35:35,701 transport.py:187 - Received frame: 'MESSAGE', headers={'content-length': '10', 'destination': '/user/', 'message-id': '16feeef1-94472', 'content-type': 'application/json;charset=UTF-8', 'correlationId': '146', 'subscription': 'sub'}, len(body)=10
      INFO 2017-07-21 11:35:35,701 __init__.py:47 - Event from server at /user/ (correlation_id=146): {'id': 130}

      ERROR 2017-07-21 11:35:45,793 HeartbeatThread.py:91 - Exception in HeartbeatThread. Re-running the registration
      Traceback (most recent call last):
        File "/usr/lib/python2.6/site-packages/ambari_agent/HeartbeatThread.py", line 86, in run
          response = self.blocking_request(heartbeat_body, Constants.HEARTBEAT_ENDPOINT)
        File "/usr/lib/python2.6/site-packages/ambari_agent/HeartbeatThread.py", line 204, in blocking_request
          raise Exception("{0} seconds timeout expired waiting for response from server at {1} to message from {2}".format(timeout, Constants.SERVER_RESPONSES_TOPIC, destination))
      Exception: 10 seconds timeout expired waiting for response from server at /user/ to message from /heartbeat
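
The log shows the server's reply to the heartbeat arriving with correlationId 146 about half a second after the request, yet the heartbeat thread still times out ten seconds later. That pattern is consistent with the id being read and incremented without synchronization while other threads send at the same time (here the command status reports with ids 147 and 148), so the sender may end up waiting on a different id than the one placed in the frame. The Python sketch below illustrates one way to avoid this. It is not the attached patch, and `CorrelationIdSource`, `send_event`, and `transport_send` are hypothetical names introduced only for illustration.

```python
import threading


class CorrelationIdSource(object):
    """Hands out unique, increasing correlation ids to concurrent senders.

    Hypothetical helper for illustration; not part of ambari_agent.
    """

    def __init__(self, start=0):
        self._lock = threading.Lock()
        self._next_id = start

    def next_id(self):
        # Read and increment under a single lock so that two threads can
        # never receive the same id or see a half-updated counter.
        with self._lock:
            correlation_id = self._next_id
            self._next_id += 1
        return correlation_id


def send_event(id_source, destination, body, transport_send):
    # Take the id exactly once and reuse that local value for both the
    # frame header and the value the caller later waits on.
    correlation_id = id_source.next_id()
    transport_send(destination, body, {"correlationId": correlation_id})
    return correlation_id
```

The key design choice is that the caller never re-reads the shared counter after sending: the id in the frame header and the id it waits for come from the same local variable.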

      Attachments

        1. AMBARI-21547.patch (1 kB, Andrew Onischuk)

            People

              Assignee: Andrew Onischuk (aonishuk)
              Reporter: Andrew Onischuk (aonishuk)