Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.1.0
    • Component/s: None
    • Labels:
      None

      Description

      Agent crashes on a heartbeat response for installing a role

        Activity

        Devaraj Das added a comment -

        I am not sure if that's the right solution. If the 'user' is expected and that's missing, we should fix it in the controller.

        Eric Yang added a comment -

        The run action is missing the required "user" parameter. I will add code to safeguard the agent against crashing on an invalid action.
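
A minimal sketch of what such a safeguard could look like, not the actual patch. It assumes an action is a plain dict dispatched on its 'kind' key, as the traceback in this issue suggests. The kind value "run", the REQUIRED_KEYS table, the function names, and the shape of the error result are all hypothetical and only serve the illustration.

```python
# Hypothetical sketch: validate an action before dispatching it, so that a
# malformed action yields an error result instead of killing the worker thread.

# Assumed table of required parameters per action kind (not from the source).
REQUIRED_KEYS = {"run": ("user",)}


def run_action(action):
    # Stand-in for the agent's real runAction; it only echoes the user here.
    return {"exitCode": 0, "user": action["user"]}


def unknown_action(action):
    # Fallback for kinds that have no registered handler.
    return {"exitCode": 1, "error": "unknown action kind: %r" % action.get("kind")}


def dispatch(action):
    switches = {"run": run_action}
    kind = action.get("kind")

    # Reject the action up front if a required parameter is missing.
    missing = [key for key in REQUIRED_KEYS.get(kind, ()) if key not in action]
    if missing:
        return {"exitCode": 1,
                "error": "invalid action, missing: %s" % ", ".join(missing)}

    try:
        return switches.get(kind, unknown_action)(action)
    except KeyError as exc:
        # Last line of defence for keys the table above does not cover.
        return {"exitCode": 1, "error": "invalid action, missing key: %s" % exc}
```

With this guard, dispatch({"kind": "run"}) returns an error result that can be reported back to the controller, which would still need its own fix so that it sends 'user' in the first place.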

        Devaraj Das added a comment -

        Steps to reproduce (on a single node):
        0. Run the controller
        1. Create a cluster: ./bin/ambari client cluster create -name ddas1 -stack hadoop-security -goalstate ACTIVE -nodes <node1> -role namenode=<node1> -role datanode=<node1> -services hdfs
        2. Start the agent. Crashes:
        2011-10-30 23:51:10,611 Controller.py:77 - {"hardwareProfile": {"ramSize": 0, "netSpeed": 54, "coreCount": 8, "diskCount": 0, "cpuSpeed": 0, "cpuFlag": ""}, "timestamp": 1320043870610, "idle": false, "hostname": "<node1>", "responseId": 3}
        Exception in thread Thread-2:
        Traceback (most recent call last):
          File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/threading.py", line 522, in __bootstrap_inner
            self.run()
          File "/Users/ddas/workspace/hms-trunk/agent/src/main/python/ambari_agent/ActionQueue.py", line 73, in run
            result = switches.get(action['kind'], self.unknownAction)(action)
          File "/Users/ddas/workspace/hms-trunk/agent/src/main/python/ambari_agent/ActionQueue.py", line 129, in runAction
            action['user'],
        KeyError: 'user'


          People

          • Assignee:
            Eric Yang
            Reporter:
            Devaraj Das
          • Votes:
            0
            Watchers:
            0

            Dates

            • Created:
              Updated:
              Resolved:
