Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Slider 0.91
    • Fix Version/s: Slider 0.92
    • Component/s: appmaster, client
    • Labels: None

    Description

      This is a sample JSON structure of the proposed diagnostics resource -

      {
        "finalStatus": "SUCCEEDED", 
        "finalMessage": "stop command issued", 
        "containers": [
          {
            "containerId": "container_e3374_1485226679409_0016_01_000004", 
            "component": "COMMAND_LOGGER", 
            "appVersion": "1.0.0", 
            "state": 3, 
            "exitCode": -1000, 
            "diagnostics": "", 
            "createTime": 1485285533968, 
            "startTime": 1485285533989, 
            "host": "cn008.l42scl.hortonworks.com", 
            "hostURL": "http://cn008.l42scl.hortonworks.com:8042", 
            "logLink": "http://cn007.l42scl.hortonworks.com:19888/jobhistory/logs/cn008.l42scl.hortonworks.com:45454/container_e3374_1485226679409_0016_01_000004/ctx/root"
          }, 
          {
            "containerId": "container_e3374_1485226679409_0016_01_000003", 
            "component": "COMMAND_LOGGER", 
            "appVersion": "1.0.0", 
            "state": 3, 
            "exitCode": -1000, 
            "diagnostics": "", 
            "createTime": 1485285120456, 
            "startTime": 1485285120723, 
            "host": "cn005.l42scl.hortonworks.com", 
            "hostURL": "http://cn005.l42scl.hortonworks.com:8042", 
            "logLink": "http://cn007.l42scl.hortonworks.com:19888/jobhistory/logs/cn005.l42scl.hortonworks.com:45454/container_e3374_1485226679409_0016_01_000003/ctx/root"
          }, 
          {
            "containerId": "container_e3374_1485226679409_0016_01_000002", 
            "component": "COMMAND_LOGGER", 
            "appVersion": "1.0.0", 
            "state": 4, 
            "exitCode": -100, 
            "diagnostics": "Container released by application", 
            "createTime": 1485285120464, 
            "startTime": 1485285120522, 
            "host": "cn008.l42scl.hortonworks.com", 
            "hostURL": "http://cn008.l42scl.hortonworks.com:8042", 
            "logLink": "http://cn007.l42scl.hortonworks.com:19888/jobhistory/logs/cn008.l42scl.hortonworks.com:45454/container_e3374_1485226679409_0016_01_000002/ctx/root"
          }
        ]
      }
      

      API consumers call the SliderClient#actionDiagnosticContainers API to obtain the ApplicationDiagnostics object. This object has 3 attributes -

      1. finalStatus - app-level status which is empty for a running app (of type org.apache.hadoop.yarn.api.records.FinalApplicationStatus)
      2. finalMessage - app-level summary message which is populated after the app dies
      3. containers - a set of all currently running and all previously failed containers (of type org.apache.slider.api.types.ContainerInformation)
        Note: the object also provides a helper method, getContainer(String containerId), which returns the ContainerInformation for a specific container when its container ID is known.
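
      The shape described above can be sketched with a small stand-in model. This is only an illustration of the three attributes and the lookup helper; the class names DiagnosticsShape and ContainerInfo are hypothetical and are not the Slider classes, and returning an Optional from the lookup is a choice made for this sketch, not something the ticket specifies.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the ApplicationDiagnostics shape; NOT the Slider class.
final class DiagnosticsShape {

    // Hypothetical stand-in for ContainerInformation, reduced to a few fields.
    static final class ContainerInfo {
        final String containerId;
        final String component;
        final int state;
        final int exitCode;

        ContainerInfo(String containerId, String component, int state, int exitCode) {
            this.containerId = containerId;
            this.component = component;
            this.state = state;
            this.exitCode = exitCode;
        }
    }

    String finalStatus = "";   // empty while the app is still running
    String finalMessage = "";  // populated after the app dies

    // Keyed by container ID so that a lookup by ID is a single map access.
    private final Map<String, ContainerInfo> containers = new LinkedHashMap<>();

    void addContainer(ContainerInfo info) {
        containers.put(info.containerId, info);
    }

    // Same idea as getContainer(String containerId): find one container by its ID.
    Optional<ContainerInfo> getContainer(String containerId) {
        return Optional.ofNullable(containers.get(containerId));
    }
}
```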

      ContainerInformation (one per running or dead container) contains several attributes, which are updated as the container transitions through its stages (newly created, running, dead, etc.). The attributes are -

      • containerId
      • component
      • appVersion
      • released (true/false)
      • state (of type org.apache.slider.api.StateValues)
      • exitCode (of type org.apache.hadoop.yarn.api.records.ContainerExitStatus)
      • diagnostics (container level diagnostics message)
      • createTime
      • startTime
      • host
      • hostURL
      • placement
      • output (empty; do not use)
      • logLink (container log link for a live as well as a dead container)
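
      Because state and exitCode appear as plain integers in the JSON, a small decoding helper can make the output easier to read. The sketch below is a hypothetical helper (ContainerCodes is not part of Slider). The numeric mappings are assumptions based on the constants in org.apache.slider.api.StateValues and org.apache.hadoop.yarn.api.records.ContainerExitStatus, cover only a subset, and should be checked against the versions in use.

```java
// Hypothetical helper; the mappings below are assumptions to verify against
// StateValues and ContainerExitStatus in the Slider/Hadoop versions in use.
final class ContainerCodes {

    private ContainerCodes() {
    }

    // Translates the integer "state" field into a readable name.
    static String describeState(int state) {
        switch (state) {
            case 0: return "INCOMPLETE";
            case 1: return "SUBMITTED";
            case 2: return "CREATED";
            case 3: return "LIVE";
            case 4: return "STOPPED";
            case 5: return "DESTROYED";
            default: return "UNKNOWN(" + state + ")";
        }
    }

    // Translates the integer "exitCode" field into a readable name.
    // Codes not listed here (including ordinary process exit codes) fall through.
    static String describeExitCode(int exitCode) {
        switch (exitCode) {
            case 0: return "SUCCESS";
            case -1000: return "INVALID";      // no exit status recorded yet
            case -100: return "ABORTED";       // e.g. released by the application
            case -101: return "DISKS_FAILED";
            case -102: return "PREEMPTED";
            default: return "EXIT(" + exitCode + ")";
        }
    }
}
```

      Under these assumptions, the sample entries with state 3 and exitCode -1000 read as live containers with no exit status yet, and the entry with state 4 and exitCode -100 reads as a stopped container that was aborted, which matches its "Container released by application" message.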

      For an app which is still RUNNING -

      The ApplicationDiagnostics object can be retrieved at any point in the app's lifetime by calling the SliderClient#actionDiagnosticContainers(ActionDiagnosticArgs diagnosticArgs) API with only the name field of ActionDiagnosticArgs set to the application name. On the command line, it can be retrieved by running the diagnostics command with the following arguments -

      slider diagnostics --name <app-name> --containers
      

      On the command line, the object is printed in JSON format.
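
      For tooling that cannot link against the Slider client, the command above can be driven from Java. The following is a minimal sketch, assuming the slider script is on the PATH; the class SliderDiagnosticsCli and its methods are hypothetical names introduced for illustration. It returns whatever the command writes to standard output and makes no attempt to parse it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper around the "slider diagnostics" command line.
final class SliderDiagnosticsCli {

    private SliderDiagnosticsCli() {
    }

    // Builds: slider diagnostics --name <app-name> --containers
    static List<String> command(String appName) {
        return Arrays.asList("slider", "diagnostics", "--name", appName, "--containers");
    }

    // Runs the command and returns its standard output as a string.
    // Assumes "slider" is resolvable on the PATH of the calling process.
    static String run(String appName) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(appName))
                .redirectError(ProcessBuilder.Redirect.INHERIT) // keep stderr out of the result
                .start();
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (InputStream in = process.getInputStream()) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        }
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("slider diagnostics exited with code " + exitCode);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```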

      For an app which is FAILED/KILLED -

      The ApplicationDiagnostics object is set as the YARN application diagnostics and can be retrieved through the YARN API or from the YARN application command line, for example -

      yarn application -status <application_id>
      

      Note: the ApplicationDiagnostics object (in JSON format) can also be viewed in the Diagnostics: field of the application's page in the RM UI.

      To retrieve it using the YARN Client API, call YarnClient#getApplicationReport(ApplicationId appId) to get the ApplicationReport, then call ApplicationReport#getDiagnostics to obtain the JSON string. This string can then be converted to the Slider ApplicationDiagnostics object by calling the static method ApplicationDiagnostics#fromJson(String json).
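
      Where the Slider classes are not on the classpath, the app-level fields can still be pulled out of the diagnostics string with a lightweight match. The sketch below is an assumption-laden fallback, not a replacement for ApplicationDiagnostics#fromJson: DiagnosticsFields is a hypothetical helper, it returns the first occurrence of the named string field, and it leaves JSON escape sequences undecoded.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical fallback for reading a string field without a JSON library.
final class DiagnosticsFields {

    private DiagnosticsFields() {
    }

    // Returns the raw value of the first "name": "value" pair found in the text.
    // Escaped quotes inside the value are skipped over, but not unescaped.
    static Optional<String> stringField(String json, String name) {
        Pattern pattern = Pattern.compile(
                "\"" + Pattern.quote(name) + "\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"");
        Matcher matcher = pattern.matcher(json);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }
}
```

      Applied to the sample structure at the top of this description, stringField(json, "finalStatus") would yield SUCCEEDED and stringField(json, "finalMessage") would yield the stop message. For per-container fields, the first match belongs to the first container only, so a real JSON parser or fromJson is the better route.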

      Attachments

        1. SLIDER-1187.001.patch
          67 kB
          Gour Saha
        2. SLIDER-1187.002.patch
          69 kB
          Gour Saha
        3. SLIDER-1187.003.patch
          70 kB
          Gour Saha
        4. SLIDER-1187.004.patch
          69 kB
          Gour Saha

          People

            gsaha Gour Saha
            gsaha Gour Saha