  1. Hadoop YARN
  2. YARN-522

[Umbrella] Better reporting for crashed/Killed AMs and Containers


    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      Crashing AMs have been a real pain for users since the beginning. There are already a few tickets floating around; filing this to consolidate them.

        Issue Links

        1. On container failure, include logs in diagnostics (Sub-task; Resolved; Sandy Ryza)
        2. Container localization failures aren't reported from NM to RM (Sub-task; Closed; Jian He)
        3. In YarnClient, pull AM logs on AM container failure (Sub-task; Open; Li Lu)
        4. Failures in container launches due to issues like disk failure are difficult to diagnose (Sub-task; Resolved; Unassigned)
        5. Difficult to diagnose a failed container launch when error due to invalid environment variable (Sub-task; Closed; Jian He)
        6. Localization failures should be available in container diagnostics (Sub-task; Resolved; Vinod Kumar Vavilapalli)
        7. Improve exception information on AM launch crashes (Sub-task; Closed; Li Lu)
        8. The diagnostics is always the ExitCodeException stack when the container crashes (Sub-task; Closed; Tsuyoshi Ozawa)
        9. add more diags about app retry limits on AM failures (Sub-task; Resolved; Steve Loughran)
        10. Include command line, localization info and env vars on AM launch failure (Sub-task; Resolved; Unassigned)
        11. Log the command of launching containers (Sub-task; Resolved; Jeff Zhang)
        12. Add tail of stderr to diagnostics if container fails to launch or if container logs are empty (Sub-task; Resolved; Unassigned)
        13. LinuxContainerExecutor loses info when forwarding ResourceHandlerException (Sub-task; Resolved; Bibin Chundatt)
        14. add a way for an attempt to report an attempt failure (Sub-task; Open; Sunil G)
        15. create a yarn.troubleshooting log for logging app launching problems (Sub-task; Open; Unassigned)
        16. Add container launch related debug information to container logs when a container fails (Sub-task; Resolved; Varun Vasudev)
        17. Print enough container-executor logs for troubleshooting container launch failures (Sub-task; Open; Unassigned)
        18. Capture launch_container.sh logs to a separate log file (Sub-task; Resolved; Suma Shivaprasad)
        19. When an export var command fails in launch_container.sh, the full container launch should fail (Sub-task; Resolved; Sunil G)

            People

            • Assignee: Unassigned
            • Reporter: Vinod Kumar Vavilapalli (vinodkv)
