Hadoop YARN / YARN-522

[Umbrella] Better reporting for crashed/Killed AMs and Containers

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Crashing AMs have been a real pain for users since the beginning. There are already a few tickets floating around; filing this one to consolidate them.

          Issue Links

          1. On container failure, include logs in diagnostics (Sub-task, Resolved, Sandy Ryza)
          2. Container localization failures aren't reported from NM to RM (Sub-task, Closed, Jian He)
          3. In YarnClient, pull AM logs on AM container failure (Sub-task, Open, Li Lu)
          4. Failures in container launches due to issues like disk failure are difficult to diagnose (Sub-task, Resolved, Unassigned)
          5. Difficult to diagnose a failed container launch when error due to invalid environment variable (Sub-task, Closed, Jian He)
          6. Localization failures should be available in container diagnostics (Sub-task, Resolved, Vinod Kumar Vavilapalli)
          7. Improve exception information on AM launch crashes (Sub-task, Closed, Li Lu)
          8. The diagnostics is always the ExitCodeException stack when the container crashes (Sub-task, Closed, Tsuyoshi Ozawa)
          9. add more diags about app retry limits on AM failures (Sub-task, Resolved, Steve Loughran)
          10. Include command line, localization info and env vars on AM launch failure (Sub-task, Resolved, Unassigned)
          11. Log the command of launching containers (Sub-task, Resolved, Jeff Zhang)
          12. Add tail of stderr to diagnostics if container fails to launch or if container logs are empty (Sub-task, Resolved, Unassigned)
          13. LinuxContainerExecutor loses info when forwarding ResourceHandlerException (Sub-task, Resolved, Bibin Chundatt)
          14. add a way for an attempt to report an attempt failure (Sub-task, Open, Sunil G)
          15. create a yarn.troubleshooting log for logging app launching problems (Sub-task, Open, Unassigned)
          16. Add container launch related debug information to container logs when a container fails (Sub-task, Resolved, Varun Vasudev)
          17. Print enough container-executor logs for troubleshooting container launch failures (Sub-task, Open, Unassigned)
          18. Capture launch_container.sh logs to a separate log file (Sub-task, Resolved, Suma Shivaprasad)
          19. When an export var command fails in launch_container.sh, the full container launch should fail (Sub-task, Resolved, Sunil G)

              People

              • Assignee:
                Unassigned
                Reporter:
                vinodkv Vinod Kumar Vavilapalli
              • Votes:
                1
                Watchers:
                25
