  1. Hadoop YARN
  2. YARN-7835

[Atsv2] Race condition in NM while publishing events if second attempt is launched on the same node


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.1.0, 2.10.0, 3.0.4
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      A race condition has been observed: if the master container is killed for some reason and relaunched on the same node, NMTimelinePublisher does not add a timelineClient for the new attempt. When the completed-container event for the first attempt later arrives, NMTimelinePublisher removes the existing timelineClient.
      As a result, all subsequent event publishing from other clients fails with the exception "Application is not found".
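
The sequence described above can be replayed with a small, self-contained Java sketch. It is not the NodeManager code and not the attached patch: the class names, method names, and the guard by owning attempt are all hypothetical, and the sketch assumes that the client is not added for the second attempt because one is already registered for the application. It only shows why a late completion event from attempt 1 can strand attempt 2, and one possible way to ignore such stale events.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of the race; not the real NMTimelinePublisher. */
class TimelineClientRaceSketch {

    /** Naive registry: one client per application, removed on any master-container completion. */
    static class NaivePublisher {
        private final Map<String, String> appToClient = new HashMap<>();

        void masterContainerStarted(String appId, String attemptId) {
            // Assumption: a second attempt on the same node finds an existing
            // client for the application, so nothing new is added.
            appToClient.putIfAbsent(appId, "client-for-" + attemptId);
        }

        void masterContainerCompleted(String appId, String attemptId) {
            // The attempt is not checked, so a late event from attempt 1
            // removes the client that attempt 2 still needs.
            appToClient.remove(appId);
        }

        boolean canPublish(String appId) {
            return appToClient.containsKey(appId);
        }
    }

    /** Guarded registry: remembers the newest attempt and ignores stale completions. */
    static class GuardedPublisher {
        private final Map<String, String> appToClient = new HashMap<>();
        private final Map<String, String> appToOwningAttempt = new HashMap<>();

        synchronized void masterContainerStarted(String appId, String attemptId) {
            appToClient.putIfAbsent(appId, "client-for-" + appId);
            appToOwningAttempt.put(appId, attemptId); // newest attempt owns the client
        }

        synchronized void masterContainerCompleted(String appId, String attemptId) {
            // Remove the client only if the completing attempt still owns it.
            if (attemptId.equals(appToOwningAttempt.get(appId))) {
                appToClient.remove(appId);
                appToOwningAttempt.remove(appId);
            }
        }

        synchronized boolean canPublish(String appId) {
            return appToClient.containsKey(appId);
        }
    }

    /** Replays: attempt 1 starts, attempt 2 starts, attempt 1 completion arrives late. */
    static boolean replayNaive() {
        NaivePublisher p = new NaivePublisher();
        p.masterContainerStarted("app_1", "attempt_1");
        p.masterContainerStarted("app_1", "attempt_2");
        p.masterContainerCompleted("app_1", "attempt_1");
        return p.canPublish("app_1");
    }

    static boolean replayGuarded() {
        GuardedPublisher p = new GuardedPublisher();
        p.masterContainerStarted("app_1", "attempt_1");
        p.masterContainerStarted("app_1", "attempt_2");
        p.masterContainerCompleted("app_1", "attempt_1");
        return p.canPublish("app_1");
    }

    public static void main(String[] args) {
        System.out.println("naive can publish:   " + replayNaive());   // false
        System.out.println("guarded can publish: " + replayGuarded()); // true
    }
}
```

In the naive variant the application ends up with no client after the stale completion, which corresponds to the "Application is not found" failure reported here. The guarded variant is only one way to avoid that; the actual fix is in the attached patches.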

      Attachments

        1. YARN-7835.001.patch
          9 kB
          Rohith Sharma K S
        2. YARN-7835.002.patch
          10 kB
          Rohith Sharma K S
        3. YARN-7835.003.patch
          11 kB
          Rohith Sharma K S
        4. YARN-7835.004.patch
          11 kB
          Rohith Sharma K S


          People

            Assignee: Rohith Sharma K S (rohithsharma)
            Reporter: Rohith Sharma K S (rohithsharma)
            Votes: 0
            Watchers: 5

