Hadoop YARN / YARN-1630

Introduce timeout for async polling operations in YarnClientImpl


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.3.0
    • Component/s: client
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      I ran an MR2 application that would have been long-running and killed it programmatically using a YarnClient. The app was killed, but the client hung forever. The message that spammed the logs was "Waiting for application application_1389036507624_0018 to be killed."

      The RM log indicated that the app had indeed transitioned from RUNNING to KILLED, but for some reason subsequent responses to the kill-application RPC did not indicate that the app had been terminated.

      I tracked this down to YarnClientImpl.java, and though I was unable to reproduce the bug, I wrote a patch to introduce a bound on the number of times that YarnClientImpl retries the RPC before giving up.
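The idea of bounding the retries can be sketched roughly as follows. This is only an illustration of the technique and not the attached patch: the class `BoundedPoll`, the method `pollUntil`, and its parameters are hypothetical names, and the `BooleanSupplier` stands in for whatever check the client performs on the kill RPC response.

```java
import java.util.function.BooleanSupplier;

public class BoundedPoll {
    /**
     * Polls isDone at most maxAttempts times, sleeping pollIntervalMs
     * between attempts. Returns true if the condition was observed and
     * false if the bound was reached or the thread was interrupted.
     * (Hypothetical helper, not part of the YARN API.)
     */
    public static boolean pollUntil(BooleanSupplier isDone, int maxAttempts, long pollIntervalMs) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (isDone.getAsBoolean()) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pollIntervalMs);
                } catch (InterruptedException e) {
                    // Preserve the interrupt status and stop waiting.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Stand-in for a response that reports termination on the third poll.
        int[] calls = {0};
        boolean killed = pollUntil(() -> ++calls[0] >= 3, 10, 10);
        System.out.println("killed=" + killed + ", polls=" + calls[0]);

        // Stand-in for a response that never reports termination:
        // the loop gives up after five polls instead of hanging.
        boolean neverKilled = pollUntil(() -> false, 5, 10);
        System.out.println("neverKilled=" + neverKilled);
    }
}
```

With a bound like this, a caller can distinguish "the app was reported as killed" from "gave up waiting" and decide whether to log, retry, or raise an error, rather than looping indefinitely.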

      Attachments

        1. diff-1.txt
          7 kB
          Aditya Acharya
        2. diff.txt
          5 kB
          Aditya Acharya


          People

            Assignee: Aditya Acharya
            Reporter: Aditya Acharya
            Votes: 0
            Watchers: 5
