Samza / SAMZA-1116

Yarn RM recovery causing duplicate containers

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.11.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Description

      To reproduce:

      1. Make sure that YARN RM recovery is enabled
      2. Deploy a test job
      3. Terminate the YARN RM
      4. Wait until the job's AM terminates with:
        2017-02-02 13:08:04 RetryInvocationHandler [WARN] Exception while invoking class org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.finishApplicationMaster over rm2. Not retrying because failovers (30) exceeded maximum allowed (30)
        
      5. Restart the RM

      The job gets a new attempt, but the old containers are not terminated, so duplicate containers keep running.
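
      One rough way to confirm the duplicates is to compare the attempt number embedded in the IDs of the containers that are still running. The sketch below is only an illustration, not part of Samza or YARN: the helper names and the sample IDs are made up, and it assumes the list of running container IDs has been collected separately (for example from the NodeManagers). It relies on the usual YARN container ID layout, container_[e<epoch>_]<clusterTimestamp>_<appNumber>_<attempt>_<containerNumber>.

```python
import re
from collections import defaultdict

# Usual YARN container ID layout; the "e<epoch>_" part only appears after an RM restart.
CONTAINER_ID = re.compile(
    r"^container_(?:e(?P<epoch>\d+)_)?"
    r"(?P<cluster_ts>\d+)_(?P<app>\d+)_(?P<attempt>\d+)_(?P<seq>\d+)$"
)


def attempts_with_running_containers(container_ids):
    """Map each application ID to the sorted attempt numbers that still own containers."""
    attempts = defaultdict(set)
    for cid in container_ids:
        match = CONTAINER_ID.match(cid.strip())
        if match is None:
            continue  # ignore anything that is not a container ID
        app_id = "application_{}_{}".format(match.group("cluster_ts"), match.group("app"))
        attempts[app_id].add(int(match.group("attempt")))
    return {app: sorted(nums) for app, nums in attempts.items()}


def apps_with_duplicates(container_ids):
    """Keep only applications whose containers span more than one attempt."""
    per_app = attempts_with_running_containers(container_ids)
    return {app: nums for app, nums in per_app.items() if len(nums) > 1}


if __name__ == "__main__":
    # Hypothetical IDs, for illustration only.
    running = [
        "container_1486040000000_0001_01_000002",      # left over from attempt 1
        "container_e02_1486040000000_0001_02_000001",  # started by attempt 2
        "container_1486040000000_0002_01_000001",      # unrelated job, single attempt
    ]
    for app, nums in apps_with_duplicates(running).items():
        print(app, "has containers from attempts", nums)
```

      An application that shows up in the result has containers from more than one attempt alive at the same time, which matches the behaviour described here.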

            People

              Assignee: Unassigned
              Reporter: Danil Serdyuchenko
              Votes: 1
              Watchers: 2
