YARN-10794: Submitting jobs to a single subcluster will fail while AMRMProxy is enabled

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 3.2.1
    • Fix Version/s: None
    • Component/s: nodemanager
    • Labels: None

    Description

      Sorry, I don't know how to link an issue, so here is the URL:

      https://issues.apache.org/jira/browse/YARN-9693

      That issue already raised this problem, but it seems that I still can't submit jobs through the federation client while using its patch.

      The root cause is that the NM sets a local AMRMToken for the AM when AMRMProxy is enabled, so the AM fails if it contacts the RM directly.

      This makes a rolling upgrade to federation impossible, because we can't upgrade all the NMs and clients at the same moment.

      So I developed another patch. With this patch I can submit jobs both ways.

      My solution is to hold both tokens at the same time and choose the right one while building the RPC client.
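
The idea of holding two tokens and picking one per endpoint can be sketched roughly as follows. This is a toy model in plain Java, not the code from the attached patches: `DualTokenSketch`, `Target`, `Token`, `TokenHolder`, `accepts` and `canConnect` are all hypothetical names, and the real AMRMToken handling in Hadoop is considerably more involved. The sketch only assumes that each server accepts a token it issued itself.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.Optional;

/** Toy model only: none of these types are Hadoop classes. */
final class DualTokenSketch {

    /** The endpoint an AM-side RPC client is being built for. */
    enum Target { AMRM_PROXY, RESOURCE_MANAGER }

    /** Hypothetical stand-in for an AMRMToken: an issuer plus an opaque secret. */
    static final class Token {
        final Target issuer;
        final String secret;

        Token(Target issuer, String secret) {
            this.issuer = issuer;
            this.secret = secret;
        }
    }

    /** Keeps at most one token per issuer, so both can be held at once. */
    static final class TokenHolder {
        private final Map<Target, Token> tokens = new EnumMap<>(Target.class);

        void put(Token token) {
            tokens.put(token.issuer, token);
        }

        /** Picks the token that matches the endpoint being contacted. */
        Optional<Token> select(Target target) {
            return Optional.ofNullable(tokens.get(target));
        }
    }

    /** Models a server that only accepts tokens it issued itself (assumption). */
    static boolean accepts(Target server, Token presented) {
        return presented.issuer == server;
    }

    /** Models building an RPC client: succeeds only if a matching token exists. */
    static boolean canConnect(TokenHolder holder, Target target) {
        return holder.select(target)
                     .map(token -> accepts(target, token))
                     .orElse(false);
    }

    public static void main(String[] args) {
        // Single-token case: only the proxy's local token is available.
        TokenHolder single = new TokenHolder();
        single.put(new Token(Target.AMRM_PROXY, "local-secret"));
        System.out.println(canConnect(single, Target.AMRM_PROXY));       // true
        System.out.println(canConnect(single, Target.RESOURCE_MANAGER)); // false

        // Dual-token case: both tokens are kept, one per endpoint.
        TokenHolder dual = new TokenHolder();
        dual.put(new Token(Target.AMRM_PROXY, "local-secret"));
        dual.put(new Token(Target.RESOURCE_MANAGER, "rm-secret"));
        System.out.println(canConnect(dual, Target.AMRM_PROXY));         // true
        System.out.println(canConnect(dual, Target.RESOURCE_MANAGER));   // true
    }
}
```

In this model the single-token holder reproduces the failure described above (the direct RM connection is rejected), while the dual-token holder lets the same AM reach either endpoint.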

      I tested this patch in several scenarios, such as AM recovery and NM recovery, and found no errors.

      Still, I can't be sure this patch is good, so I wonder whether there is a better solution.

       

      Attachments

        1. YARN-10794.v2.patch
          3 kB
          Song Jiacheng
        2. YARN-10794.v1.patch
          3 kB
          Song Jiacheng

          People

            Assignee: Unassigned
            Reporter: Song Jiacheng
            Votes: 0
            Watchers: 2
