Uploaded image for project: 'Hadoop YARN'
  1. Hadoop YARN
  2. YARN-7684

The Total Memory and VCores display in the yarn UI is not correct with labeled node

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Won't Fix
    • 2.7.3
    • 2.8.0
    • webapp
    • None

    Description

      The Total Memory and VCores display in the yarn UI is not correct with labeled node

      recreate steps:
      1. should have a hadoop cluster
      2. enabled the yarn Node Labels feature
      3. create a label eg: yarn rmadmin -addToClusterNodeLabels "test"
      4. add a node into the label eg: yarn rmadmin -replaceLabelsOnNode "zhaoyim02.com=test"
      5. then go to the yarn UI http://<hostname>:8088/cluster/nodes

      Attachments

        1. TestResultsAfterPortingYARN-4484.pdf
          545 kB
          Zhao Yi Ming
        2. yarn_issue.pdf
          183 kB
          Zhao Yi Ming
        3. YARN-7684-branch-2.7.3.001.patch
          1 kB
          Zhao Yi Ming

        Issue Links

          Activity

            zhaoyim Zhao Yi Ming added a comment -

            I already fix this issue, I need to assign this issue to myself.

            zhaoyim Zhao Yi Ming added a comment - I already fix this issue, I need to assign this issue to myself.
            sunilg Sunil G added a comment -

            I added to you as a contributor.

            sunilg Sunil G added a comment - I added to you as a contributor.
            zhaoyim Zhao Yi Ming added a comment -

            @Sunil G Thanks! This is my first try to contribute to hadoop community, once I submit the patch, I will add you as the reviewer.

            zhaoyim Zhao Yi Ming added a comment - @Sunil G Thanks! This is my first try to contribute to hadoop community, once I submit the patch, I will add you as the reviewer.
            githubbot ASF GitHub Bot added a comment -

            GitHub user zhaoyim opened a pull request:

            https://github.com/apache/hadoop/pull/320

            YARN-7684. Fix The Total Memory and VCores display in the yarn UI is …

            The Total Memory and VCores display in the yarn UI is not correct with labeled node

            Use the cluster resource memory and Vcores info instead of the root queue metrics, availableMB + allocatedMB. and availableVirtualCores + allocatedVirtualCores.

            You can merge this pull request into a Git repository by running:

            $ git pull https://github.com/zhaoyim/hadoop branch-2.7.3

            Alternatively you can review and apply these changes as the patch at:

            https://github.com/apache/hadoop/pull/320.patch

            To close this pull request, make a commit to your master/trunk branch
            with (at least) the following in the commit message:

            This closes #320


            commit fcec8fd9bde074dda399d56dd426d4fd9aa9c55f
            Author: zhaoyim <yyyiming@...>
            Date: 2017-12-28T15:53:22Z

            YARN-7684. Fix The Total Memory and VCores display in the yarn UI is not correct with labeled node


            githubbot ASF GitHub Bot added a comment - GitHub user zhaoyim opened a pull request: https://github.com/apache/hadoop/pull/320 YARN-7684 . Fix The Total Memory and VCores display in the yarn UI is … The Total Memory and VCores display in the yarn UI is not correct with labeled node Use the cluster resource memory and Vcores info instead of the root queue metrics, availableMB + allocatedMB. and availableVirtualCores + allocatedVirtualCores. You can merge this pull request into a Git repository by running: $ git pull https://github.com/zhaoyim/hadoop branch-2.7.3 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hadoop/pull/320.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #320 commit fcec8fd9bde074dda399d56dd426d4fd9aa9c55f Author: zhaoyim <yyyiming@...> Date: 2017-12-28T15:53:22Z YARN-7684 . Fix The Total Memory and VCores display in the yarn UI is not correct with labeled node
            zhaoyim Zhao Yi Ming added a comment -

            sunilg I create a pr for the fix. Could you help review the code? Thanks!

            Also I saw this problem also happened on version > 2.7.3. Will I need to porting this fix into other upstreams? Thanks!

            zhaoyim Zhao Yi Ming added a comment - sunilg I create a pr for the fix. Could you help review the code? Thanks! Also I saw this problem also happened on version > 2.7.3. Will I need to porting this fix into other upstreams? Thanks!
            genericqa genericqa added a comment -
            -1 overall



            Vote Subsystem Runtime Comment
            0 reexec 0m 0s Docker mode activated.
            -1 docker 7m 50s Docker failed to build yetus/hadoop:c420dfe.



            Subsystem Report/Notes
            JIRA Issue YARN-7684
            GITHUB PR https://github.com/apache/hadoop/pull/320
            Console output https://builds.apache.org/job/PreCommit-YARN-Build/19054/console
            Powered by Apache Yetus 0.7.0-SNAPSHOT http://yetus.apache.org

            This message was automatically generated.

            genericqa genericqa added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 0s Docker mode activated. -1 docker 7m 50s Docker failed to build yetus/hadoop:c420dfe. Subsystem Report/Notes JIRA Issue YARN-7684 GITHUB PR https://github.com/apache/hadoop/pull/320 Console output https://builds.apache.org/job/PreCommit-YARN-Build/19054/console Powered by Apache Yetus 0.7.0-SNAPSHOT http://yetus.apache.org This message was automatically generated.
            sunilg Sunil G added a comment -

            Yes. Using cluster resource might be solving this temporary.
            However we have partitions and hence cluster resource is not correct when we view total resource per label.

            {{ this.totalMB = availableMB + allocatedMB;}}
            For root queue, ideally we should be able to sum up to parent root level. For label, I think we have not correctly calculating queue metrics. YARN-4484 has handled this case, however if its not working now, we need to see how it got broken.

            sunilg Sunil G added a comment - Yes. Using cluster resource might be solving this temporary. However we have partitions and hence cluster resource is not correct when we view total resource per label. {{ this.totalMB = availableMB + allocatedMB;}} For root queue, ideally we should be able to sum up to parent root level. For label, I think we have not correctly calculating queue metrics. YARN-4484 has handled this case, however if its not working now, we need to see how it got broken.
            zhaoyim Zhao Yi Ming added a comment -

            sunilg
            Thanks for your update! You are correct, when we view total resource per label, the cluster resource is not correct.

            And please ignore my comment about the issue happened on version > 2.7.3. This is my mistake to understand. Sorry for the wrong comments!

            I noticed the YARN-4484 fix target release is 2.8.0, we did not try the 2.8.0 release, we just found the problem on 2.7.1 and 2.7.3. So next we will try your fix on 2.8.0. Any results I will update here. Thanks!

            zhaoyim Zhao Yi Ming added a comment - sunilg Thanks for your update! You are correct, when we view total resource per label, the cluster resource is not correct. And please ignore my comment about the issue happened on version > 2.7.3. This is my mistake to understand. Sorry for the wrong comments! I noticed the YARN-4484 fix target release is 2.8.0, we did not try the 2.8.0 release, we just found the problem on 2.7.1 and 2.7.3. So next we will try your fix on 2.8.0. Any results I will update here. Thanks!
            zhaoyim Zhao Yi Ming added a comment -

            sunilg
            I porting the YARN-4484 patch back to 2.7.3 and test the fix in my 2.7.3 cluster, The total mem and cpus are show correct. And after submit jobs, they are still show correct. But in my understand view total resource per label is not correct. I paste my test results TestResultsAfterPortingYARN-4484.pdf in the attachment. Could you have a look? Thanks a lot!

            zhaoyim Zhao Yi Ming added a comment - sunilg I porting the YARN-4484 patch back to 2.7.3 and test the fix in my 2.7.3 cluster, The total mem and cpus are show correct. And after submit jobs, they are still show correct. But in my understand view total resource per label is not correct. I paste my test results TestResultsAfterPortingYARN-4484.pdf in the attachment. Could you have a look? Thanks a lot!
            sunilg Sunil G added a comment -

            I will say that clicking on NodeLabels link on nodes count wont redirect to show per label resource. Its not designed in that way. We were planning to show total cluster resource.
            And in Node Labels page, we can see per label whats the total resource on last column.

            sunilg Sunil G added a comment - I will say that clicking on NodeLabels link on nodes count wont redirect to show per label resource. Its not designed in that way. We were planning to show total cluster resource. And in Node Labels page, we can see per label whats the total resource on last column.
            zhaoyim Zhao Yi Ming added a comment -

            sunilg Thanks for clarify! If so, I will close this issue. Thanks again for you time!

            zhaoyim Zhao Yi Ming added a comment - sunilg Thanks for clarify! If so, I will close this issue. Thanks again for you time!
            githubbot ASF GitHub Bot added a comment -

            Github user zhaoyim closed the pull request at:

            https://github.com/apache/hadoop/pull/320

            githubbot ASF GitHub Bot added a comment - Github user zhaoyim closed the pull request at: https://github.com/apache/hadoop/pull/320
            sunilg Sunil G added a comment -

            Thanks zhaoyim I re-opened to close this as Wont fix. and you could provide 2.7 patch in YARN-4484 to include in next release.

            sunilg Sunil G added a comment - Thanks zhaoyim I re-opened to close this as Wont fix. and you could provide 2.7 patch in YARN-4484 to include in next release.
            sunilg Sunil G added a comment -

            This is already covered in 2.8+ version. YARN-4484 could be backported to 2.7 as needed.

            sunilg Sunil G added a comment - This is already covered in 2.8+ version. YARN-4484 could be backported to 2.7 as needed.
            zhaoyim Zhao Yi Ming added a comment -

            sunilg Sorry for trouble you again! You said I can create a 2.7 patch include the YARN-4484 fix, but I am not sure how to do it. Because in the HowToContribute doc I saw, it said "If you are unsure of the target branch then trunk is usually the best choice. Committers will usually merge the patch to downstream branches." I guess the YARN-4484 fix already include in the trunk branch, so only the committers can merge the patch to 2.7 branch, right? Thanks!

            zhaoyim Zhao Yi Ming added a comment - sunilg Sorry for trouble you again! You said I can create a 2.7 patch include the YARN-4484 fix, but I am not sure how to do it. Because in the HowToContribute doc I saw, it said "If you are unsure of the target branch then trunk is usually the best choice. Committers will usually merge the patch to downstream branches." I guess the YARN-4484 fix already include in the trunk branch, so only the committers can merge the patch to 2.7 branch, right? Thanks!

            People

              zhaoyim Zhao Yi Ming
              zhaoyim Zhao Yi Ming
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: