Hadoop YARN / YARN-4768

getAvailablePhysicalMemorySize can be inaccurate on linux

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.2, 3.0.0-alpha1
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: nodemanager
    • Labels: None
    • Environment: Linux

    Description

      The algorithm currently computes available memory as "MemFree" + "Inactive" from /proc/meminfo.

      "Inactive" may not be a very good indication of how much memory can be readily freed because it contains both:

      • Pages mapped with MAP_SHARED|MAP_ANONYMOUS, regardless of whether they are being actively accessed (it is unclear to me why this is the case)
      • Pages mapped MAP_PRIVATE|MAP_ANONYMOUS that have not been accessed recently

      Both of these types of pages probably shouldn't be considered "Available".

      "Inactive(file)" would seem more accurate, but it is not available in all kernel versions. To keep things simple, maybe just use "Inactive(file)" if it is available and otherwise fall back to "Inactive".

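The proposed fallback can be sketched as follows. This is a minimal illustration under stated assumptions, not the attached patch: the class name MemInfoAvailableSketch and its parse and availableKb helpers are hypothetical, and the sketch works on meminfo-style text passed in as a string instead of reading /proc/meminfo itself.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; names are illustrative and not taken from SysInfoLinux.
class MemInfoAvailableSketch {
  // Matches lines such as "Inactive(file):   123456 kB"; the unit is optional.
  private static final Pattern LINE =
      Pattern.compile("^([A-Za-z0-9_()]+):\\s*(\\d+)\\s*(kB)?\\s*$");

  /** Parses meminfo-style text into a map of field name to numeric value. */
  static Map<String, Long> parse(String meminfoText) {
    Map<String, Long> fields = new HashMap<>();
    for (String line : meminfoText.split("\n")) {
      Matcher m = LINE.matcher(line.trim());
      if (m.matches()) {
        fields.put(m.group(1), Long.parseLong(m.group(2)));
      }
    }
    return fields;
  }

  /** Returns MemFree plus Inactive(file), or MemFree plus Inactive on older kernels. */
  static long availableKb(Map<String, Long> fields) {
    long memFree = fields.getOrDefault("MemFree", 0L);
    Long inactiveFile = fields.get("Inactive(file)");
    // Prefer the file-backed inactive pages; fall back when the field is absent.
    long reclaimable =
        inactiveFile != null ? inactiveFile : fields.getOrDefault("Inactive", 0L);
    return memFree + reclaimable;
  }

  public static void main(String[] args) {
    String sample = "MemFree: 1000 kB\nInactive: 700 kB\nInactive(file): 300 kB\n";
    System.out.println(availableKb(parse(sample)) + " kB available");
  }
}
```

With both fields present, only "Inactive(file)" is added to "MemFree", so anonymous inactive pages no longer inflate the estimate. When the field is missing, the result matches the old behavior.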
      Attachments

        1. YARN-4768.patch
          9 kB
          Nathan Roberts

        Issue Links

          Activity

            nroberts Nathan Roberts added a comment -

            Patch for trunk.

            Also changed getPhysicalMemorySize() to exclude:

            • HardwareCorrupted pages: these are not that uncommon.
            • HugePagesTotal * hugePageSize: huge pages are probably not commonly configured on compute nodes, but just in case, it seems reasonable not to count them.

            Comments welcome on alternative ways to approach these.
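
The adjusted total can be sketched as simple arithmetic over /proc/meminfo values. This is an illustration only and not the code in the patch: the class PhysicalMemorySketch and the method usablePhysicalKb are hypothetical names, and the caller is assumed to have already read MemTotal, HardwareCorrupted, HugePages_Total, and Hugepagesize from /proc/meminfo.

```java
// Hypothetical sketch; names are illustrative and not taken from SysInfoLinux.
class PhysicalMemorySketch {
  /**
   * Returns the physical memory (in kB) left after excluding corrupted pages
   * and memory reserved for huge pages.
   *
   * @param memTotalKb          MemTotal, in kB
   * @param hardwareCorruptedKb HardwareCorrupted, in kB
   * @param hugePagesTotal      HugePages_Total, a page count with no unit
   * @param hugePageSizeKb      Hugepagesize, in kB
   */
  static long usablePhysicalKb(long memTotalKb, long hardwareCorruptedKb,
                               long hugePagesTotal, long hugePageSizeKb) {
    long hugePagesKb = hugePagesTotal * hugePageSizeKb;
    return memTotalKb - hardwareCorruptedKb - hugePagesKb;
  }

  public static void main(String[] args) {
    // Example inputs: 512 huge pages of 2048 kB each, plus 4 kB of corrupted memory.
    System.out.println(usablePhysicalKb(16384000L, 4L, 512L, 2048L) + " kB");
  }
}
```

HugePages_Total is a count of pages, so it has to be multiplied by the page size before it can be subtracted from a kB total.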

            hadoopqa Hadoop QA added a comment -
            -1 overall

            Vote  Subsystem   Runtime  Comment
            0     reexec      0m 12s   Docker mode activated.
            +1    @author     0m 0s    The patch does not contain any @author tags.
            +1    test4tests  0m 0s    The patch appears to include 1 new or modified test files.
            +1    mvninstall  6m 41s   trunk passed
            +1    compile     6m 1s    trunk passed with JDK v1.8.0_74
            +1    compile     6m 35s   trunk passed with JDK v1.7.0_95
            +1    checkstyle  0m 20s   trunk passed
            +1    mvnsite     0m 56s   trunk passed
            +1    mvneclipse  0m 14s   trunk passed
            +1    findbugs    1m 33s   trunk passed
            +1    javadoc     0m 49s   trunk passed with JDK v1.8.0_74
            +1    javadoc     1m 2s    trunk passed with JDK v1.7.0_95
            +1    mvninstall  0m 40s   the patch passed
            +1    compile     5m 32s   the patch passed with JDK v1.8.0_74
            +1    javac       5m 32s   the patch passed
            +1    compile     6m 29s   the patch passed with JDK v1.7.0_95
            +1    javac       6m 29s   the patch passed
            +1    checkstyle  0m 20s   the patch passed
            +1    mvnsite     0m 53s   the patch passed
            +1    mvneclipse  0m 13s   the patch passed
            +1    whitespace  0m 0s    Patch has no whitespace issues.
            +1    findbugs    1m 46s   the patch passed
            +1    javadoc     0m 50s   the patch passed with JDK v1.8.0_74
            +1    javadoc     1m 3s    the patch passed with JDK v1.7.0_95
            -1    unit        6m 26s   hadoop-common in the patch failed with JDK v1.8.0_74.
            +1    unit        6m 58s   hadoop-common in the patch passed with JDK v1.7.0_95.
            +1    asflicense  0m 23s   Patch does not generate ASF License warnings.
                              57m 2s



            Reason                            Tests
            JDK v1.8.0_74 Failed junit tests  hadoop.ha.TestZKFailoverController



            Subsystem                   Report/Notes
            Docker                      Image:yetus/hadoop:0ca8df7
            JIRA Patch URL              https://issues.apache.org/jira/secure/attachment/12792025/YARN-4768.patch
            JIRA Issue                  YARN-4768
            Optional Tests              asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
            uname                       Linux 92e6639a8ff9 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
            Build tool                  maven
            Personality                 /testptch/hadoop/patchprocess/precommit/personality/provided.sh
            git revision                trunk / f86850b
            Default Java                1.7.0_95
            Multi-JDK versions          /usr/lib/jvm/java-8-oracle:1.8.0_74 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
            findbugs                    v3.0.0
            unit                        https://builds.apache.org/job/PreCommit-YARN-Build/10724/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.8.0_74.txt
            unit test logs              https://builds.apache.org/job/PreCommit-YARN-Build/10724/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.8.0_74.txt
            JDK v1.7.0_95 Test Results  https://builds.apache.org/job/PreCommit-YARN-Build/10724/testReport/
            modules                     C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common
            Console output              https://builds.apache.org/job/PreCommit-YARN-Build/10724/console
            Powered by                  Apache Yetus 0.2.0 http://yetus.apache.org

            This message was automatically generated.

            nroberts Nathan Roberts added a comment -

            Any comments on this approach?

            epayne Eric Payne added a comment -

            "Inactive(file)" would seem more accurate but it's not available in all kernel versions. To keep things simple, maybe just use "Inactive(file)" if available, otherwise fallback to "Inactive".

            Sounds reasonable. I'll take a look at the patch.

            epayne Eric Payne added a comment -

            The fix LGTM.
            +1

            epayne Eric Payne added a comment -

            Thanks for the fix, nroberts. I have committed it to trunk, branch-2, and branch-2.8.

            hudson Hudson added a comment -

            FAILURE: Integrated in Hadoop-trunk-Commit #9741 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9741/)
            YARN-4768. getAvailablePhysicalMemorySize can be inaccurate on linux. (epayne: rev 6b1c1cb01cbf979f46cd3ea9308b7745c5595b4f)

            • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/SysInfoLinux.java
            • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestSysInfoLinux.java

            People

              Assignee: nroberts Nathan Roberts
              Reporter: nroberts Nathan Roberts
              Votes: 0
              Watchers: 10
