
getAvailablePhysicalMemorySize can be inaccurate on linux

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.2, 3.0.0-alpha1
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: nodemanager
    • Labels:
      None
    • Environment:

      Linux

      Description

      The algorithm currently computes available memory as "MemFree" + "Inactive" from /proc/meminfo.

      "Inactive" may not be a very good indication of how much memory can be readily freed because it contains both:

      • Pages mapped with MAP_SHARED|MAP_ANONYMOUS, regardless of whether they are being actively accessed (it is unclear to me why this is the case)
      • Pages mapped MAP_PRIVATE|MAP_ANONYMOUS that have not been accessed recently

      Neither of these types of pages should probably be considered "Available".

      "Inactive(file)" would seem more accurate, but it's not available in all kernel versions. To keep things simple, maybe just use "Inactive(file)" if available, otherwise fall back to "Inactive".

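The fallback proposed above can be sketched roughly as follows. This is a minimal illustration, not the attached patch: the class name AvailableMemorySketch and its methods are hypothetical, and it assumes /proc/meminfo values are reported in kB.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; names are illustrative and not taken from SysInfoLinux.
public class AvailableMemorySketch {
    // Matches lines such as "Inactive(file):   123456 kB"; the unit is optional.
    private static final Pattern MEMINFO_LINE =
        Pattern.compile("^([a-zA-Z0-9_()]+):\\s+(\\d+)(\\s+kB)?\\s*$");

    /** Parses /proc/meminfo-style lines into a map of field name to value. */
    public static Map<String, Long> parse(List<String> lines) {
        Map<String, Long> fields = new HashMap<>();
        for (String line : lines) {
            Matcher m = MEMINFO_LINE.matcher(line);
            if (m.matches()) {
                fields.put(m.group(1), Long.parseLong(m.group(2)));
            }
        }
        return fields;
    }

    /** Estimates available memory in bytes, assuming the parsed values are in kB. */
    public static long availableBytes(Map<String, Long> fields) {
        long free = fields.getOrDefault("MemFree", 0L);
        // Prefer Inactive(file); fall back to Inactive on kernels that lack it.
        Long inactive = fields.get("Inactive(file)");
        if (inactive == null) {
            inactive = fields.getOrDefault("Inactive", 0L);
        }
        return (free + inactive) * 1024L;
    }

    public static void main(String[] args) {
        // Made-up sample values, for illustration only.
        List<String> sample = Arrays.asList(
            "MemFree:          1000 kB",
            "Inactive:          500 kB",
            "Inactive(file):    200 kB");
        System.out.println(availableBytes(parse(sample)));
    }
}
```

On a kernel that does not report "Inactive(file)", the same call would simply use the "Inactive" value, which matches the current behavior.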
      1. YARN-4768.patch
        9 kB
        Nathan Roberts

          Activity

          nroberts Nathan Roberts added a comment -

          Patch for trunk.

          I also changed getPhysicalMemorySize() to exclude:

          • HardwareCorrupted pages - these are not that uncommon.
          • HugePagesTotal * hugePageSize - huge pages are probably not commonly configured on compute nodes, but if they are, it seems reasonable not to count them.

          Comments welcome on alternative ways to approach these.
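
The adjustment to the total described in this comment amounts to a simple subtraction. A minimal sketch, assuming the values have already been parsed from /proc/meminfo into a map keyed by field name (MemTotal, HardwareCorrupted, HugePages_Total, Hugepagesize) with sizes in kB; the class and method names are hypothetical and this is not the patch itself:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; names are illustrative and not taken from SysInfoLinux.
public class PhysicalMemorySketch {
    /** Returns usable physical memory in bytes, assuming sizes are in kB. */
    public static long physicalBytes(Map<String, Long> fields) {
        long totalKb = fields.getOrDefault("MemTotal", 0L);
        long corruptedKb = fields.getOrDefault("HardwareCorrupted", 0L);
        long hugePages = fields.getOrDefault("HugePages_Total", 0L);   // a page count, not a size
        long hugePageKb = fields.getOrDefault("Hugepagesize", 0L);     // size of one huge page
        // Exclude corrupted pages and memory reserved for huge pages.
        return (totalKb - corruptedKb - hugePages * hugePageKb) * 1024L;
    }

    public static void main(String[] args) {
        // Made-up sample values, for illustration only.
        Map<String, Long> fields = new HashMap<>();
        fields.put("MemTotal", 1000000L);
        fields.put("HardwareCorrupted", 100L);
        fields.put("HugePages_Total", 10L);
        fields.put("Hugepagesize", 2048L);
        System.out.println(physicalBytes(fields));
    }
}
```

Fields that are missing from the map default to zero here, so a kernel that does not report HardwareCorrupted or huge pages would yield plain MemTotal.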

          hadoopqa Hadoop QA added a comment -
          -1 overall



          | Vote | Subsystem | Runtime | Comment |
          | 0 | reexec | 0m 12s | Docker mode activated. |
          | +1 | @author | 0m 0s | The patch does not contain any @author tags. |
          | +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
          | +1 | mvninstall | 6m 41s | trunk passed |
          | +1 | compile | 6m 1s | trunk passed with JDK v1.8.0_74 |
          | +1 | compile | 6m 35s | trunk passed with JDK v1.7.0_95 |
          | +1 | checkstyle | 0m 20s | trunk passed |
          | +1 | mvnsite | 0m 56s | trunk passed |
          | +1 | mvneclipse | 0m 14s | trunk passed |
          | +1 | findbugs | 1m 33s | trunk passed |
          | +1 | javadoc | 0m 49s | trunk passed with JDK v1.8.0_74 |
          | +1 | javadoc | 1m 2s | trunk passed with JDK v1.7.0_95 |
          | +1 | mvninstall | 0m 40s | the patch passed |
          | +1 | compile | 5m 32s | the patch passed with JDK v1.8.0_74 |
          | +1 | javac | 5m 32s | the patch passed |
          | +1 | compile | 6m 29s | the patch passed with JDK v1.7.0_95 |
          | +1 | javac | 6m 29s | the patch passed |
          | +1 | checkstyle | 0m 20s | the patch passed |
          | +1 | mvnsite | 0m 53s | the patch passed |
          | +1 | mvneclipse | 0m 13s | the patch passed |
          | +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
          | +1 | findbugs | 1m 46s | the patch passed |
          | +1 | javadoc | 0m 50s | the patch passed with JDK v1.8.0_74 |
          | +1 | javadoc | 1m 3s | the patch passed with JDK v1.7.0_95 |
          | -1 | unit | 6m 26s | hadoop-common in the patch failed with JDK v1.8.0_74. |
          | +1 | unit | 6m 58s | hadoop-common in the patch passed with JDK v1.7.0_95. |
          | +1 | asflicense | 0m 23s | Patch does not generate ASF License warnings. |
          |  |  | 57m 2s |  |



          | Reason | Tests |
          | JDK v1.8.0_74 Failed junit tests | hadoop.ha.TestZKFailoverController |



          | Subsystem | Report/Notes |
          | Docker | Image:yetus/hadoop:0ca8df7 |
          | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12792025/YARN-4768.patch |
          | JIRA Issue | YARN-4768 |
          | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
          | uname | Linux 92e6639a8ff9 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
          | Build tool | maven |
          | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
          | git revision | trunk / f86850b |
          | Default Java | 1.7.0_95 |
          | Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_74 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95 |
          | findbugs | v3.0.0 |
          | unit | https://builds.apache.org/job/PreCommit-YARN-Build/10724/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.8.0_74.txt |
          | unit test logs | https://builds.apache.org/job/PreCommit-YARN-Build/10724/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.8.0_74.txt |
          | JDK v1.7.0_95 Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/10724/testReport/ |
          | modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
          | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/10724/console |
          | Powered by | Apache Yetus 0.2.0 http://yetus.apache.org |

          This message was automatically generated.

          nroberts Nathan Roberts added a comment -

          Any comments on this approach?

          eepayne Eric Payne added a comment -

          "Inactive(file)" would seem more accurate but it's not available in all kernel versions. To keep things simple, maybe just use "Inactive(file)" if available, otherwise fallback to "Inactive".

          Sounds reasonable. I'll take a look at the patch.

          eepayne Eric Payne added a comment -

          The fix LGTM.
          +1

          eepayne Eric Payne added a comment -

          Thanks for the fix, Nathan Roberts. I have committed it to trunk, branch-2, and branch-2.8.

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #9741 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9741/)
          YARN-4768. getAvailablePhysicalMemorySize can be inaccurate on linux. (epayne: rev 6b1c1cb01cbf979f46cd3ea9308b7745c5595b4f)

          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/SysInfoLinux.java
          • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestSysInfoLinux.java

            People

            • Assignee: Nathan Roberts (nroberts)
            • Reporter: Nathan Roberts (nroberts)
            • Votes: 0
            • Watchers: 11
