Hadoop Map/Reduce / MAPREDUCE-6678

Allow ShuffleHandler readahead without drop-behind

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.2, 3.0.0-alpha1
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: nodemanager
    • Labels:
      None

      Description

      Currently mapreduce.shuffle.manage.os.cache enables/disables both readahead (POSIX_FADV_WILLNEED) and drop-behind (POSIX_FADV_DONTNEED) logic within the ShuffleHandler.

      It would be beneficial if these were separately configurable.

      • Running without readahead can lead to significant seek storms caused by large numbers of sendfile() calls competing with one another.
      • However, running with drop-behind can also lead to seek storms. There are cases where the server successfully writes the shuffle bytes to the network, but the client doesn't want the bytes right now (for example, when MergeManager wants to WAIT), so it ignores them and asks for them again a bit later. Because drop-behind has already evicted the data from the page cache, the same data is read from disk repeatedly.

      I'll attach a simple patch that enables/disables readahead based on mapreduce.shuffle.readahead.bytes==0, leaving mapreduce.shuffle.manage.os.cache controlling only the drop-behind.
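
      The proposed split can be sketched as follows. This is a minimal illustration of the decision logic only, not the attached patch: the class and method names (ShuffleCachePolicy, readaheadEnabled, dropBehindEnabled) are hypothetical, and the sketch assumes readahead is active for any configured size greater than zero.

```java
/**
 * Hypothetical illustration of the proposed behavior; not ShuffleHandler code.
 * Readahead depends only on the configured readahead size, and
 * drop-behind depends only on the manage.os.cache flag.
 */
final class ShuffleCachePolicy {
    // Value of mapreduce.shuffle.manage.os.cache
    private final boolean manageOsCache;
    // Value of mapreduce.shuffle.readahead.bytes
    private final int readaheadBytes;

    ShuffleCachePolicy(boolean manageOsCache, int readaheadBytes) {
        this.manageOsCache = manageOsCache;
        this.readaheadBytes = readaheadBytes;
    }

    /** Readahead (POSIX_FADV_WILLNEED) is issued only for a positive size. */
    boolean readaheadEnabled() {
        return readaheadBytes > 0;
    }

    /** Drop-behind (POSIX_FADV_DONTNEED) follows manage.os.cache alone. */
    boolean dropBehindEnabled() {
        return manageOsCache;
    }

    public static void main(String[] args) {
        // Print the outcome for all four combinations of the two settings.
        boolean[] flags = {false, true};
        int[] sizes = {0, 4 * 1024 * 1024};
        for (boolean flag : flags) {
            for (int size : sizes) {
                ShuffleCachePolicy p = new ShuffleCachePolicy(flag, size);
                System.out.println("manage.os.cache=" + flag
                        + " readahead.bytes=" + size
                        + " -> readahead=" + p.readaheadEnabled()
                        + " dropBehind=" + p.dropBehindEnabled());
            }
        }
    }
}
```

      With this split, the combination the issue asks for (readahead without drop-behind) corresponds to manage.os.cache=false together with a non-zero readahead size.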

        Activity

        nroberts Nathan Roberts added a comment -

        Tested this patch on a 10-node cluster using terasort. Verified using strace that nodemanager is issuing correct WILLNEED without DONTNEED.

        leftnoteasy Wangda Tan added a comment -

        Should this be moved to the MAPREDUCE project?

        hadoopqa Hadoop QA added a comment -
        -1 overall

        Vote  Subsystem   Runtime  Comment
        0     reexec      0m 8s    Docker mode activated.
        +1    @author     0m 0s    The patch does not contain any @author tags.
        -1    test4tests  0m 0s    The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
        +1    mvninstall  6m 31s   trunk passed
        +1    compile     0m 12s   trunk passed with JDK v1.8.0_77
        +1    compile     0m 16s   trunk passed with JDK v1.7.0_95
        +1    checkstyle  0m 12s   trunk passed
        +1    mvnsite     0m 19s   trunk passed
        +1    mvneclipse  0m 13s   trunk passed
        +1    findbugs    0m 26s   trunk passed
        +1    javadoc     0m 12s   trunk passed with JDK v1.8.0_77
        +1    javadoc     0m 14s   trunk passed with JDK v1.7.0_95
        +1    mvninstall  0m 15s   the patch passed
        +1    compile     0m 9s    the patch passed with JDK v1.8.0_77
        +1    javac       0m 9s    the patch passed
        +1    compile     0m 13s   the patch passed with JDK v1.7.0_95
        +1    javac       0m 13s   the patch passed
        +1    checkstyle  0m 10s   the patch passed
        +1    mvnsite     0m 17s   the patch passed
        +1    mvneclipse  0m 10s   the patch passed
        +1    whitespace  0m 0s    Patch has no whitespace issues.
        +1    findbugs    0m 35s   the patch passed
        +1    javadoc     0m 9s    the patch passed with JDK v1.8.0_77
        +1    javadoc     0m 12s   the patch passed with JDK v1.7.0_95
        +1    unit        0m 15s   hadoop-mapreduce-client-shuffle in the patch passed with JDK v1.8.0_77.
        +1    unit        0m 18s   hadoop-mapreduce-client-shuffle in the patch passed with JDK v1.7.0_95.
        +1    asflicense  0m 16s   Patch does not generate ASF License warnings.
                          12m 39s



        Subsystem                   Report/Notes
        Docker                      Image:yetus/hadoop:fbe3e86
        JIRA Patch URL              https://issues.apache.org/jira/secure/attachment/12798991/YARN-4964.001.patch
        JIRA Issue                  MAPREDUCE-6678
        Optional Tests              asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname                       Linux 16ba563e405b 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool                  maven
        Personality                 /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision                trunk / fdbafbc
        Default Java                1.7.0_95
        Multi-JDK versions          /usr/lib/jvm/java-8-oracle:1.8.0_77 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
        findbugs                    v3.0.0
        JDK v1.7.0_95 Test Results  https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6436/testReport/
        modules                     C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle
        Console output              https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6436/console
        Powered by                  Apache Yetus 0.2.0 http://yetus.apache.org

        This message was automatically generated.

        eepayne Eric Payne added a comment -

        Thanks, Nathan Roberts, for raising this issue and providing a patch.

        "Tested this patch on a 10-node cluster using terasort. Verified using strace that nodemanager is issuing correct WILLNEED without DONTNEED."

        I recognize that it's difficult to produce a unit test for the patch. Would it be possible for you to post a very brief justification of that?

        Otherwise, patch looks good to me.
        +1

        nroberts Nathan Roberts added a comment -

        "I recognize that it's difficult to produce a unit test for the patch. Would it be possible for you to post a very brief justification of that?"

        I tested it manually because, from Java, it's invasive to determine whether the OS is actually doing the readahead. Rather than create a fragile test, I opted to use strace to verify that the fadvise(WILLNEED) call occurs when the configured readahead size is >0 and does not occur when it is 0, regardless of whether mapreduce.shuffle.manage.os.cache is enabled.

        eepayne Eric Payne added a comment -

        Thanks Nathan Roberts. I committed these changes to trunk, branch-2, and branch-2.8.

        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-trunk-Commit #9739 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9739/)
        MAPREDUCE-6678. Allow ShuffleHandler readahead without drop-behind. (epayne: rev cd35b692de88e3afe7f41405da635c3fbd9b4650)

        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/FadvisedFileRegion.java

          People

          • Assignee: nroberts Nathan Roberts
          • Reporter: nroberts Nathan Roberts
          • Votes: 0
          • Watchers: 10
