Details

    • Hadoop Flags:
      Reviewed
    • Release Note:
      The datanode now performs 4MB readahead by default when reading data from its disks, if the native libraries are present. This has been shown to improve performance in many workloads. The feature may be disabled by setting dfs.datanode.readahead.bytes to "0".
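The default-and-disable behaviour in the release note can be sketched as a simple lookup. This is only an illustration using `java.util.Properties`, not Hadoop's own configuration class; the class and method names are hypothetical, and only the key `dfs.datanode.readahead.bytes`, the 4MB default, and the "0 disables" rule come from the note.

```java
import java.util.Properties;

public class ReadaheadSetting {
    // Key named in the release note.
    static final String KEY = "dfs.datanode.readahead.bytes";
    // 4 MB default described in the release note.
    static final long DEFAULT_BYTES = 4L * 1024 * 1024;

    /** Returns the configured readahead length in bytes, or the 4 MB default when unset. */
    static long readaheadBytes(Properties conf) {
        String raw = conf.getProperty(KEY);
        return raw == null ? DEFAULT_BYTES : Long.parseLong(raw.trim());
    }

    /** Treats a length of zero (or less) as "readahead disabled". */
    static boolean readaheadEnabled(Properties conf) {
        return readaheadBytes(conf) > 0;
    }

    public static void main(String[] args) {
        Properties unset = new Properties();
        Properties disabled = new Properties();
        disabled.setProperty(KEY, "0");

        System.out.println(readaheadBytes(unset));      // prints 4194304
        System.out.println(readaheadEnabled(disabled)); // prints false
    }
}
```

In a real deployment the value would be set in the datanode's site configuration rather than in code.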

      Description

      The fadvise features have been implemented for some time, and we've enabled them in production at a lot of customer sites without difficulty. I'd like to enable the readahead feature by default in future versions so that users get this benefit without any manual configuration required.

      The other fadvise features seem to be more workload-dependent and need further testing before enabling by default.

      1. hdfs-3697.txt
        5 kB
        Todd Lipcon
      2. hdfs-3697.txt
        5 kB
        Todd Lipcon
      3. hdfs-3697-branch-1.txt
        4 kB
        Todd Lipcon

          Activity

          Todd Lipcon added a comment -

          The attached patch sets readahead to 4MB by default. Experimentally, we've determined that this provides a good performance boost without too large an increase in buffer cache usage.

          I also took the liberty of adding documentation for the other fadvise parameters to hdfs-default.xml in this patch.
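To make the "4MB readahead" concrete, here is a rough sketch of how a reader might pick the byte range to ask the OS to prefetch from the current position. It is an assumption-laden illustration, not the datanode's actual code: the class and method names are hypothetical, and the real feature issues fadvise hints through the native libraries rather than returning a range.

```java
public class ReadaheadWindow {
    /**
     * Returns {start, length} of the range to prefetch, or null when nothing
     * should be requested (readahead disabled or position at/after end of file).
     */
    static long[] nextRange(long curPos, long readaheadBytes, long fileLength) {
        if (readaheadBytes <= 0 || curPos >= fileLength) {
            return null;
        }
        // Never request bytes past the end of the file.
        long end = Math.min(curPos + readaheadBytes, fileLength);
        return new long[] {curPos, end - curPos};
    }

    public static void main(String[] args) {
        long fourMb = 4L * 1024 * 1024;
        long tenMb = 10L * 1024 * 1024;
        long[] r = nextRange(8L * 1024 * 1024, fourMb, tenMb);
        System.out.println(r[0] + " " + r[1]); // prints 8388608 2097152
    }
}
```

The cap at the file length is what keeps the extra buffer cache usage bounded by the configured size per stream.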

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12537392/hdfs-3697.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 javac. The patch appears to cause the build to fail.

          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2879//console

          This message is automatically generated.

          Todd Lipcon added a comment -

          Oops, typo in the XML made it malformed. New one passes xmllint.

          Eli Collins added a comment -

          +1 pending jenkins. It's worth calling out in the description that this only enables readahead, not drop cache and sync.
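The point about scope can be summarised as a table of defaults after this change. The sketch below is illustrative only: `dfs.datanode.readahead.bytes` is named in the release note, but the other three key names are assumptions not stated in this thread, so check hdfs-default.xml in your release before relying on them.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FadviseDefaults {
    /** Defaults as described in the thread: only readahead is switched on. */
    static Map<String, String> defaults() {
        Map<String, String> m = new LinkedHashMap<>();
        // Named in the release note; 4 MB expressed in bytes.
        m.put("dfs.datanode.readahead.bytes", String.valueOf(4L * 1024 * 1024));
        // Assumed key names for the drop-cache and sync features, which stay off.
        m.put("dfs.datanode.drop.cache.behind.reads", "false");
        m.put("dfs.datanode.drop.cache.behind.writes", "false");
        m.put("dfs.datanode.sync.behind.writes", "false");
        return m;
    }

    public static void main(String[] args) {
        defaults().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```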

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12537405/hdfs-3697.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS
          org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2880//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2880//console

          This message is automatically generated.

          Todd Lipcon added a comment -

          Committed to trunk and branch-2. Here is a branch-1 patch. What do folks think about backporting?

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #2576 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2576/)
          HDFS-3697. Enable fadvise readahead by default. Contributed by Todd Lipcon. (Revision 1364698)

          Result = SUCCESS
          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1364698
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #2511 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2511/)
          HDFS-3697. Enable fadvise readahead by default. Contributed by Todd Lipcon. (Revision 1364698)

          Result = SUCCESS
          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1364698
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12537579/hdfs-3697-branch-1.txt
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2887//console

          This message is automatically generated.

          Brandon Li added a comment -

          Is there a good way to enable or disable readahead, drop cache, and sync without restarting the namenode?
          I'm just thinking that when the workload changes, the customer has to restart the namenode to take advantage of the performance enhancement.

          Todd Lipcon added a comment -

          You mean the datanode, right?

          It would be worth considering adding these as per-stream flags, e.g., an HDFS "fadvise"-like API on DFSInputStream/DFSOutputStream that changes the behavior. Then the datanode-wide flags would just serve as defaults.
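The per-stream idea floated here could look roughly like the following. This is a hypothetical sketch of the resolution rule only (a per-stream request wins, otherwise the datanode-wide default applies); no such API exists in this patch, and the class and method names are invented for illustration.

```java
import java.util.OptionalLong;

public class StreamReadahead {
    // Datanode-wide default, for example the value of dfs.datanode.readahead.bytes.
    private final long datanodeDefaultBytes;

    StreamReadahead(long datanodeDefaultBytes) {
        this.datanodeDefaultBytes = datanodeDefaultBytes;
    }

    /** A per-stream request takes precedence; an empty request falls back to the default. */
    long effectiveBytes(OptionalLong perStreamBytes) {
        return perStreamBytes.orElse(datanodeDefaultBytes);
    }

    public static void main(String[] args) {
        StreamReadahead policy = new StreamReadahead(4L * 1024 * 1024);
        System.out.println(policy.effectiveBytes(OptionalLong.empty())); // prints 4194304
        System.out.println(policy.effectiveBytes(OptionalLong.of(0)));   // prints 0
    }
}
```

With this shape, a stream could turn readahead off for a random-access workload without a datanode restart, which is the concern raised in the preceding comment.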

          Brandon Li added a comment -

          "You mean the datanode, right?"
          Oops, yes.

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #2532 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2532/)
          HDFS-3697. Enable fadvise readahead by default. Contributed by Todd Lipcon. (Revision 1364698)

          Result = FAILURE
          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1364698
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
          Todd Lipcon added a comment -

          Hey Brandon. In the absence of the improvement discussed above, do you think we should commit to branch-1? I ask, given you've been doing the other fadvise backports into branch-1.

          Brandon Li added a comment -

          Hey Todd, I've seen some performance improvement in my branch-1 test with the fadvise support enabled. I think we should commit it to branch-1.

          The thing I am not sure about is the readahead step size, since I didn't try enough workloads and step sizes in my tests. Given that 4MB worked well in your experiments and is also configurable, we can start with it.

          I just feel that it's sometimes not trivial to guess a good step size beforehand. If possible in the future, we might want readahead to adjust its step size based on the observed access pattern (although tracking the access patterns of multiple parallel streams could be just as challenging).
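One possible shape for the adaptive step size mentioned here is a window that grows while reads stay sequential and resets when they do not. This is a speculative sketch, not something in the patch: the 64KB floor, the doubling rule, and all names are assumptions, and it tracks a single stream only.

```java
public class AdaptiveReadahead {
    static final long MIN_BYTES = 64L * 1024;        // assumed floor
    static final long MAX_BYTES = 4L * 1024 * 1024;  // cap at the 4 MB default

    private long window = MIN_BYTES;
    private long expectedOffset = -1; // offset the next sequential read would start at

    /** Records a read and returns the readahead window to use after it. */
    long onRead(long offset, int length) {
        if (offset == expectedOffset) {
            // Sequential access: double the window up to the cap.
            window = Math.min(window * 2, MAX_BYTES);
        } else {
            // First read or a seek: fall back to the smallest window.
            window = MIN_BYTES;
        }
        expectedOffset = offset + length;
        return window;
    }

    public static void main(String[] args) {
        AdaptiveReadahead r = new AdaptiveReadahead();
        System.out.println(r.onRead(0, 4096));       // prints 65536
        System.out.println(r.onRead(4096, 4096));    // prints 131072
        System.out.println(r.onRead(1000000, 4096)); // prints 65536
    }
}
```

Keeping one tracker per stream avoids mixing access patterns across parallel readers, at the cost of a little state per open stream.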

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #1114 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1114/)
          HDFS-3697. Enable fadvise readahead by default. Contributed by Todd Lipcon. (Revision 1364698)

          Result = SUCCESS
          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1364698
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #1146 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1146/)
          HDFS-3697. Enable fadvise readahead by default. Contributed by Todd Lipcon. (Revision 1364698)

          Result = FAILURE
          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1364698
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
          Suresh Srinivas added a comment -

          Todd, I think we should add appropriate release notes, even though this is not an incompatible change, to call attention to this feature being enabled by default.

          BTW this is a good change!

          Eli Collins added a comment -

          Agree with Brandon that this seems reasonable for branch-1, I'll merge the patch.

          Eli Collins added a comment -

          +1 to Todd's branch-1 patch.

          Eli Collins added a comment -

          I've merged this.


            People

            • Assignee:
              Todd Lipcon
              Reporter:
              Todd Lipcon
            • Votes:
              0
              Watchers:
              11
