  1. Hadoop HDFS
  2. HDFS-8046

Allow better control of getContentSummary

Details

    • Reviewed

    Description

      On busy clusters, users performing quota checks against a large directory tree can degrade namenode performance. The situation improved considerably after HDFS-4995, but as clusters get bigger and busier, it is apparent that we need finer-grained control to keep long read-lock holds from causing throughput drops.

      Even with the unfair namesystem lock setting, a long read lock (tens of milliseconds) can starve many readers and especially writers. The lock hold time should therefore be reduced, which can be done by imposing a lower count-per-iteration limit in the existing implementation. However, HDFS-4995 came with a fixed amount of sleep between lock acquisitions, and a lower limit means more iterations and therefore more sleeps. The sleep time needs to be made configurable, so that getContentSummary() doesn't become exceedingly slow.
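
      The trade-off described above can be illustrated with a small, self-contained sketch. It is not the HDFS code: the class YieldingSummary, its fields countPerIteration and sleepMicros, and the method names are hypothetical, chosen only to show how a per-iteration count limit and a tunable pause between read-lock holds interact.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch of a yielding traversal; not the HDFS implementation. */
final class YieldingSummary {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final long countPerIteration; // items handled per lock hold; 0 disables yielding
  private final long sleepMicros;       // requested pause between lock holds
  private long yieldCount = 0;

  YieldingSummary(long countPerIteration, long sleepMicros) {
    this.countPerIteration = countPerIteration;
    this.sleepMicros = sleepMicros;
  }

  /** Sums the sizes, releasing the read lock after every countPerIteration items. */
  long sum(List<Long> sizes) {
    long total = 0;
    long sinceYield = 0;
    lock.readLock().lock();
    try {
      for (long size : sizes) {
        total += size;
        sinceYield++;
        if (countPerIteration > 0 && sinceYield >= countPerIteration) {
          // Release the lock so queued readers and writers can make progress.
          lock.readLock().unlock();
          try {
            // parkNanos may return early; the pause is a best-effort upper bound.
            LockSupport.parkNanos(TimeUnit.MICROSECONDS.toNanos(sleepMicros));
          } finally {
            // Always reacquire, so the outer finally releases a lock we hold.
            lock.readLock().lock();
          }
          sinceYield = 0;
          yieldCount++;
        }
      }
    } finally {
      lock.readLock().unlock();
    }
    return total;
  }

  long getYieldCount() {
    return yieldCount;
  }

  public static void main(String[] args) {
    YieldingSummary s = new YieldingSummary(3, 500);
    long total = s.sum(Arrays.asList(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L));
    System.out.println("total=" + total + " yields=" + s.getYieldCount());
  }
}
```

      In this sketch, lowering countPerIteration shortens each lock hold but raises the number of yields, so the total time spent pausing grows unless sleepMicros is lowered as well. That is the reason the description asks for both knobs to be adjustable.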

      Attachments

        1. HDFS-8046.v1.patch
          6 kB
          Kihwal Lee
        2. HDFS-8046-branch-2.6.1.txt
          6 kB
          Vinod Kumar Vavilapalli

          Activity

            kihwal Kihwal Lee added a comment -

            I will post a patch that we applied in our systems. Many clusters have been running with it for some time now.

            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12709012/HDFS-8046.v1.patch
            against trunk revision 867d5d2.

            +1 @author. The patch does not contain any @author tags.

            -1 tests included. The patch doesn't appear to include any new or modified tests.
            Please justify why no new tests are needed for this patch.
            Also please list what manual steps were performed to verify this patch.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 javadoc. There were no new javadoc warning messages.

            +1 eclipse:eclipse. The patch built with eclipse:eclipse.

            +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

            org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
            org.apache.hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits

            Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/10164//testReport/
            Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/10164//console

            This message is automatically generated.

            kihwal Kihwal Lee added a comment -

            The test failures are not related. No new test case was added, since the patch only makes the sleep time configurable. It's hard to add a meaningful test for this.

            • TestRetryCacheWithHA - HDFS-7524
            • TestFailureToReadEdits#testFailureToReadEditsOnTransitionToActive[0] - After the test case timed out on a new checkpoint, the mini DFS cluster shut down. But the standby was actually done with checkpointing and was uploading the image at that point. The SBN said "Exception during image upload..." on shutdown. So it looks like a timing issue in the test.
            daryn Daryn Sharp added a comment -

            +1 Simple change, has worked fine under production load.

            kihwal Kihwal Lee added a comment -

            Thanks for the review, Daryn. I've committed this to trunk and branch-2.

            hudson Hudson added a comment -

            FAILURE: Integrated in Hadoop-trunk-Commit #7537 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7537/)
            HDFS-8046. Allow better control of getContentSummary. Contributed by Kihwal Lee. (kihwal: rev 285b31e75e51ec8e3a796c2cb0208739368ca9b8)

            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirStatAndListingOp.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ContentSummaryComputationContext.java
            • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
            hudson Hudson added a comment -

            FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #158 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/158/)
            HDFS-8046. Allow better control of getContentSummary. Contributed by Kihwal Lee. (kihwal: rev 285b31e75e51ec8e3a796c2cb0208739368ca9b8)

            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirStatAndListingOp.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ContentSummaryComputationContext.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
            • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
            hudson Hudson added a comment -

            FAILURE: Integrated in Hadoop-Hdfs-trunk #2090 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2090/)
            HDFS-8046. Allow better control of getContentSummary. Contributed by Kihwal Lee. (kihwal: rev 285b31e75e51ec8e3a796c2cb0208739368ca9b8)

            • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ContentSummaryComputationContext.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirStatAndListingOp.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
            hudson Hudson added a comment -

            FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #149 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/149/)
            HDFS-8046. Allow better control of getContentSummary. Contributed by Kihwal Lee. (kihwal: rev 285b31e75e51ec8e3a796c2cb0208739368ca9b8)

            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ContentSummaryComputationContext.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirStatAndListingOp.java
            • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
            hudson Hudson added a comment -

            SUCCESS: Integrated in Hadoop-Yarn-trunk #892 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/892/)
            HDFS-8046. Allow better control of getContentSummary. Contributed by Kihwal Lee. (kihwal: rev 285b31e75e51ec8e3a796c2cb0208739368ca9b8)

            • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ContentSummaryComputationContext.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirStatAndListingOp.java
            hudson Hudson added a comment -

            FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #159 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/159/)
            HDFS-8046. Allow better control of getContentSummary. Contributed by Kihwal Lee. (kihwal: rev 285b31e75e51ec8e3a796c2cb0208739368ca9b8)

            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
            • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ContentSummaryComputationContext.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirStatAndListingOp.java
            hudson Hudson added a comment -

            FAILURE: Integrated in Hadoop-Mapreduce-trunk #2108 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2108/)
            HDFS-8046. Allow better control of getContentSummary. Contributed by Kihwal Lee. (kihwal: rev 285b31e75e51ec8e3a796c2cb0208739368ca9b8)

            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirStatAndListingOp.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ContentSummaryComputationContext.java
            • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt

            vinodkv Vinod Kumar Vavilapalli added a comment -

            sjlee0 backported this to 2.6.1. The patch applies okay for the most part, except for the changes in FSDirStatAndListingOp.java, which doesn't exist in 2.6.

            I just pushed the commit to 2.6.1 after running compilation.

            vinodkv Vinod Kumar Vavilapalli added a comment -

            Attaching the patch committed to branch-2.6.1.

            vinodkv Vinod Kumar Vavilapalli added a comment -

            Just pulled this into branch-2.7 (release 2.7.2) as it already exists in 2.6.1.

            Branch-2 patch applies cleanly. Ran compilation before the push.
            xiaochen Xiao Chen added a comment -

            Thanks kihwal and all for the contribution. Great work here and on HDFS-4995 to fix NN locking issues!

            Just dropping a quick note here: increasing the default of DFS_CONTENT_SUMMARY_LIMIT_DEFAULT from 0 to 5000, plus a bug in HDFS-4995 (fixed by HDFS-8581), created a somewhat incompatible behavior on /. Running hdfs dfs -du -s / or hdfs dfs -count / may end up calculating only some of the directories when the limit is reached.

            I think we should backport HDFS-8581 to 2.6.x and 2.7.x to fix this.

            kihwal Kihwal Lee added a comment -

            xiaochen, thanks for letting me know. I've cherry-picked the fix to 2.6 and 2.7.

            xiaochen Xiao Chen added a comment -

            Thanks for the quick action Kihwal!


            People

              kihwal Kihwal Lee
              kihwal Kihwal Lee
              Votes: 0
              Watchers: 8
