Uploaded image for project: 'Hadoop HDFS'
  1. Hadoop HDFS
  2. HDFS-12140

Remove BPOfferService lock contention to get block pool id

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.8.0
    • Fix Version/s: 2.9.0, 3.0.0-beta1, 2.8.2
    • Component/s: datanode
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      The block pool id is protected by a lock in BPOfferService. This creates excessive contention especially for xceivers threads attempting to queue IBRs and heartbeat processing. When the latter is delayed due to excessive FSDataset lock contention, it causes pipelines to collapse.

      Accessing the block pool id should be lockless after registration.

      1. HDFS-12140.branch-2.8.patch
        5 kB
        Daryn Sharp
      2. HDFS-12140.trunk.patch
        5 kB
        Daryn Sharp

        Activity

        Hide
        daryn Daryn Sharp added a comment -

        Simply sets/clears a volatile for the bp id when the ns info is updated. Retains the blocking behavior prior to setting the ns info, but lockless access thereafter.

        Will post branch-2 patch shortly. Minor change needed for the absent "quiet" boolean.

        Show
        daryn Daryn Sharp added a comment - Simply sets/clears a volatile for the bp id when the ns info is updated. Retains the blocking behavior prior to setting the ns info, but lockless access thereafter. Will post branch-2 patch shortly. Minor change needed for the absent "quiet" boolean.
        Hide
        daryn Daryn Sharp added a comment -

        Also avoids a silly double fetch/penalty for a precondition's error message!

        Show
        daryn Daryn Sharp added a comment - Also avoids a silly double fetch/penalty for a precondition's error message!
        Hide
        daryn Daryn Sharp added a comment -

        Trunk applies to branch-2 as well. Just need 2.8 change.

        Show
        daryn Daryn Sharp added a comment - Trunk applies to branch-2 as well. Just need 2.8 change.
        Hide
        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 22s Docker mode activated.
              Prechecks
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
              branch-2.8 Compile Tests
        +1 mvninstall 9m 41s branch-2.8 passed
        +1 compile 0m 40s branch-2.8 passed with JDK v1.8.0_131
        +1 compile 0m 42s branch-2.8 passed with JDK v1.7.0_131
        +1 checkstyle 0m 22s branch-2.8 passed
        +1 mvnsite 0m 59s branch-2.8 passed
        +1 findbugs 2m 2s branch-2.8 passed
        +1 javadoc 0m 41s branch-2.8 passed with JDK v1.8.0_131
        +1 javadoc 1m 1s branch-2.8 passed with JDK v1.7.0_131
              Patch Compile Tests
        +1 mvninstall 0m 44s the patch passed
        +1 compile 0m 38s the patch passed with JDK v1.8.0_131
        +1 javac 0m 38s the patch passed
        +1 compile 0m 42s the patch passed with JDK v1.7.0_131
        +1 javac 0m 42s the patch passed
        -0 checkstyle 0m 18s hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 20 unchanged - 0 fixed = 21 total (was 20)
        +1 mvnsite 0m 49s the patch passed
        +1 whitespace 0m 0s The patch has no whitespace issues.
        +1 findbugs 2m 7s the patch passed
        +1 javadoc 0m 35s the patch passed with JDK v1.8.0_131
        +1 javadoc 0m 58s the patch passed with JDK v1.7.0_131
              Other Tests
        -1 unit 62m 57s hadoop-hdfs in the patch failed with JDK v1.7.0_131.
        +1 asflicense 0m 25s The patch does not generate ASF License warnings.
        149m 59s



        Reason Tests
        JDK v1.7.0_131 Failed junit tests hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:d946387
        JIRA Issue HDFS-12140
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12877322/HDFS-12140.branch-2.8.patch
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux 57d424cc22d2 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision branch-2.8 / d6228fb
        Default Java 1.7.0_131
        Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_131 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_131
        findbugs v3.0.0
        checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/20273/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
        unit https://builds.apache.org/job/PreCommit-HDFS-Build/20273/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_131.txt
        JDK v1.7.0_131 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/20273/testReport/
        modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
        Console output https://builds.apache.org/job/PreCommit-HDFS-Build/20273/console
        Powered by Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        Show
        hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 22s Docker mode activated.       Prechecks +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.       branch-2.8 Compile Tests +1 mvninstall 9m 41s branch-2.8 passed +1 compile 0m 40s branch-2.8 passed with JDK v1.8.0_131 +1 compile 0m 42s branch-2.8 passed with JDK v1.7.0_131 +1 checkstyle 0m 22s branch-2.8 passed +1 mvnsite 0m 59s branch-2.8 passed +1 findbugs 2m 2s branch-2.8 passed +1 javadoc 0m 41s branch-2.8 passed with JDK v1.8.0_131 +1 javadoc 1m 1s branch-2.8 passed with JDK v1.7.0_131       Patch Compile Tests +1 mvninstall 0m 44s the patch passed +1 compile 0m 38s the patch passed with JDK v1.8.0_131 +1 javac 0m 38s the patch passed +1 compile 0m 42s the patch passed with JDK v1.7.0_131 +1 javac 0m 42s the patch passed -0 checkstyle 0m 18s hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 20 unchanged - 0 fixed = 21 total (was 20) +1 mvnsite 0m 49s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 2m 7s the patch passed +1 javadoc 0m 35s the patch passed with JDK v1.8.0_131 +1 javadoc 0m 58s the patch passed with JDK v1.7.0_131       Other Tests -1 unit 62m 57s hadoop-hdfs in the patch failed with JDK v1.7.0_131. +1 asflicense 0m 25s The patch does not generate ASF License warnings. 149m 59s Reason Tests JDK v1.7.0_131 Failed junit tests hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot Subsystem Report/Notes Docker Image:yetus/hadoop:d946387 JIRA Issue HDFS-12140 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12877322/HDFS-12140.branch-2.8.patch Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux 57d424cc22d2 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision branch-2.8 / d6228fb Default Java 1.7.0_131 Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_131 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_131 findbugs v3.0.0 checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/20273/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt unit https://builds.apache.org/job/PreCommit-HDFS-Build/20273/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_131.txt JDK v1.7.0_131 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/20273/testReport/ modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs Console output https://builds.apache.org/job/PreCommit-HDFS-Build/20273/console Powered by Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org This message was automatically generated.
        Hide
        shahrs87 Rushabh S Shah added a comment -

        Daryn Sharp: thanks for the patch.
        Precommit ran on only branch-2.8 version of patch and not on trunk.
        Please re-attach trunk patch again.

        I did a code review also.
        Overall the patch looks good.
        I just have one concern in test case.
        You have used mockito.reset few times.
        In mockito documentation, it has been mentioned that you shouldn't use reset method. Instead consider simple and small tests.
        But you are testing only small portion of code, so I am +1 to patch.

        +1 (non-binding)

        Show
        shahrs87 Rushabh S Shah added a comment - Daryn Sharp : thanks for the patch. Precommit ran on only branch-2.8 version of patch and not on trunk. Please re-attach trunk patch again. I did a code review also. Overall the patch looks good. I just have one concern in test case. You have used mockito.reset few times. In mockito documentation, it has been mentioned that you shouldn't use reset method. Instead consider simple and small tests. But you are testing only small portion of code, so I am +1 to patch. +1 (non-binding)
        Hide
        kihwal Kihwal Lee added a comment -

        +1. For a BPOS, the blockpool ID won't change once registers.

        Show
        kihwal Kihwal Lee added a comment - +1. For a BPOS, the blockpool ID won't change once registers.
        Hide
        kihwal Kihwal Lee added a comment -

        Thansk for the patch, Daryn. And thanks for reviewing it, Rushabh. I've committed this to trunk, branch-2 and branch-2.8.

        Show
        kihwal Kihwal Lee added a comment - Thansk for the patch, Daryn. And thanks for reviewing it, Rushabh. I've committed this to trunk, branch-2 and branch-2.8.
        Hide
        hudson Hudson added a comment -

        SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #12011 (See https://builds.apache.org/job/Hadoop-trunk-Commit/12011/)
        HDFS-12140. Remove BPOfferService lock contention to get block pool id. (kihwal: rev e7d187a1b6a826edd5bd0f708184d48f3674d489)

        • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBPOfferService.java
        • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java
        Show
        hudson Hudson added a comment - SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #12011 (See https://builds.apache.org/job/Hadoop-trunk-Commit/12011/ ) HDFS-12140 . Remove BPOfferService lock contention to get block pool id. (kihwal: rev e7d187a1b6a826edd5bd0f708184d48f3674d489) (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBPOfferService.java (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java

          People

          • Assignee:
            daryn Daryn Sharp
            Reporter:
            daryn Daryn Sharp
          • Votes:
            0 Vote for this issue
            Watchers:
            8 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development