Hadoop HDFS / HDFS-12082

BlockInvalidateLimit value is incorrectly set after namenode heartbeat interval reconfigured

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.0, 3.0.0-beta1
    • Component/s: hdfs, namenode
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      HDFS-1477 provides an option to reconfigure the namenode heartbeat interval without restarting the namenode. When the heartbeat interval is reconfigured, blockInvalidateLimit gets recomputed:

       this.blockInvalidateLimit = Math.max(20 * (int) (intervalSeconds),
              DFSConfigKeys.DFS_BLOCK_INVALIDATE_LIMIT_DEFAULT);
      

      This doesn't honor the existing value set by dfs.block.invalidate.limit.
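
      For illustration, a minimal sketch (hypothetical numbers; conf is a Hadoop Configuration) of how the reconfigure path clobbers the configured value:

       // Startup: the user sets dfs.block.invalidate.limit=500 in
       // hdfs-site.xml, and the configured value wins.
       int blockInvalidateLimit = conf.getInt(
           DFSConfigKeys.DFS_BLOCK_INVALIDATE_LIMIT_KEY,
           DFSConfigKeys.DFS_BLOCK_INVALIDATE_LIMIT_DEFAULT);  // -> 500

       // Later, an admin reconfigures the heartbeat interval to 3s without
       // a restart; the recomputation ignores the configured key entirely:
       blockInvalidateLimit = Math.max(20 * 3,
           DFSConfigKeys.DFS_BLOCK_INVALIDATE_LIMIT_DEFAULT);  // -> 1000, the 500 is lost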

      Attachments

      1. HDFS-12082.001.patch
        4 kB
        Weiwei Yang
      2. HDFS-12082.002.patch
        4 kB
        Weiwei Yang
      3. HDFS-12082.003.patch
        5 kB
        Weiwei Yang
      4. HDFS-12082.004.patch
        5 kB
        Weiwei Yang


          Activity

          Weiwei Yang added a comment -

          Proposed a patch to fix this issue. Otherwise, if a user reconfigures the namenode heartbeat interval, the value of the property dfs.block.invalidate.limit is always overwritten.

          The fix simply honors the configured value of dfs.block.invalidate.limit and uses it as the block invalidate limit; this value no longer changes when the heartbeat interval changes. The reason is that the following code doesn't really work:

          // The default heartbeat is 3s; unless the heartbeat is set to more
          // than 50s, this is always 1000.
          (1) final int blockInvalidateLimit = Math.max(20*(int)(heartbeatIntervalSeconds),
                    DFSConfigKeys.DFS_BLOCK_INVALIDATE_LIMIT_DEFAULT);
          
          // We never fall back to the default value here, because we always
          // load defaults from hdfs-default.xml. If the property is not found
          // in hdfs-site.xml, getInt() simply returns the value from
          // hdfs-default.xml, which is 1000. Even if blockInvalidateLimit is
          // something else, it is never used.
          (2) this.blockInvalidateLimit = conf.getInt(
                    DFSConfigKeys.DFS_BLOCK_INVALIDATE_LIMIT_KEY, blockInvalidateLimit);
          

          So right now there are two cases:

          1. dfs.block.invalidate.limit is not set explicitly: blockInvalidateLimit = 1000
          2. dfs.block.invalidate.limit is set explicitly: blockInvalidateLimit = <value_of_the_property>

          Given this, why do we still need (1)? I think we can remove it.
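
          To make the dead fallback concrete, here is a minimal sketch (a hypothetical standalone class; assumes hdfs-default.xml is on the classpath, as it is whenever the HDFS jar is used):

          import org.apache.hadoop.conf.Configuration;
          import org.apache.hadoop.hdfs.DFSConfigKeys;
          import org.apache.hadoop.hdfs.HdfsConfiguration;

          public class DeadFallbackDemo {
            public static void main(String[] args) {
              // HdfsConfiguration loads hdfs-default.xml, which defines
              // dfs.block.invalidate.limit = 1000, so the key is always present.
              Configuration conf = new HdfsConfiguration();
              long heartbeatSeconds = 60;  // hypothetical reconfigured interval
              int fallback = Math.max(20 * (int) heartbeatSeconds,
                  DFSConfigKeys.DFS_BLOCK_INVALIDATE_LIMIT_DEFAULT);  // (1) -> 1200
              // (2): getInt() finds the key in hdfs-default.xml (or
              // hdfs-site.xml) and returns that value; fallback is never used.
              int limit = conf.getInt(
                  DFSConfigKeys.DFS_BLOCK_INVALIDATE_LIMIT_KEY, fallback);
              System.out.println(limit);  // prints 1000, not 1200
            }
          }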

          Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 18s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 15m 53s trunk passed
          +1 compile 0m 51s trunk passed
          +1 checkstyle 0m 37s trunk passed
          +1 mvnsite 0m 53s trunk passed
          -1 findbugs 1m 39s hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings.
          +1 javadoc 0m 41s trunk passed
          +1 mvninstall 0m 48s the patch passed
          +1 compile 0m 44s the patch passed
          +1 javac 0m 44s the patch passed
          -0 checkstyle 0m 33s hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 41 unchanged - 2 fixed = 42 total (was 43)
          +1 mvnsite 0m 49s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 1m 42s the patch passed
          +1 javadoc 0m 36s the patch passed
          -1 unit 68m 1s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 23s The patch does not generate ASF License warnings.
          95m 48s



          Reason Tests
          Failed junit tests hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070
            hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation
            hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-12082
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12875639/HDFS-12082.001.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux f3b258454320 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / b17e655
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/20153/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/20153/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/20153/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/20153/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/20153/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 12s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 13m 52s trunk passed
          +1 compile 0m 47s trunk passed
          +1 checkstyle 0m 36s trunk passed
          +1 mvnsite 0m 53s trunk passed
          -1 findbugs 1m 39s hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings.
          +1 javadoc 0m 39s trunk passed
          +1 mvninstall 0m 48s the patch passed
          +1 compile 0m 45s the patch passed
          +1 javac 0m 45s the patch passed
          +1 checkstyle 0m 33s hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 41 unchanged - 2 fixed = 41 total (was 43)
          +1 mvnsite 0m 50s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 1m 46s the patch passed
          +1 javadoc 0m 38s the patch passed
          -1 unit 66m 47s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 24s The patch does not generate ASF License warnings.
          92m 25s



          Reason Tests
          Failed junit tests hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks
            hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics
            hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-12082
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12875680/HDFS-12082.002.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 00247461dc88 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / b17e655
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/20155/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/20155/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/20155/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/20155/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          Chen Liang added a comment -

          Thanks Weiwei Yang for reporting this!

          I'm a little confused about the patch though. Reading the description, I expected the change to be that, when setHeartbeatInterval is called, instead of

          blockInvalidateLimit = Math.max(20 * (int) (intervalSeconds),
              DFSConfigKeys.DFS_BLOCK_INVALIDATE_LIMIT_DEFAULT);

          we change it to something like

          blockInvalidateLimit = Math.max(20 * (int) (intervalSeconds), configuredLimit);

          where

          final int configuredLimit = conf.getInt(
              DFSConfigKeys.DFS_BLOCK_INVALIDATE_LIMIT_KEY,
              DFSConfigKeys.DFS_BLOCK_INVALIDATE_LIMIT_DEFAULT);

          But it seems the patch removed this part completely. In that case blockInvalidateLimit is set to the configured value once at startup and no longer changes when setHeartbeatInterval gets called. Is this the desired behaviour? The original code guaranteed that, no matter how setHeartbeatInterval gets called, blockInvalidateLimit would never be smaller than 20 * intervalSeconds, and it appears this is no longer guaranteed with the patch.

          An additional minor comment: in the unit test, how about changing "" + 6 to Integer.toString(6)?

          Weiwei Yang added a comment -

          Hi Chen Liang,

          Thanks for helping to review this. You make a good point. On second thought, I think it is better to make the effective block invalidate limit the larger of the value configured in hdfs-site.xml and 20 * HB_interval. This ensures we don't throttle block deletion on the datanodes too much. I have revised the patch accordingly; a sketch of the idea follows. Please let me know if the v3 patch makes sense to you. Thanks.
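
          A minimal sketch of the revised reconfigure path (not the exact patch; configuredBlockInvalidateLimit is an assumed field caching the value read from dfs.block.invalidate.limit at construction time):

          void setHeartbeatInterval(long intervalSeconds) {
            this.heartbeatIntervalSeconds = intervalSeconds;
            // Effective limit: honor the configured value, but never allow
            // fewer than 20 block invalidations per heartbeat interval.
            this.blockInvalidateLimit = Math.max(
                20 * (int) intervalSeconds, this.configuredBlockInvalidateLimit);
          }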

          Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 9s Docker mode activated.
                Prechecks
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
                trunk Compile Tests
          +1 mvninstall 13m 7s trunk passed
          +1 compile 0m 48s trunk passed
          +1 checkstyle 0m 35s trunk passed
          +1 mvnsite 0m 54s trunk passed
          -1 findbugs 1m 37s hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings.
          +1 javadoc 0m 40s trunk passed
                Patch Compile Tests
          +1 mvninstall 0m 47s the patch passed
          +1 compile 0m 45s the patch passed
          +1 javac 0m 45s the patch passed
          +1 checkstyle 0m 33s hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 41 unchanged - 2 fixed = 41 total (was 43)
          +1 mvnsite 0m 51s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 1m 43s the patch passed
          +1 javadoc 0m 38s the patch passed
                Other Tests
          -1 unit 63m 37s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 19s The patch does not generate ASF License warnings.
          88m 18s



          Reason Tests
          Failed junit tests hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-12082
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12876169/HDFS-12082.003.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux e7abd3ea0994 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / f484a6f
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/20197/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/20197/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/20197/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/20197/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          Chen Liang added a comment -

          Thanks Weiwei Yang for the update! +1 on v003 patch

          Arpit Agarwal added a comment -

          Hi Weiwei Yang, thanks for reporting this and working on the fix.

          There is a change in the default behavior on startup which looks unnecessary.

          The previous formula for computing the limit on process startup was:

          1. Use dfs.block.invalidate.limit if configured.
          2. Else, use the max(20 * heartbeatInterval, DFS_BLOCK_INVALIDATE_LIMIT_DEFAULT).

          With your patch this has effectively changed to:

          1. Use the max(20 * heartbeatInterval, dfs.block.invalidate.limit) if dfs.block.invalidate.limit is configured.
          2. Else, use 20*heartbeatInterval.
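
          To make the startup difference concrete, a worked example with hypothetical numbers (dfs.block.invalidate.limit = 500):

          // heartbeat = 3s:  before the patch: 500 (configured value as-is)
          //                  after the patch:  Math.max(20 * 3, 500)  = 500
          // heartbeat = 60s: before the patch: 500 (configured value as-is)
          //                  after the patch:  Math.max(20 * 60, 500) = 1200
          // i.e. with a long heartbeat interval the configured value is now
          // overridden at startup, which is the behavior change noted above.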
          Weiwei Yang added a comment - edited

          Hi Arpit Agarwal

          Thanks for looking at this issue. This issue is a bit more complex than it looks; please allow me to explain.

          Basically, this is because the following code isn't working as expected:

          this.blockInvalidateLimit = conf.getInt(
                  DFSConfigKeys.DFS_BLOCK_INVALIDATE_LIMIT_KEY, blockInvalidateLimit);
          

          Expected:

          1. Use the value of dfs.block.invalidate.limit from hdfs-site.xml if set.
          2. If dfs.block.invalidate.limit is not set in hdfs-site.xml, use max(20 * HB_interval, 1000)
            (as you described).

          However, it actually behaves as follows:

          1. Use the value of dfs.block.invalidate.limit from hdfs-site.xml if set.
          2. Otherwise, use the value of dfs.block.invalidate.limit from hdfs-default.xml.

          It will NEVER return the default value passed in via the blockInvalidateLimit argument in a cluster environment, because we always ship hdfs-default.xml in the HDFS jar, and it contains dfs.block.invalidate.limit=1000.

          The logic in my patch is as follows (please check the v4 patch; there was a one-line error in the v3 patch):

          Take the configured value (from hdfs-site.xml, falling back to hdfs-default.xml), compare it with 20 * HB_interval, and use the larger of the two as the effective invalidate limit. This ensures a user can't throttle block deletion on the datanodes too much (even after the heartbeat interval is reconfigured). For example, if the heartbeat interval is 60s, we don't want to let the user set the limit below 20 * 60 = 1200; otherwise block deletion would be too slow.

          It might be possible to fix this the other way round, respecting the "original" intent, but that would require a way for the Configuration class to tell whether a property was explicitly configured by the user (using getPropertySources?). A bit overly complex?
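
          For reference, a hedged sketch of that alternative, using Configuration#getPropertySources to see where a key's value came from (assumption: a value whose only recorded source is hdfs-default.xml was not set by the user):

          String[] sources = conf.getPropertySources(
              DFSConfigKeys.DFS_BLOCK_INVALIDATE_LIMIT_KEY);
          // Assumption: if the only recorded source is hdfs-default.xml,
          // the user did not set the property explicitly.
          boolean userConfigured = sources != null
              && !(sources.length == 1 && sources[0].contains("hdfs-default.xml"));
          int limit = userConfigured
              ? conf.getInt(DFSConfigKeys.DFS_BLOCK_INVALIDATE_LIMIT_KEY,
                  DFSConfigKeys.DFS_BLOCK_INVALIDATE_LIMIT_DEFAULT)
              : Math.max(20 * (int) heartbeatIntervalSeconds,
                  DFSConfigKeys.DFS_BLOCK_INVALIDATE_LIMIT_DEFAULT);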

          Please let me know your thought.

          Thanks

          Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 17s Docker mode activated.
                Prechecks
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
                trunk Compile Tests
          +1 mvninstall 15m 1s trunk passed
          +1 compile 0m 54s trunk passed
          +1 checkstyle 0m 38s trunk passed
          +1 mvnsite 1m 2s trunk passed
          -1 findbugs 1m 49s hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings.
          +1 javadoc 0m 42s trunk passed
                Patch Compile Tests
          +1 mvninstall 0m 57s the patch passed
          +1 compile 0m 53s the patch passed
          +1 javac 0m 53s the patch passed
          +1 checkstyle 0m 36s hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 41 unchanged - 2 fixed = 41 total (was 43)
          +1 mvnsite 0m 57s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 1m 55s the patch passed
          +1 javadoc 0m 42s the patch passed
                Other Tests
          -1 unit 67m 58s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 15s The patch does not generate ASF License warnings.
          95m 57s



          Reason Tests
          Failed junit tests hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150
            hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks
            hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010
            hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy
            hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-12082
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12879598/HDFS-12082.004.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 43238c93cee9 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 0fd6d0f
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/20496/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/20496/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/20496/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/20496/console
          Powered by Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          Arpit Agarwal added a comment -

          Thanks Weiwei Yang. The v4 patch with the fix lgtm. Also thanks for adding a unit test.

          TestUnderReplicatedBlocks passed locally. I will commit your patch.

          Arpit Agarwal added a comment -

          I've committed this. Thanks for the contribution Weiwei Yang and thanks for the code review Chen Liang.

          Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #12079 (See https://builds.apache.org/job/Hadoop-trunk-Commit/12079/)
          HDFS-12082. BlockInvalidateLimit value is incorrectly set after namenode (arp: rev 3e23415a92d43ce8818124f0b180227a52a33eaf)

          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeReconfigure.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
          Weiwei Yang added a comment -

          Thanks Arpit Agarwal for the help.


            People

            • Assignee: Weiwei Yang
            • Reporter: Weiwei Yang
            • Votes: 0
            • Watchers: 5
