Hadoop HDFS / HDFS-6682

Add a metric to expose the timestamp of the oldest under-replicated block

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      In the following case, the data in HDFS is lost and the client needs to put the same file again:

      1. A client puts a file into HDFS.
      2. A DataNode crashes before replicating a block of the file to other DataNodes.

      I propose a metric to expose the timestamp of the oldest under-replicated/corrupt block. That way, a client can know which files to retain for a retry.

      Attachments

      1. HDFS-6682.002.patch (10 kB, Akira Ajisaka)
      2. HDFS-6682.003.patch (11 kB, Akira Ajisaka)
      3. HDFS-6682.004.patch (9 kB, Akira Ajisaka)
      4. HDFS-6682.005.patch (10 kB, Akira Ajisaka)
      5. HDFS-6682.006.patch (10 kB, Akira Ajisaka)
      6. HDFS-6682.patch (9 kB, Akira Ajisaka)

        Issue Links

          Activity

          Akira Ajisaka added a comment -

          Attached a patch to expose the 'TimeOfTheOldestBlockToBeReplicated' metric. The metric shows the timestamp of the oldest under-replicated/corrupt block.
          I built Hadoop with the patch and verified that the metric can be obtained via JMX and FileSink.
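
For illustration, a consumer of this metric could convert the reported timestamp into an age and compare it against a threshold. The sketch below is not part of the patch. It assumes the value is in milliseconds since the Unix epoch and that 0 means no block is waiting; the class and method names are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper for interpreting the metric value; not from the patch. */
final class OldestBlockAge {
  private OldestBlockAge() {}

  /**
   * Returns how long the oldest block has been waiting for replication.
   * Assumption: metricMillis is epoch milliseconds, and 0 means "no such block".
   */
  static Duration age(long metricMillis, Instant now) {
    if (metricMillis <= 0L) {
      return Duration.ZERO; // nothing is waiting (assumed convention)
    }
    Duration d = Duration.between(Instant.ofEpochMilli(metricMillis), now);
    // Guard against clock skew between the metric source and the reader.
    return d.isNegative() ? Duration.ZERO : d;
  }

  /** True if the oldest block has waited longer than the given threshold. */
  static boolean exceeds(long metricMillis, Instant now, Duration threshold) {
    return age(metricMillis, now).compareTo(threshold) > 0;
  }
}
```

A monitoring script could call `OldestBlockAge.exceeds(value, Instant.now(), Duration.ofHours(1))` after reading the value from whichever sink it uses.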

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12656472/HDFS-6682.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.ipc.TestIPC
          org.apache.hadoop.fs.TestSymlinkLocalFSFileSystem
          org.apache.hadoop.fs.TestSymlinkLocalFSFileContext

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7383//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7383//console

          This message is automatically generated.

          Akira Ajisaka added a comment -

          These failed tests are not related to the patch. They were fixed by INFRA-8097 and HADOOP-10866.

          Akira Ajisaka added a comment -

          Aaron T. Myers, would you please review this patch?

          Akira Ajisaka added a comment -

          Attaching v2 patch:

          • Modified the HashMap to keep the oldest timestamp, so that it can be retrieved without sorting
          • Used block_id (long) instead of block (Block) as the key of the HashMap
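
The v2 idea can be sketched roughly as follows. This is not the code from the patch: the class name, the method names, and the convention of returning 0 when nothing is tracked are all assumptions. A plain HashMap keyed by block ID stores each timestamp, and a cached field holds the current minimum, so reading the oldest timestamp needs no sorting. Only removing the entry that holds the minimum triggers a rescan.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical tracker illustrating a HashMap with a cached minimum. */
final class CachedOldestTimestamps {
  // Key: block ID (long), value: time the block was first recorded.
  private final Map<Long, Long> timestamps = new HashMap<>();
  // Cached minimum of the values; 0 means "nothing tracked" (assumed convention).
  private long oldest = 0L;

  /** Records a block once; an already-tracked block keeps its first timestamp. */
  synchronized void add(long blockId, long nowMillis) {
    boolean inserted = timestamps.putIfAbsent(blockId, nowMillis) == null;
    if (inserted && (timestamps.size() == 1 || nowMillis < oldest)) {
      oldest = nowMillis;
    }
  }

  /** Stops tracking a block; rescans only if it held the cached minimum. */
  synchronized void remove(long blockId) {
    Long removed = timestamps.remove(blockId);
    if (removed != null && removed == oldest) {
      oldest = timestamps.isEmpty() ? 0L : Collections.min(timestamps.values());
    }
  }

  /** Returns the cached oldest timestamp in constant time. */
  synchronized long oldest() {
    return oldest;
  }
}
```

The rescan in `remove` is the cost this design still pays whenever the oldest entry leaves the map.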
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12691106/HDFS-6682.002.patch
          against trunk revision ae91b13.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          -1 findbugs. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9164//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/9164//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9164//console

          This message is automatically generated.

          Akira Ajisaka added a comment -

          Fixed a findbugs warning.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12691345/HDFS-6682.003.patch
          against trunk revision ae91b13.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.hdfs.server.namenode.ha.TestBootstrapStandbyWithQJM

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9165//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9165//console

          This message is automatically generated.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12691345/HDFS-6682.003.patch
          against trunk revision af9d4fe.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/10240//console

          This message is automatically generated.

          Allen Wittenauer added a comment -

          I'd still really like to have this...

          Hadoop QA added a comment -

          -1 overall

          Vote  Subsystem  Runtime  Comment
          -1    patch      0m 1s    The patch command could not apply the patch during dryrun.

          Subsystem       Report/Notes
          Patch URL       http://issues.apache.org/jira/secure/attachment/12691345/HDFS-6682.003.patch
          Optional Tests  site javadoc javac unit findbugs checkstyle
          git revision    trunk / 5137b38
          Console output  https://builds.apache.org/job/PreCommit-HDFS-Build/11773/console

          This message was automatically generated.

          Akira Ajisaka added a comment -

          004 patch

          • Rebased for the latest trunk.
          • Use LinkedHashMap to keep the insertion order. That way we can avoid calling Collections.min to get the smallest timestamp.
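
A minimal sketch of the LinkedHashMap approach, assuming timestamps never decrease from one insertion to the next, so the first entry in insertion order is also the oldest. The class and method names are hypothetical, and returning 0 for an empty map is an assumed convention rather than something stated in this comment.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical tracker illustrating the insertion-order idea; not the patch's class. */
final class InsertionOrderedTimestamps {
  // LinkedHashMap iterates in insertion order by default, so the first
  // value is the earliest recorded timestamp if timestamps only grow.
  private final Map<Long, Long> timestamps = new LinkedHashMap<>();

  /** Records a block once; re-adding a tracked block keeps its position and time. */
  synchronized void add(long blockId, long nowMillis) {
    timestamps.putIfAbsent(blockId, nowMillis);
  }

  /** Stops tracking a block, for example once it is fully replicated. */
  synchronized void remove(long blockId) {
    timestamps.remove(blockId);
  }

  /** Returns the oldest timestamp without scanning, or 0 if nothing is tracked. */
  synchronized long oldest() {
    Iterator<Long> it = timestamps.values().iterator();
    return it.hasNext() ? it.next() : 0L;
  }
}
```

Compared with calling `Collections.min` over all values, reading the first entry does not depend on how many blocks are tracked. The trade-off is that correctness relies on insertion order matching timestamp order.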
          Akira Ajisaka added a comment -

          005 patch

          • Updated the document.
          Hadoop QA added a comment -

          -1 overall

          Vote  Subsystem        Runtime   Comment
          0     pre-patch        22m 59s   Pre-patch trunk compilation is healthy.
          +1    @author          0m 0s     The patch does not contain any @author tags.
          +1    tests included   0m 0s     The patch appears to include 1 new or modified test files.
          +1    javac            7m 57s    There were no new javac warning messages.
          +1    javadoc          9m 50s    There were no new javadoc warning messages.
          +1    release audit    0m 22s    The applied patch does not increase the total number of release audit warnings.
          +1    site             3m 9s     Site still builds.
          -1    checkstyle       2m 41s    The applied patch generated 4 new checkstyle issues (total was 457, now 458).
          +1    whitespace       0m 1s     The patch has no lines that end in whitespace.
          +1    install          1m 24s    mvn install still works.
          +1    eclipse:eclipse  0m 34s    The patch built with eclipse:eclipse.
          +1    findbugs         4m 41s    The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1    common tests     22m 8s    Tests passed in hadoop-common.
          -1    hdfs tests       160m 3s   Tests failed in hadoop-hdfs.
                                 235m 52s

          Reason             Tests
          Failed unit tests  hadoop.hdfs.server.namenode.ha.TestBootstrapStandby
                             hadoop.hdfs.TestDistributedFileSystem

          Subsystem               Report/Notes
          Patch URL               http://issues.apache.org/jira/secure/attachment/12746488/HDFS-6682.005.patch
          Optional Tests          site javadoc javac unit findbugs checkstyle
          git revision            trunk / 4025326
          checkstyle              https://builds.apache.org/job/PreCommit-HDFS-Build/11783/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
          hadoop-common test log  https://builds.apache.org/job/PreCommit-HDFS-Build/11783/artifact/patchprocess/testrun_hadoop-common.txt
          hadoop-hdfs test log    https://builds.apache.org/job/PreCommit-HDFS-Build/11783/artifact/patchprocess/testrun_hadoop-hdfs.txt
          Test Results            https://builds.apache.org/job/PreCommit-HDFS-Build/11783/testReport/
          Java                    1.7.0_55
          uname                   Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output          https://builds.apache.org/job/PreCommit-HDFS-Build/11783/console

          This message was automatically generated.

          Akira Ajisaka added a comment -

          006 patch

          • Fixes checkstyle warnings.
          Hadoop QA added a comment -

          -1 overall

          Vote  Subsystem        Runtime   Comment
          0     pre-patch        22m 5s    Pre-patch trunk compilation is healthy.
          +1    @author          0m 0s     The patch does not contain any @author tags.
          +1    tests included   0m 0s     The patch appears to include 1 new or modified test files.
          +1    javac            7m 39s    There were no new javac warning messages.
          +1    javadoc          9m 43s    There were no new javadoc warning messages.
          +1    release audit    0m 24s    The applied patch does not increase the total number of release audit warnings.
          +1    site             3m 1s     Site still builds.
          -1    checkstyle       2m 28s    The applied patch generated 2 new checkstyle issues (total was 457, now 456).
          +1    whitespace       0m 0s     The patch has no lines that end in whitespace.
          +1    install          1m 22s    mvn install still works.
          +1    eclipse:eclipse  0m 32s    The patch built with eclipse:eclipse.
          +1    findbugs         4m 19s    The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1    common tests     22m 17s   Tests passed in hadoop-common.
          -1    hdfs tests       160m 57s  Tests failed in hadoop-hdfs.
                                 234m 51s

          Reason             Tests
          Failed unit tests  hadoop.hdfs.TestDistributedFileSystem

          Subsystem               Report/Notes
          Patch URL               http://issues.apache.org/jira/secure/attachment/12746704/HDFS-6682.006.patch
          Optional Tests          site javadoc javac unit findbugs checkstyle
          git revision            trunk / ee98d63
          checkstyle              https://builds.apache.org/job/PreCommit-HDFS-Build/11795/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
          hadoop-common test log  https://builds.apache.org/job/PreCommit-HDFS-Build/11795/artifact/patchprocess/testrun_hadoop-common.txt
          hadoop-hdfs test log    https://builds.apache.org/job/PreCommit-HDFS-Build/11795/artifact/patchprocess/testrun_hadoop-hdfs.txt
          Test Results            https://builds.apache.org/job/PreCommit-HDFS-Build/11795/testReport/
          Java                    1.7.0_55
          uname                   Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output          https://builds.apache.org/job/PreCommit-HDFS-Build/11795/console

          This message was automatically generated.

          Akira Ajisaka added a comment -

          The checkstyle issues and the test failure are unrelated to the patch.

          Allen Wittenauer added a comment -

          +1 lgtm

          Akira Ajisaka added a comment -

          Committed the latest patch to trunk and branch-2. Thanks Allen for review!

          Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #8210 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8210/)
          HDFS-6682. Add a metric to expose the timestamp of the oldest under-replicated block. (aajisaka) (aajisaka: rev 02c01815eca656814febcdaca6115e5f53b9c746)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestUnderReplicatedBlocks.java
          • hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderReplicatedBlocks.java
          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #996 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/996/)
          HDFS-6682. Add a metric to expose the timestamp of the oldest under-replicated block. (aajisaka) (aajisaka: rev 02c01815eca656814febcdaca6115e5f53b9c746)

          • hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestUnderReplicatedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderReplicatedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #266 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/266/)
          HDFS-6682. Add a metric to expose the timestamp of the oldest under-replicated block. (aajisaka) (aajisaka: rev 02c01815eca656814febcdaca6115e5f53b9c746)

          • hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestUnderReplicatedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderReplicatedBlocks.java
          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2193 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2193/)
          HDFS-6682. Add a metric to expose the timestamp of the oldest under-replicated block. (aajisaka) (aajisaka: rev 02c01815eca656814febcdaca6115e5f53b9c746)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestUnderReplicatedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderReplicatedBlocks.java
          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #255 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/255/)
          HDFS-6682. Add a metric to expose the timestamp of the oldest under-replicated block. (aajisaka) (aajisaka: rev 02c01815eca656814febcdaca6115e5f53b9c746)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderReplicatedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestUnderReplicatedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2212 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2212/)
          HDFS-6682. Add a metric to expose the timestamp of the oldest under-replicated block. (aajisaka) (aajisaka: rev 02c01815eca656814febcdaca6115e5f53b9c746)

          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestUnderReplicatedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          • hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderReplicatedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #263 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/263/)
          HDFS-6682. Add a metric to expose the timestamp of the oldest under-replicated block. (aajisaka) (aajisaka: rev 02c01815eca656814febcdaca6115e5f53b9c746)

          • hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestUnderReplicatedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderReplicatedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          Yi Liu added a comment -

          -1 for the patch. Sorry I'm late. Akira Ajisaka, could we revert it?

          /** Keep timestamp when a block is put into the queue. */
          private final Map<BlockInfo, Long> timestampsMap =
              Collections.synchronizedMap(new LinkedHashMap<BlockInfo, Long>());

          The patch adds a synchronized LinkedHashMap that records every under-replicated block together with the time it was added to the queue. This increases the NN's memory usage and slightly hurts performance; the map may become large, especially while some DNs are being decommissioned.
          On the other hand, I honestly don't see much real value in adding this metric. From my point of view the disadvantages far outweigh the benefit, so please revert it. Thanks.
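To make the memory concern concrete, here is a minimal sketch of the bookkeeping pattern that the quoted field implies. It is not the HDFS-6682 patch: the class `OldestEntryTracker` and its methods are hypothetical, and a generic key stands in for `BlockInfo`. The point is that one map entry is held per queued block, so the footprint grows linearly with the backlog.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for per-block timestamp bookkeeping; not the actual patch. */
final class OldestEntryTracker<K> {
  // Insertion-ordered map: the first entry is the oldest key still queued.
  private final Map<K, Long> timestamps =
      Collections.synchronizedMap(new LinkedHashMap<K, Long>());

  /** Records the enqueue time only the first time a key is seen. */
  void add(K key, long nowMillis) {
    timestamps.putIfAbsent(key, nowMillis);
  }

  /** Drops the key once it leaves the queue. */
  void remove(K key) {
    timestamps.remove(key);
  }

  /** Returns the oldest enqueue time, or 0 if nothing is queued. */
  long oldestTimestamp() {
    // Iterating a synchronizedMap view requires holding the map's lock manually.
    synchronized (timestamps) {
      Iterator<Long> it = timestamps.values().iterator();
      return it.hasNext() ? it.next() : 0L;
    }
  }

  /** Number of tracked entries; this is what grows with the backlog. */
  int size() {
    return timestamps.size();
  }
}
```

Reading the oldest timestamp is cheap, but every add and remove takes the map lock and every queued block costs an extra entry, which is the overhead being objected to.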

          Akira Ajisaka added a comment -

          -1 for the patch. Sorry I come late. Akira AJISAKA, could we revert it?

          Agree that this patch affects performance. I'll revert it.

          On the other hand, honestly I don't see real value of adding this metric. The disadvantage is much larger than the benefit from my point of view

          Do you have any ideas for dealing with the issue in the description without using this metric?

          Akira Ajisaka added a comment -

          Reverted this patch from trunk and branch-2.

          Hudson added a comment -

          SUCCESS: Integrated in Hadoop-trunk-Commit #8237 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8237/)
          Revert "HDFS-6682. Add a metric to expose the timestamp of the oldest under-replicated block. (aajisaka)" (aajisaka: rev 2a1d656196cf9750fa482cb10893684e8a2ce7c3)

          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestUnderReplicatedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          • hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderReplicatedBlocks.java
          Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #271 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/271/)
          Revert "HDFS-6682. Add a metric to expose the timestamp of the oldest under-replicated block. (aajisaka)" (aajisaka: rev 2a1d656196cf9750fa482cb10893684e8a2ce7c3)

          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestUnderReplicatedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderReplicatedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Yarn-trunk #1001 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/1001/)
          Revert "HDFS-6682. Add a metric to expose the timestamp of the oldest under-replicated block. (aajisaka)" (aajisaka: rev 2a1d656196cf9750fa482cb10893684e8a2ce7c3)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderReplicatedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestUnderReplicatedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2198 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2198/)
          Revert "HDFS-6682. Add a metric to expose the timestamp of the oldest under-replicated block. (aajisaka)" (aajisaka: rev 2a1d656196cf9750fa482cb10893684e8a2ce7c3)

          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestUnderReplicatedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderReplicatedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md
          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #260 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/260/)
          Revert "HDFS-6682. Add a metric to expose the timestamp of the oldest under-replicated block. (aajisaka)" (aajisaka: rev 2a1d656196cf9750fa482cb10893684e8a2ce7c3)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestUnderReplicatedBlocks.java
          • hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderReplicatedBlocks.java
          Allen Wittenauer added a comment -

          On the other hand, honestly I don't see real value of adding this metric.

          There's a big advantage to having this metric: it's extremely useful to know how backlogged the replication queue is as a determinant of namenode health on extremely large clusters.

          Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #268 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/268/)
          Revert "HDFS-6682. Add a metric to expose the timestamp of the oldest under-replicated block. (aajisaka)" (aajisaka: rev 2a1d656196cf9750fa482cb10893684e8a2ce7c3)

          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestUnderReplicatedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderReplicatedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          • hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2217 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2217/)
          Revert "HDFS-6682. Add a metric to expose the timestamp of the oldest under-replicated block. (aajisaka)" (aajisaka: rev 2a1d656196cf9750fa482cb10893684e8a2ce7c3)

          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestUnderReplicatedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderReplicatedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          • hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          Yi Liu added a comment -

          There's a big advantage to having this metric: it's extremely useful to know how backlogged the replication queue is as a determinant of namenode health on extremely large clusters.

          We have many ways to tell whether the NameNode is healthy or under heavy load.
          It's not worth it.

          Thanks Akira for reverting.

          Allen Wittenauer added a comment -

          We have many ways to know about namenode health or in heavy load. It's not worth.

          They don't work for this use case. We see this every day. NN is healthy except the replication queue is backed up.

          Akira Ajisaka added a comment -

          We have many ways to know about namenode health or in heavy load.

          This metric shows the health not only of the NameNode but also of the entire HDFS cluster.

          Andrew Wang added a comment -

          Wondering if there's a lighter-weight metric we could compute instead. Allen Wittenauer, is the entire queue backed up, or are there a few super-old replicas that never get cleared? If it's the entire queue, maybe the rate of addition to and removal from UnderReplicatedBlocks would be similarly useful, in addition to the total size. We could provide sliding-window metrics like NNTop. Doing this per DN could also be interesting.
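As a rough illustration of the lighter-weight idea, the sketch below keeps only two counters and derives add/remove rates between snapshots, so its memory use is constant regardless of queue size. The class `QueueChurnCounters` and its method names are hypothetical; nothing here is taken from the Hadoop code base, and a true sliding window as in NNTop would need more machinery than this.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical constant-memory counters for queue churn; names are illustrative only. */
final class QueueChurnCounters {
  private final LongAdder added = new LongAdder();
  private final LongAdder removed = new LongAdder();
  // Totals seen at the previous snapshot, guarded by the monitor of this object.
  private long lastAdded;
  private long lastRemoved;

  /** Call when a block enters the queue. */
  void onAdd() {
    added.increment();
  }

  /** Call when a block leaves the queue. */
  void onRemove() {
    removed.increment();
  }

  /** Current backlog implied by the two counters. */
  long backlog() {
    return added.sum() - removed.sum();
  }

  /** Returns {additionsPerSecond, removalsPerSecond} since the previous snapshot. */
  synchronized double[] snapshotRates(double elapsedSeconds) {
    if (elapsedSeconds <= 0) {
      throw new IllegalArgumentException("elapsedSeconds must be positive");
    }
    long a = added.sum();
    long r = removed.sum();
    double[] rates = {(a - lastAdded) / elapsedSeconds, (r - lastRemoved) / elapsedSeconds};
    lastAdded = a;
    lastRemoved = r;
    return rates;
  }
}
```

A metrics thread would call `snapshotRates` once per reporting interval; a removal rate that stays below the addition rate indicates a growing backlog.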

          Allen Wittenauer added a comment -

          We have no insight into how long a given replication has been hanging around, so there is no way to really answer that question. We know the queue gets backed up during cascading DN failure events (thanks, very slow NM memory checker + fast-acting bad job + Linux OOM killer!), so I was always under the impression that the whole queue is just super busy, rather than old entries never being cleared. A rate might be useful to at least tell us whether it is stuck and/or to give a projection of how long the queue will remain behind.
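The projection mentioned here is simple arithmetic once a backlog size and the two rates are known: the time to drain is the backlog divided by the net removal rate. The helper below is a hypothetical sketch under that assumption (steady rates), not an existing Hadoop API.

```java
/** Hypothetical helper: projects how long a backlog takes to drain at steady rates. */
final class DrainEstimate {
  private DrainEstimate() {}

  /**
   * Returns the seconds until the backlog reaches zero, or positive infinity
   * when removals do not outpace additions (the queue is stuck or growing).
   */
  static double secondsToDrain(long backlog, double addsPerSec, double removesPerSec) {
    if (backlog <= 0) {
      return 0.0;
    }
    double netRemovalRate = removesPerSec - addsPerSec;
    return netRemovalRate > 0 ? backlog / netRemovalRate : Double.POSITIVE_INFINITY;
  }
}
```

For example, a backlog of 6000 blocks with 50 additions and 150 removals per second drains in 6000 / (150 - 50) = 60 seconds, while equal rates never drain.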

          Andrew Wang added a comment -

          Cool, thanks Allen. Maybe we file a new JIRA for this? It would also be useful when tuning replication rate limiting.

          Code-wise I think it would look a lot like this patch, unless we want sliding window fanciness. A count and rate of enqueued vs. processed would be a good start though.

          Yi Liu added a comment - - edited

          Thanks Allen, Andrew and Akira for the discussion.

          The original intention, solving the issue, is good; thank you for working on it. As for the discussion itself, Andrew's suggestion is good. Another option is to record the time of the latest call to UnderReplicatedBlocks#chooseUnderReplicatedBlocks. We already have metrics for underReplicatedBlocksCount/pendingReplicationBlocksCount/scheduledReplicationBlocksCount, so together these would tell us whether, and how long ago, the under-replicated list was last processed, if we really want to see that. My point is that it's not worth recording the whole under-replicated list for this metric.

          On the other hand, we have UnderReplicatedBlocks and PendingReplicationBlocks, right? The replication monitor thread periodically picks up some under-replicated blocks, so unless the NN stalls (e.g., during a full GC), replication work will always be computed in some CPU time slice. Of course it could be slow, since the NN may have many other things to handle (e.g., many requests), but if the NN is slow we have many ways to know that. As for Akira's comment that the metric also reflects the entire HDFS cluster: we are talking about DataNodes here, and if we want to judge cluster health from the replication point of view, I think the more accurate thing to record is the number of pending replication blocks (PendingReplicationBlocks) that time out, which happens if the network is very busy or the target DNs are corrupted. UnderReplicatedBlocks can't stand in for that.

          So if we want some metrics about block replication in the NN, let's find a lightweight way as suggested. Thanks.
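The two alternatives proposed in this comment both need only a single number each. The sketch below shows one way that could look; the class `ReplicationProgressGauges` and its methods are hypothetical names chosen for illustration, and the hook points (where a scan or a timeout would be reported) are assumptions rather than actual HDFS call sites.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical gauges: time of the last queue scan and count of timed-out replications. */
final class ReplicationProgressGauges {
  private final AtomicLong lastScanMillis = new AtomicLong(0L);
  private final AtomicLong timedOutCount = new AtomicLong(0L);

  /** Assumed to be called each time the under-replicated queue is scanned. */
  void markScan(long nowMillis) {
    lastScanMillis.set(nowMillis);
  }

  /** Assumed to be called each time a pending replication times out. */
  void onPendingTimeout() {
    timedOutCount.incrementAndGet();
  }

  /** Milliseconds since the last recorded scan, or -1 if none has been recorded. */
  long millisSinceLastScan(long nowMillis) {
    long last = lastScanMillis.get();
    return last == 0L ? -1L : nowMillis - last;
  }

  /** Total number of timeouts recorded so far. */
  long timedOut() {
    return timedOutCount.get();
  }
}
```

Both values cost a fixed 16 bytes of state no matter how many blocks are queued, which is the trade-off being argued for against recording a timestamp per block.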

          Akira Ajisaka added a comment -

          Thanks Allen, Andrew, and Yi for the discussion.

          whole queue is super busy

          As Andrew suggested, recording the rate of addition/removal from UnderReplicatedBlocks would be useful and straightforward to me.

          old ones never cleared

          I agree with Yi that recording the number of timed-out pending replication blocks is useful for gauging cluster health.

          Akira Ajisaka added a comment -

          recording the timeout number of pending replication blocks is useful to get the cluster health.

          Filed HDFS-10341.

          Akira Ajisaka added a comment -

          Closing this issue since HDFS-10341 was fixed.

          As Andrew suggested, recording the rate of addition/removal from UnderReplicatedBlocks would be useful and straightforward to me.

          If someone needs this, please create a separate JIRA and link it to this issue.

          Andrew Wang added a comment -

          Thanks Akira. I filed HDFS-11024 for adding rate metrics for recovery work.


            People

            • Assignee:
              Akira Ajisaka
              Reporter:
              Akira Ajisaka
            • Votes:
              1
              Watchers:
              9
