Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.17.2
    • Fix Version/s: 0.20.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change, Reviewed
    • Release Note:
      Modified dfsadmin -report to report under-replicated blocks, blocks with corrupt replicas, and missing blocks.

      Description

      A large number of datanodes were marked dead because network problems caused heartbeat timeouts, although the datanodes themselves were fine.

      Many processes started to fail because of the corrupted filesystem.

      To catch and diagnose such problems faster, the namenode should detect the corruption automatically and provide a way to alert operations. At a minimum, it should show the corruption on the GUI.

      1. HADOOP-4103-branch-20.patch
        18 kB
        Raghu Angadi
      2. HADOOP-4103.patch
        19 kB
        Raghu Angadi
      3. HADOOP-4103.patch
        23 kB
        Raghu Angadi
      4. HADOOP-4103.patch
        22 kB
        Raghu Angadi
      5. HADOOP-4103.patch
        22 kB
        Raghu Angadi

        Activity

        Raghu Angadi added a comment -

        > If I run the command twice successively within 10 seconds, each run shows different values, sometimes 20, sometimes 48, etc.

        Does this always happen?

        If you are seeing this only now and then, it is expected. Note that missing blocks are detected only by the replication monitor, which iterates once every few (5?) minutes. For an accurate count you could use the new RPC you added in another jira.

        It is certainly not because of locking. The unlocked part only computes max(int1, int2), and there are no other consistency requirements on the returned value. Declaring the ints volatile won't help.
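
        The counting scheme described in this comment can be sketched roughly as follows. This is a minimal illustration, not the actual Hadoop code: the class name and the recordMissingBlock() and finishIteration() helpers are hypothetical, and the assumption that the monitor copies the current counter into the previous one at the end of each pass is not stated in this thread.

```java
/** Illustrative stand-in for the two counters discussed above; not the actual Hadoop class. */
class MissingBlockCounterSketch {
    // Both fields are written only by the (single) replication-monitor thread.
    private int missingBlocksInCurIter = 0;
    private int missingBlocksInPrevIter = 0;

    /** Monitor thread: called for each block found with no live replica (hypothetical helper). */
    void recordMissingBlock() {
        missingBlocksInCurIter++;
    }

    /** Monitor thread: called when a full pass completes (assumed behavior). */
    void finishIteration() {
        missingBlocksInPrevIter = missingBlocksInCurIter;
        missingBlocksInCurIter = 0;
    }

    /** Any thread, no lock: the value is approximate while a pass is in progress. */
    int getMissingBlocksCount() {
        return Math.max(missingBlocksInPrevIter, missingBlocksInCurIter);
    }

    public static void main(String[] args) {
        MissingBlockCounterSketch counter = new MissingBlockCounterSketch();

        // First pass is under way: the count grows as the monitor walks the list.
        for (int i = 0; i < 20; i++) {
            counter.recordMissingBlock();
        }
        System.out.println(counter.getMissingBlocksCount()); // 20

        // Second pass has found 5 so far; the previous pass still dominates.
        counter.finishIteration();
        for (int i = 0; i < 5; i++) {
            counter.recordMissingBlock();
        }
        System.out.println(counter.getMissingBlocksCount()); // 20

        // Second pass done: the stale value from the first pass is dropped.
        counter.finishIteration();
        System.out.println(counter.getMissingBlocksCount()); // 5
    }
}
```

        In this sketch the reported value only settles after the monitor has completed a pass, so two reads taken close together may differ whether or not the read is locked.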

        dhruba borthakur added a comment -

        Hi Raghu: I would really appreciate it if you could recall why BlockManager.getMissingBlocksCount() does not do any locking.

        dhruba borthakur added a comment -

        I am seeing that "dfsadmin -report" reports widely fluctuating values for missing blocks. If I run the command twice successively within 10 seconds, each run shows different values, sometimes 20, sometimes 48, etc. Is it because the method BlockManager.getMissingBlocksCount() does not do any locking? Would it help to declare BlockManager.missingBlocksInCurIter and BlockManager.missingBlocksInPrevIter as "volatile"?

        Hudson added a comment -

        Integrated in Hadoop-trunk #778 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/778/ )

        Raghu Angadi added a comment -

        I just committed this.

        Raghu Angadi added a comment -

        A patch for 0.20 is attached. The trunk patch conflicts with 0.20.

        Raghu Angadi added a comment -

        The failed contrib test is a known issue: HADOOP-5068

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12401261/HADOOP-4103.patch
        against trunk revision 749318.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 11 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 2 new Findbugs warnings.

        +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/40/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/40/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/40/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/40/console

        This message is automatically generated.

        Raghu Angadi added a comment -

        If there are no objections, I am planning to commit this to 0.20.

        This is a pretty useful feature for admins and is a pretty safe patch. Please let me know if there are concerns.

        Raghu Angadi added a comment -

        Minor fix to a string in the unit test.

        Raghu Angadi added a comment -

        I forgot to run the test again after changing the patch based on review.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12401076/HADOOP-4103.patch
        against trunk revision 748861.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 11 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 2 new Findbugs warnings.

        +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/26/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/26/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/26/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/26/console

        This message is automatically generated.

        Bill Au added a comment -

        I think this feature is very useful and would like to see it for 0.20 too.

        Raghu Angadi added a comment -

        I hope this gets marked for 0.20. It is pretty safe. Otherwise, I am pretty sure I will have to backport it again in the near future and duplicate the considerable fixed effort associated with a new jira and a commit.

        Suresh Srinivas added a comment -

        +1 for the patch

        Raghu Angadi added a comment -

        Thanks Suresh.

        The attached patch fixes both. The new stat for corrupt blocks is not required since it already exists; I didn't see that earlier.

        Suresh Srinivas added a comment -

        Comments:

        1. DFSAdmin.java: please remove the space before ":" in the newly introduced output.
        2. NameNodeMetrics.numBlocksCorrupted exposes the same data as FSNamesystemMetrics.corruptReplicaBlocks. I am not sure where the new metrics introduced by this patch should go.

        Raghu Angadi added a comment -

        Thanks Suresh.

        Updated patch includes all the suggestions.

        'dfsadmin -report' now prints 3 extra lines, one each for "Under replicated blocks", "Blocks with corrupt replicas", and "Missing blocks". The last two counts should normally be zero. The first count should be low and should keep going down.
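
        As a rough illustration of how an operator script might act on these three lines, the sketch below pulls the counts out of captured report text and flags the two that should normally be zero. The exact "label: number" layout, the class name, and the sample text are assumptions made for this example; only the three labels come from the comment above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for scanning captured 'dfsadmin -report' text. */
class ReportCheckSketch {
    // Labels quoted in the comment above; the surrounding layout is assumed.
    static final String[] LABELS = {
        "Under replicated blocks",
        "Blocks with corrupt replicas",
        "Missing blocks"
    };

    /** Returns the counts found for each known label, in LABELS order. */
    static Map<String, Long> parseCounts(String report) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String label : LABELS) {
            Pattern p = Pattern.compile(
                "^\\s*" + Pattern.quote(label) + "\\s*:\\s*(\\d+)",
                Pattern.MULTILINE | Pattern.CASE_INSENSITIVE);
            Matcher m = p.matcher(report);
            if (m.find()) {
                counts.put(label, Long.parseLong(m.group(1)));
            }
        }
        return counts;
    }

    /** True when either of the counts that should normally be zero is non-zero. */
    static boolean needsAttention(Map<String, Long> counts) {
        return counts.getOrDefault("Blocks with corrupt replicas", 0L) > 0
            || counts.getOrDefault("Missing blocks", 0L) > 0;
    }

    public static void main(String[] args) {
        // Made-up sample text, not real command output.
        String sample = "Under replicated blocks: 7\n"
            + "Blocks with corrupt replicas: 0\n"
            + "Missing blocks: 2\n";
        Map<String, Long> counts = parseCounts(sample);
        System.out.println(counts);
        System.out.println("needs attention: " + needsAttention(counts));
    }
}
```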

        Regarding whether it should be treated as an "incompatible" change: I personally don't think so, but it does not matter either way.

        Suresh Srinivas added a comment -

        1. NamenodeProtocol.getStats(): the method documentation needs to be updated to cover the fourth stat that is being reported.
        2. DFSAdmin.java: remove the space before ":" in "Missing Blocks (approx) : ". Additionally, is it a good idea to print the corrupt block, pending replication, scheduled replication, and under-replicated block counts in the report? Currently, what is printed in the dfsadmin report is also printed in the cluster summary on the namenode web page. It may be a good idea to keep the two consistent.
        3. FSNamesystem.java, computeReplicationWork(): move the added code block that sets missingBlocksInCurIter and missingBlocksInPrevIter to zero above the comments preceding it.

        Would this change be incompatible because of the change in the output of the dfsadmin report command?

        Raghu Angadi added a comment -

        This is the patch for missing block alerts. A user can monitor this in multiple ways:

        1. 'bin/hdfs dfsadmin -report' reports this count.
        2. A warning is shown in red on the NameNode front page.
        3. A new stat is added (for Simon, for example).
          • Also added a stat to report the size of the corrupt replicas map.

        Once the alert is noticed, an admin can run 'dfsadmin -metasave' to find out which specific blocks are missing. 'metasave' is improved a bit to list replica info for each block in the 'neededReplication' list, and the line for a missing block contains the word "MISSING".
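
        For illustration only, picking the missing blocks out of a saved metasave file could look like the sketch below. The class name and the sample lines are invented for this example; the only detail taken from the comment above is that the line for a missing block contains the word "MISSING".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical filter over the lines of a saved metasave file. */
class MetasaveFilterSketch {
    /** Keeps only the lines that carry the "MISSING" marker. */
    static List<String> missingBlockLines(List<String> lines) {
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            if (line.contains("MISSING")) {
                hits.add(line.trim());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Invented sample lines; the real file layout may differ.
        List<String> sample = Arrays.asList(
            "blk_1001: replicas=1",
            "  blk_1002: MISSING replicas=0",
            "blk_1003: replicas=2");
        for (String hit : missingBlockLines(sample)) {
            System.out.println(hit);
        }
    }
}
```

        A real script would read the lines from the saved file, for example with java.nio.file.Files.readAllLines, before passing them to the filter.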

        This is a very non-intrusive change, thus fairly safe for backporting. No new state or data structures for NN to track.

        Raghu Angadi added a comment - - edited

        (Edit : formatting only)

        The scope of the fix is narrowed to the following:

        • The NameNode web UI shows a warning (probably in red) if there are any missing blocks.
          • Will most likely add Simon stats for such a number.
        • 'dfsadmin -metasave' can be used to find all the missing blocks.
          • A later jira will enhance -metasave or add a different command that is more user friendly. Currently -metasave is mainly meant for developers.

        For this to be a straightforward fix, I need to make one policy change: currently, if a block does not have any good replicas left, it is not included in the "neededReplications" list. I think this was done mainly as an "optimization". But a cluster should not have any blocks in this state, and even the name 'neededReplications' implies that such blocks should be included. It would be better if I don't need to add another list that needs to be maintained.
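
        The policy change can be pictured with a toy model. The sketch below is not the NameNode data structure: the class, its methods, and the string block ids are all hypothetical. It only shows that once blocks with zero good replicas are kept in the same list, the missing-block count falls out of that list without a second structure.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of a "needed replications" list; names are illustrative only. */
class NeededReplicationsSketch {
    // Block id -> number of good replicas currently known.
    private final Map<String, Integer> needed = new LinkedHashMap<>();

    /** Tracks the block whenever it is below its target, including the zero-replica case. */
    void update(String blockId, int goodReplicas, int expectedReplicas) {
        if (goodReplicas < expectedReplicas) {
            needed.put(blockId, goodReplicas);
        } else {
            needed.remove(blockId);
        }
    }

    /** Missing blocks are simply the tracked blocks with no good replica. */
    int countMissing() {
        int missing = 0;
        for (int replicas : needed.values()) {
            if (replicas == 0) {
                missing++;
            }
        }
        return missing;
    }

    int size() {
        return needed.size();
    }

    public static void main(String[] args) {
        NeededReplicationsSketch list = new NeededReplicationsSketch();
        list.update("blk_1", 3, 3); // fully replicated, not tracked
        list.update("blk_2", 1, 3); // under replicated
        list.update("blk_3", 0, 3); // no good replicas left, now tracked as well
        System.out.println("tracked=" + list.size() + " missing=" + list.countMissing());
    }
}
```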

        Raghu Angadi added a comment -

        I am thinking of implementing a background fsck on the NameNode. This will share or reuse most of the code with the current Fsck. The extra features will let an admin quickly check whether there is something odd (e.g. the ability to list the last 100 or so blocks in an inconsistent state).

        Based on this background check, there could be further improvements over time, such as monitoring more alarms and reducing the latency of detection.

        This feature will be optional. The scan period could be around a day.
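
        One possible shape for such a background check is sketched below, assuming a periodic task that remembers only the most recent inconsistent blocks. Everything here is hypothetical (the class, the Supplier standing in for the scan, the limit of 100 taken from the "last 100 or so" remark); it is not how Fsck is implemented.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical periodic checker that keeps the most recent inconsistent block ids. */
class BackgroundCheckSketch {
    private static final int MAX_RECENT = 100;

    private final Deque<String> recent = new ArrayDeque<>();
    // Stand-in for the real scan: returns the ids found inconsistent in one pass.
    private final Supplier<List<String>> scanner;

    BackgroundCheckSketch(Supplier<List<String>> scanner) {
        this.scanner = scanner;
    }

    /** Runs one scan and keeps at most MAX_RECENT of the newest results. */
    synchronized void runOnce() {
        for (String id : scanner.get()) {
            if (recent.size() == MAX_RECENT) {
                recent.removeFirst();
            }
            recent.addLast(id);
        }
    }

    /** Snapshot for an admin to inspect, oldest first. */
    synchronized List<String> recentInconsistentBlocks() {
        return new ArrayList<>(recent);
    }

    /** Optional: schedule the scan, waiting one full period between runs. */
    ScheduledExecutorService start(long period, TimeUnit unit) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        executor.scheduleWithFixedDelay(this::runOnce, period, period, unit);
        return executor;
    }

    public static void main(String[] args) {
        BackgroundCheckSketch check =
            new BackgroundCheckSketch(() -> Arrays.asList("blk_7", "blk_9"));
        check.runOnce(); // run one pass directly instead of waiting a day
        System.out.println(check.recentInconsistentBlocks());

        // Scheduling with a period of about a day, then stopping right away for the demo.
        ScheduledExecutorService executor = check.start(1, TimeUnit.DAYS);
        executor.shutdownNow();
    }
}
```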


          People

          • Assignee:
            Raghu Angadi
            Reporter:
            Christian Kunz
          • Votes:
            0
            Watchers:
            4

            Dates

            • Created:
              Updated:
              Resolved:
