Hadoop Common / HADOOP-3193

Discovery of corrupt block reported in name node log

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.18.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change, Reviewed
    • Release Note:
      Added reporter to FSNamesystem stateChangeLog, and a new metric to track the number of corrupted replicas.

      Description

      Any discovery of a corrupt/unreadable block must be reported in the name node log.

      1. 3193-2.patch
        3 kB
        Chris Douglas
      2. 3193-1.patch
        2 kB
        Chris Douglas
      3. 3193-0.patch
        2 kB
        Chris Douglas

          Activity

          dhruba borthakur added a comment -

          When a client discovers a corrupt block, it reports it to the namenode. The namenode then writes a "ReportBadBlock" message to the namenode log. One improvement would be to enhance the log message to print the blockId(s) as well.

          Another improvement would be to report the number of corrupted blocks through the HadoopMetrics API.
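
As a rough illustration of the first suggestion, the sketch below builds a single log line that names every reported block. The class `BadBlockLogFormat`, its `format` method, the message wording, and the reporter string are all hypothetical and are not taken from the patch; only the `blk_<id>` naming follows the usual HDFS block naming.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper for illustration only; not part of Hadoop or of the 3193 patches.
final class BadBlockLogFormat {
    private BadBlockLogFormat() {}

    /** Builds one log line that names the reporter and every reported block ID. */
    static String format(String reporter, List<Long> blockIds) {
        StringJoiner ids = new StringJoiner(", ", "[", "]");
        for (long id : blockIds) {
            ids.add("blk_" + id); // HDFS-style block name; the rest of the wording is assumed
        }
        return "reportBadBlocks from " + reporter + ": "
                + blockIds.size() + " corrupt block(s) " + ids;
    }

    public static void main(String[] args) {
        // Example reporter name and block IDs are made up.
        System.out.println(format("client-10.0.0.7", Arrays.asList(1234L, 5678L)));
    }
}
```

Putting all IDs on one line keeps a multi-block report searchable with a single grep for any one of its block names.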

          dhruba borthakur added a comment -

          I am marking this as an incompatible change, especially because a new HadoopMetric config file needs to be deployed to existing clusters to display "BlocksCorrupted".

          Tsz Wo Nicholas Sze added a comment -

          In which cases should a client (non-datanode client) call reportBadBlocks(...)? I am concerned about the security issue.

          Lohit Vijayarenu added a comment -

          +1 patch looks good. One small thing: the metric seems to report the number of corrupt blocks reported over time. Should it be changed to the number of corrupt blocks in the system at any point in time, possibly using MetricsIntValue? Also, namesystem.markBlockAsCorrupt already logs a message about this inside the corruptReplicas.addToCorruptReplicasMap function.
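
The distinction raised here, a count of reports accumulated over time versus the number of blocks corrupt right now, can be sketched as follows. `CorruptBlockStats` and its methods are hypothetical names invented for this example; the sketch deliberately does not use the Hadoop metrics classes, so it shows only the two semantics, not how the patch implements them.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical tracker for illustration only; not Hadoop code.
final class CorruptBlockStats {
    // Cumulative counter: grows with every report and never decreases.
    private long reportsSeen;
    // Point-in-time state: the set of blocks currently known to be corrupt.
    private final Set<Long> corruptNow = new HashSet<>();

    void reportCorrupt(long blockId) {
        reportsSeen++;
        corruptNow.add(blockId); // duplicate reports of one block do not inflate the gauge
    }

    void markRepaired(long blockId) {
        corruptNow.remove(blockId);
    }

    long getReportsSeen() { return reportsSeen; }

    int getCorruptNow() { return corruptNow.size(); }

    public static void main(String[] args) {
        CorruptBlockStats stats = new CorruptBlockStats();
        stats.reportCorrupt(1L);
        stats.reportCorrupt(1L); // same block reported twice
        stats.reportCorrupt(2L);
        stats.markRepaired(1L);
        System.out.println("reports over time: " + stats.getReportsSeen());
        System.out.println("corrupt right now: " + stats.getCorruptNow());
    }
}
```

An operator watching the cumulative figure cannot tell whether the cluster has recovered, whereas the point-in-time figure drops once a block is repaired or re-replicated.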

          Chris Douglas added a comment -

          Lohit is right about the logging; it's redundant since HADOOP-2065.

          Canceling this patch until we decide what to do with the metric.

          Sameer Paranjpye added a comment -

          I'd like to do more here. In addition to reporting corrupt blocks in the log, the Namenode should try to determine where the corruption occurred, i.e. on disk on the Datanode vs. elsewhere (network transmission or in memory on the client).

          Chris Douglas added a comment -

          Revised to include Lohit's feedback

          Lohit Vijayarenu added a comment -

          +1 Patch looks good

          Chris Douglas added a comment -

          Fixed findbugs warning

          Chris Douglas added a comment - edited

               [exec] -1 overall.
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
               [exec]                         Please justify why no tests are needed for this patch.
               [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.

          No tests are included, as the change is only to logging and adding a metric.

          [ edit - all dfs tests pass on my machine ]

          Chris Douglas added a comment -

          In the future, it would be helpful if we included not only where the error occurred, but also more details about the particular error. Created HADOOP-3510 to track this improvement.

          Chris Douglas added a comment -

          I just committed this.


            People

            • Assignee: Chris Douglas
            • Reporter: Robert Chansler