Hadoop HDFS / HDFS-457

better handling of volume failure in Data Node storage

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.20.203.0, 0.21.0
    • Component/s: datanode
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note:
      Datanode can continue if a volume for replica storage fails. Previously a datanode shut down if any volume failed.

    Description

      The current implementation shuts the DataNode down completely when one of the configured storage volumes fails.
      This is wasteful behavior: it decreases utilization (good storage becomes unavailable) and imposes extra load on the system (re-replication of the blocks from the good volumes). These problems will become even more prominent as we move to mixed (heterogeneous) clusters with many more volumes per Data Node.

      Attachments

      1. HDFS_457.patch (2 kB, Jeff Zhang)
      2. HDFS-457_20-append.patch (27 kB, Nicolas Spiegelberg)
      3. HDFS-457.patch (29 kB, Boris Shkolnik)
      4. HDFS-457-1.patch (29 kB, Boris Shkolnik)
      5. HDFS-457-2.patch (29 kB, Boris Shkolnik)
      6. HDFS-457-2.patch (29 kB, Boris Shkolnik)
      7. HDFS-457-2.patch (28 kB, Boris Shkolnik)
      8. HDFS-457-3.patch (29 kB, Boris Shkolnik)
      9. HDFS-457-y20.patch (15 kB, Konstantin Shvachko)
      10. jira.HDFS-457.branch-0.20-internal.patch (16 kB, Erik Steffl)
      11. TestFsck.zip (689 kB, Tsz Wo Nicholas Sze)

        Activity

          Boris Shkolnik added a comment -

          We should try to keep the DataNode alive even if one of its volumes is not accessible. When the DataNode handles the error, we can check what percentage of the volumes is down. If it is more than some predefined threshold, we should shut the node down; if not, we can keep it alive. In the latter case we need to do the following (see the sketch at the end of this comment):
          • remove the volume from the list of valid volumes
          • go over all the blocks and remove those that reside on this volume
          • immediately schedule a block report to update the namenode and start replication
          • optionally, monitor removed volumes (or periodically compare valid ones against the configured ones), and if one of them comes back to life (or on an operator command) try to restore it. (I don't know if this is possible and plausible, but it can be designed/done as a next step.)

          Affected classes and methods:

          • BlockReceiver constructor
          • BlockReceiver run()
          • BlockReceiver lastDataNodeRun()
          • FSDataset, FSVolume, FSVolumeSet, FSDir
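
          A minimal standalone sketch of these steps, assuming a simplified FSVolume/volumeMap model; the class names, threshold value, and method bodies are illustrative stand-ins, not the actual DataNode code:

            import java.util.ArrayList;
            import java.util.HashMap;
            import java.util.Iterator;
            import java.util.List;
            import java.util.Map;

            class VolumeFailureSketch {
              static class FSVolume { final String dir; FSVolume(String d) { dir = d; } }
              static class Block { final FSVolume volume; Block(FSVolume v) { volume = v; } }

              private final List<FSVolume> volumes = new ArrayList<>();
              private final Map<Long, Block> volumeMap = new HashMap<>();
              private final int configuredVolumes;
              // shut the node down if more than half the volumes are gone (illustrative value)
              private static final double FAILED_VOLUME_THRESHOLD = 0.5;

              VolumeFailureSketch(int configuredVolumes) { this.configuredVolumes = configuredVolumes; }

              /** Returns false if too many volumes failed and the node should shut down. */
              boolean handleVolumeFailure(FSVolume failed) {
                // 1. remove the volume from the list of valid volumes
                volumes.remove(failed);
                double failedFraction = 1.0 - (double) volumes.size() / configuredVolumes;
                if (failedFraction > FAILED_VOLUME_THRESHOLD) {
                  return false; // fatal: the caller shuts the DataNode down
                }
                // 2. go over all blocks and drop those residing on the failed volume
                for (Iterator<Block> it = volumeMap.values().iterator(); it.hasNext(); ) {
                  if (it.next().volume == failed) {
                    it.remove();
                  }
                }
                // 3. schedule an immediate block report so the NN starts re-replication
                scheduleBlockReport();
                return true;
              }

              void scheduleBlockReport() { /* placeholder for the real scheduling call */ }
            }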
          dhruba borthakur added a comment -

          Another big issue is "reporting". In the current code, an administrator sees the dead DataNode in the web UI and knows that he/she has to fix it.

          In your proposal, a datanode could stop using a portion of its disks, and it would be very worthwhile to report this to the administrator. Otherwise, a certain percentage of disk space could leak out of HDFS usage without anybody knowing about it.

          Boris Shkolnik added a comment -

          Good point.
          I will report to the NN using the errorReport interface. Currently, if the NN receives a DatanodeProtocol.DiskError message it removes the datanode.
          I will introduce a DatanodeProtocol.FatalDiskError, which should remove the datanode, while DiskError will just cause the NN to log a WARN message about the failed volume (see the sketch below).
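
          A sketch of that split on the NameNode side, with simplified stand-in constants and method signature (not the exact DatanodeProtocol definitions):

            import java.util.logging.Logger;

            class ErrorReportSketch {
              static final int DISK_ERROR = 1;       // some volumes failed, DN keeps running
              static final int FATAL_DISK_ERROR = 2; // no usable volumes left on the DN
              private static final Logger LOG = Logger.getLogger("NameNode");

              /** NameNode-side handling of a datanode error report. */
              void errorReport(String datanode, int errorCode, String msg) {
                if (errorCode == FATAL_DISK_ERROR) {
                  removeDatanode(datanode);          // node can no longer store replicas
                } else if (errorCode == DISK_ERROR) {
                  LOG.warning("Volume failure on " + datanode + ": " + msg);
                }
              }

              void removeDatanode(String datanode) { /* placeholder */ }
            }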

          Boris Shkolnik added a comment -

          We should also handle failed reads from the failed volumes.
          We can take care of it in FSDataset.validateBlockFile(), because it is called on every "read" operation (see the sketch below).
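
          A standalone sketch of guarding the read path this way; the method names mirror the discussion and the bodies are stubs, not the actual FSDataset code:

            import java.io.File;
            import java.io.IOException;

            class ReadPathSketch {
              File getBlockFile(long blockId) {
                File f = null;
                try {
                  f = validateBlockFile(blockId);  // called on every read
                } catch (IOException e) {
                  // a failed read may mean the whole volume is gone: re-check the disks
                  checkDataDirs();
                }
                return f;                          // null => block unavailable
              }

              File validateBlockFile(long blockId) throws IOException {
                throw new IOException("stub");     // placeholder for the real validation
              }

              void checkDataDirs() { /* placeholder for the volume re-scan */ }
            }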

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12413868/HDFS-457.patch
          against trunk revision 794953.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 7 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated 1 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 1 new Findbugs warnings.

          -1 release audit. The applied patch generated 285 release audit warnings (more than the trunk's current 282 warnings).

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/23/testReport/
          Release audit warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/23/artifact/trunk/current/releaseAuditDiffWarnings.txt
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/23/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/23/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/23/console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -
          • There is a second version of LOG.warn(..) which accepts an Exception object as a parameter. You may consider changing
            +      DataNode.LOG.warn("IOException in BlockReceiver constructor. Cause is " + 
            +          cause);
            

            to

            +      DataNode.LOG.warn("IOException in BlockReceiver constructor.", cause);
            

            and the other LOG.warn/info/err calls.

          • DataNode.checkDiskError() should not be public.
          • The return type of checkDirs() should be List<FSVolume>.
          • Looks like removed_vols.size() is equal to removed, or removed_vols == null when removed == 0. So we may drop removed.
          • We may omit the toString() calls in the following:
            +          DataNode.LOG.warn("Removing failed volume " + fsv.toString() + " - " + e.getLocalizedMessage());
            
            +          "volumes. List of current volumes: " +   toString());
            
          • It is hard to relate the name "keepRunning" to "checks how many valid storage volumes are there in the DataNode". How about changing it to something like "hasEnoughVolumes", "hasEnoughResource" or "isSystemHealthy"?
          • The second "f = null" is redundant since f == null if validateBlockFile throws an exception.
            +    File f = null;;
            +    try {
            +      f = validateBlockFile(b);
            +    } catch(IOException e) {
            +      f = null;
            +    }
            

            I suggest printing some messages in the catch block.

          • checkDataDir() changes volumeMap but is not synchronized on it (see the sketch after this list).
          • Also, would the new checkDataDir() implementation take a long time to execute? We may need some performance tests on this.
          • It is better not to change the existing constant values.
            -  final static int DISK_ERROR = 1;
            -  final static int INVALID_BLOCK = 2;
            +  final static int DISK_ERROR = 1; // there are still valid volumes on DN
            +  final static int FATAL_DISK_ERROR = 2; // no valid volumes left on DN
            +  final static int INVALID_BLOCK = 3;
            

            How about setting FATAL_DISK_ERROR to 3?
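
          A minimal sketch of the synchronization point raised above, with simplified stand-in types (not the actual FSDataset code):

            import java.util.HashMap;
            import java.util.Iterator;
            import java.util.Map;

            class CheckDataDirSketch {
              static class FSVolume { boolean failed; }
              static class Block { FSVolume volume; }

              private final Map<Long, Block> volumeMap = new HashMap<>();

              void removeBlocksOfFailedVolumes() {
                synchronized (volumeMap) {   // guard against concurrent readers/writers
                  for (Iterator<Block> it = volumeMap.values().iterator(); it.hasNext(); ) {
                    if (it.next().volume.failed) {
                      it.remove();
                    }
                  }
                }
              }
            }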

          Boris Shkolnik added a comment -

          Thanks for the comments.

          >> Also, would the new checkDataDir() implementation take a long time to execute? We may need some performance test on this.
          This function makes one loop over ALL the blocks. I believe this is what causes the concern. I've looked at it and ran some tests.
          1. This loop will happen only when there is an actual failure. I think some extra work is expected in this case.
          2. This extra work shouldn't be so bad. In my tests it took about 600 ms for 1500 blocks and about 800 ms for 5000. For huge nodes with hundreds of thousands of blocks it should still be tolerable.

          Boris Shkolnik added a comment -

          Fixed the Hudson errors and implemented the review suggestions.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12415899/HDFS-457-1.patch
          against trunk revision 802264.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 7 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/50/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/50/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/50/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/50/console

          This message is automatically generated.

          Edward Capriolo added a comment -

          Also, I just ran into this. One of my disks failed, however it was still available on the system as read-only; some of the directories produce I/O errors when read. Will this patch be able to deal with that condition?

          Boris Shkolnik added a comment -

          The disk is checked by DiskChecker.checkDir(), which checks mkdir, read and write. So in this case it should consider the volume as failed, because it cannot write to it (see the sketch below).
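
          A minimal standalone equivalent of the checks described (not Hadoop's actual DiskChecker implementation): the directory must exist or be creatable, and be readable and writable, so a disk remounted read-only fails the check:

            import java.io.File;
            import java.io.IOException;

            class DirCheckSketch {
              static void checkDir(File dir) throws IOException {
                if (!dir.mkdirs() && !dir.isDirectory()) {
                  throw new IOException("Cannot create directory: " + dir);
                }
                if (!dir.canRead()) {
                  throw new IOException("Directory is not readable: " + dir);
                }
                if (!dir.canWrite()) {
                  throw new IOException("Directory is not writable: " + dir);
                }
              }
            }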

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12416492/HDFS-457-2.patch
          against trunk revision 803973.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 7 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/66/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/66/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/66/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/66/console

          This message is automatically generated.

          Boris Shkolnik added a comment -

          Fixed two issues:
          1. Make sure the block report is sent before verification starts.
          2. Delete the files in the directory instead of the whole directory, because deleting a directory with locked files in it may fail on Windows or on NFS (see the sketch below).
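
          A standalone sketch of the second fix, assuming a simple recursive helper (not the actual patch code):

            import java.io.File;

            class DeleteFilesSketch {
              /** Deletes files one by one; returns how many could not be deleted yet. */
              static int clearDirectory(File dir) {
                int remaining = 0;
                File[] files = dir.listFiles();
                if (files == null) {
                  return 0;                         // not a directory, or an I/O error
                }
                for (File f : files) {
                  if (f.isDirectory()) {
                    remaining += clearDirectory(f); // recurse into subdirectories
                  } else if (!f.delete()) {
                    remaining++;                    // locked file: leave it for later
                  }
                }
                return remaining;
              }
            }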

          Boris Shkolnik added a comment -

          The failed test, testFsckMove, timed out.

          Tsz Wo Nicholas Sze added a comment -

          TestFsck.zip: downloaded the log file from the build and extracted the TestFsck log.

          The log is unexpectedly long (34 MB). Could you take a look, Boris?

          Boris Shkolnik added a comment -

          Re-uploading the patch to get it run by Hudson.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12416633/HDFS-457-2.patch
          against trunk revision 804127.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 7 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/69/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/69/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/69/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/69/console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -

          A few nits:

          • change numberOfVolumes() to private
          • keep handleDiskError(String errMsgr) private
          • "shutdonw" should be "shutdown"
            +    // to shutdonw DN completely and don't want NN to remove it.
            
          Mahadev konar added a comment -

          I would suggest fixing the spelling of "shutdown" as well.

          Mahadev konar added a comment -

          Sorry, ignore my comment; I didn't read Nicholas's comments.

          Boris Shkolnik added a comment -

          Implemented Nicholas's comments.

          Tsz Wo Nicholas Sze added a comment -

          +1, the patch looks good.

          The diff between HDFS-457-3.patch and HDFS-457-2.patch is minor, so I think we don't have to submit it to Hudson again; that saves some cycles.

          Tsz Wo Nicholas Sze added a comment -

          I have committed this. Thanks, Boris!

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #53 (see http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk/53/)
          HDFS-457. Do not shut down the datanode if some, but not all, volumes fail. Contributed by Boris Shkolnik.

          Tsz Wo Nicholas Sze added a comment -

          The committed patch incorrectly uses org.mortbay.log.Log. Created HDFS-612.

          Robert Chansler added a comment -

          Editorial pass over all release notes prior to publication of 0.21.

          Nicolas Spiegelberg added a comment -

          We should add this to 0.20-append for durability.

          Nicolas Spiegelberg added a comment -

          The new patch should apply cleanly to 0.20-append. Also imported the VolumeFailure unit test.

          Todd Lipcon added a comment -

          Removing 0.20-append tag - this isn't append-specific.

          Jeff Zhang added a comment -

          I noticed that only the BlockSender does checkDisk, so only a read operation will make the namenode aware that a disk has failed. Has anyone tried this patch? I suspect there are still some problems with it.

          In the test case TestDataNodeVolumeFailure, I made a small change and replaced the triggerFailure with a write operation. In this case the datanode does not know that one volume has failed, so only one replica is written successfully, and the block then waits a long time for another replica to be copied from another datanode.

          Jeff Zhang added a comment -

          Attaching a patch: do checkDiskError in BlockReceiver before allocating a volume for the new block. After checkDiskError it is guaranteed that the remaining volumes are all healthy, since the failed volumes have been removed (see the sketch below).
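
          A standalone sketch of that write-path ordering, with simplified stand-in types (the names follow the discussion, not the actual patch):

            import java.io.IOException;
            import java.util.ArrayList;
            import java.util.List;

            class WritePathSketch {
              static class FSVolume { boolean healthy = true; }

              private final List<FSVolume> volumes = new ArrayList<>();

              FSVolume allocateVolumeForNewBlock() throws IOException {
                checkDiskError();                  // prune failed volumes first
                if (volumes.isEmpty()) {
                  throw new IOException("No healthy volumes left");
                }
                return volumes.get(0);             // the real code picks volumes round-robin
              }

              void checkDiskError() {
                volumes.removeIf(v -> !v.healthy); // simplified re-scan of the volumes
              }
            }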

          Jeff Zhang added a comment -

          This is my first patch on HDFS; I'm not sure whether it is right to attach the patch here, or whether I need to create a new JIRA for this issue.

          Eli Collins added a comment -

          Hey Jeff,

          Nice catch. Please file a new jira.

          Thanks,
          Eli

          Jeff Zhang added a comment -

          Created JIRA HDFS-1273 for this issue and put the patch there.

          Uma Mahesh added a comment -

          Hi,
          checkDirs will throw a NullPointerException:

            ArrayList<FSVolume> removed_vols = null;
            ...
            } catch (...) {
              removed_vols = new ArrayList<FSVolume>();
            }
            ...
            removed_vols.size();

          If there is no exception, the last line will throw a NullPointerException.

          Eli Collins added a comment -

          Hey Uma,

          I think you're looking at an old patch; all the accesses of removedVols in checkDirs are first checked against null in trunk.

          Thanks,
          Eli

          Todd Lipcon added a comment -

          Marking this fixed in 0.20.203 as well, since it was committed to branch-0.20-security long ago, in time for that release.


            People

            • Assignee: Boris Shkolnik
            • Reporter: Boris Shkolnik
            • Votes: 0
            • Watchers: 16
