Hadoop HDFS / HDFS-528

Add ability for safemode to wait for a minimum number of live datanodes

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.22.0
    • Fix Version/s: 0.22.0, 1.1.1
    • Component/s: scripts
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      When starting up a fresh cluster programmatically, users often want to wait until DFS is "writable" before continuing in a script. "dfsadmin -safemode wait" doesn't quite work for this on a completely fresh cluster, since when there are 0 blocks on the system, 100% of them are accounted for before any DNs have reported.

      This JIRA is to add a command which waits until a certain number of DNs have reported as alive to the NN.

      1. h528_20120731_b-1.patch
        9 kB
        Tsz Wo Nicholas Sze
      2. hdfs-528-v4.txt
        18 kB
        Eli Collins
      3. hdfs-528.txt
        17 kB
        Todd Lipcon
      4. hdfs-528-v3.txt
        15 kB
        Todd Lipcon
      5. hdfs-528-v2.txt
        15 kB
        Todd Lipcon
      6. hdfs-528.txt
        8 kB
        Todd Lipcon

        Activity

        Todd Lipcon added a comment -

        This patch implements dfsadmin -waitDatanodes.

        For testing, I augmented TestHDFSCLI and also ran through the following on the command line:

        $ ./bin/hdfs dfsadmin -help waitDatanodes
        -waitDatanodes [timeout] [numNodes]: Wait for datanodes to report.
                        Waits for the specified number of Datanodes to report as alive to the
                        Namenode. A timeout can be specified in seconds - if the specified
                        number of nodes do not report within the timeout, returns a non-zero
                        exit code. If the number of nodes is not specified, defaults to the
                        value of dfs.replication.min. If the timeout is not specified, defaults
                        to 5 minutes.
        $ ./bin/hdfs dfsadmin -waitDatanodes
        SUCCESS: 1 datanodes have reported.
        $ echo $?
        0
        $ ./bin/hdfs dfsadmin -waitDatanodes 5 10
        Timed out waiting for 10 datanodes after 5 seconds
        $ echo $?
        1
        

        One example use case for this tool is in the EC2 scripts - even though the master has started up, the user needs to wait until at least a few DNs have reported before he or she can start writing to the cluster.
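A provisioning script could gate on the exit code shown in the session above. The following is a minimal sketch, assuming the -waitDatanodes command from this patch is applied; HDFS_BIN, wait_for_datanodes and start_when_writable are illustrative names, not something the patch defines.

```shell
# Sketch: gate a provisioning step on the proposed -waitDatanodes command.
# ASSUMPTION: the patch attached to this comment is applied. HDFS_BIN is a
# hypothetical variable naming the hdfs launcher.
HDFS_BIN="${HDFS_BIN:-./bin/hdfs}"

# wait_for_datanodes TIMEOUT_SECONDS NUM_NODES
# Returns 0 once NUM_NODES datanodes have reported, non-zero on timeout,
# mirroring the exit codes shown in the session above.
wait_for_datanodes() {
  "$HDFS_BIN" dfsadmin -waitDatanodes "$1" "$2"
}

# start_when_writable TIMEOUT_SECONDS NUM_NODES COMMAND [ARGS...]
# Runs COMMAND only if the wait succeeded; otherwise reports and fails.
start_when_writable() {
  swt_timeout="$1"; swt_nodes="$2"; shift 2
  if wait_for_datanodes "$swt_timeout" "$swt_nodes"; then
    "$@"
  else
    echo "gave up waiting for $swt_nodes datanodes" >&2
    return 1
  fi
}
```

For example, `start_when_writable 300 3 ./bin/hdfs dfs -put data.txt /data.txt` would copy the file only if three datanodes report within five minutes.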

        dhruba borthakur added a comment -

        Another generic approach is to specify the number of datanodes to wait for as a percentage of the total number of datanodes in a cluster. You would have to use the "includelist" feature of HDFS to list all the known datanodes (which most admins probably do). In fact, the NN may exit safemode only if the specified percentage of datanodes have checked in with the NN.

        Many times, when we restart our cluster, many datanodes fail to join the NN. However, the NN exits safemode because it finds at least one replica of every block. Then the NN starts replicating blocks. We have to manually enter safemode, manually look at the datanodes that have refused to join the NN, fix them and then exit safemode. Your proposed feature helps in elegantly handling this scenario.

        Todd Lipcon added a comment -

        Dhruba: would it make sense to simply extend this feature so that the "nodes" parameter can be parsed if it is in the format "N%"? Or are you suggesting modifying the automatic safemode exit behavior itself with a new conf?
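One way a combined "N" or "N%" argument could be handled is to normalise it to an absolute count before waiting. This is only a sketch of the idea raised here: resolve_node_target is a hypothetical helper, and both taking the percentage of a known-node total (for instance the include-list size) and rounding up are assumptions, not behaviour of any patch on this issue.

```shell
# resolve_node_target SPEC KNOWN_NODES
# Prints the absolute node count for SPEC, which is either "N" or "N%".
# ASSUMPTION: a percentage is taken of KNOWN_NODES and rounded up.
# Numbers with leading zeros are not handled.
resolve_node_target() {
  rnt_spec="$1"; rnt_known="$2"
  case "$rnt_spec" in
    *%) rnt_num="${rnt_spec%?}"; rnt_is_pct=1 ;;   # strip the trailing %
    *)  rnt_num="$rnt_spec";     rnt_is_pct=0 ;;
  esac
  # Reject anything that is not a plain non-negative integer.
  case "$rnt_num" in
    ""|*[!0-9]*) echo "bad node spec: $rnt_spec" >&2; return 2 ;;
  esac
  if [ "$rnt_is_pct" -eq 1 ]; then
    # Ceiling division, so 50% of 5 known nodes means 3 rather than 2.
    echo $(( (rnt_num * rnt_known + 99) / 100 ))
  else
    echo "$rnt_num"
  fi
}
```

Rounding up errs on the side of waiting for one more node rather than one fewer; a real implementation might reasonably choose differently.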

        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12415554/hdfs-528.txt
        against trunk revision 801057.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 13 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/42/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/42/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/42/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/42/console

        This message is automatically generated.

        dhruba borthakur added a comment -

        It would be nice if we can integrate it along with the safemode code in the NN. Then, no new command line utilities are needed.

        If one sets a non-zero value for the new config parameter, then the NN will not exit safemode until that many DNs have checked in.

        Todd Lipcon added a comment -

        Dhruba: would you prefer a percentage or an integer number to wait for? Or perhaps either, depending on the string format?

        I'm also worried a little bit about orthogonality between this and the "% of blocks to wait for reports from" feature. I assume we need an "AND" predicate of the two - but is there ever a case where we need an "OR"?

        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12415554/hdfs-528.txt
        against trunk revision 801500.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 13 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/43/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/43/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/43/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/43/console

        This message is automatically generated.

        Todd Lipcon added a comment -

        Here is a new patch which solves the same problem, but does so by adding another condition for safemode during startup.

        A new conf dfs.safemode.min.datanodes determines how many datanodes it should wait to be "alive" before exiting safemode.

        To test this I updated TestSafeMode, and also moved that test into o.a.h.hdfs.server.namenode so that it could access the package-protected getSafeModeTip() function in FSNamesystem.
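The combined startup condition can be pictured as a conjunction of the existing block-report check and the new live-datanode check. The following is a rough shell model for reasoning about the behaviour, not the NameNode code; the function name and argument order are made up for illustration.

```shell
# can_leave_safemode SAFE_BLOCKS TOTAL_BLOCKS THRESHOLD_PCT LIVE_NODES MIN_NODES
# Succeeds (exit 0) only when both conditions hold. Illustrative model only:
# the real check lives inside the NameNode and may differ in detail.
can_leave_safemode() {
  awk -v safe="$1" -v total="$2" -v pct="$3" -v live="$4" -v min="$5" '
    BEGIN {
      blocks_ok = (safe >= total * pct)   # existing block-report condition
      nodes_ok  = (live >= min)           # new minimum-live-datanodes condition
      exit ((blocks_ok && nodes_ok) ? 0 : 1)
    }'
}
```

With zero blocks the block condition holds trivially, which is the fresh-cluster case from the description: `can_leave_safemode 0 0 0.999 0 1` fails, while `can_leave_safemode 0 0 0.999 1 1` succeeds once one datanode is alive.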

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12415894/hdfs-528-v2.txt
        against trunk revision 802264.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 8 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/49/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/49/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/49/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/49/console

        This message is automatically generated.

        Todd Lipcon added a comment -

        The failing tests appear to be related to avro not being on the classpath

        Todd Lipcon added a comment -

        Found a bug in the turnoff tip formatting:

        2009-08-10 17:26:58,337 INFO org.apache.hadoop.hdfs.StateChange: STATE* Safe mode extension entered.
        The reported blocks 1 has reached the threshold 0.9990 of total blocks 1. Safe mode will be turned off automatically The number of live datanodes 1 has reached the minimum number 1. in 4 seconds.

        Will upload the fixed patch momentarily.

        Todd Lipcon added a comment -

        This updated patch fixes the issue.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12416143/hdfs-528-v3.txt
        against trunk revision 802972.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 8 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/59/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/59/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/59/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/59/console

        This message is automatically generated.

        Todd Lipcon added a comment -

        I'm seeing some odd behavior with this patch getting stuck in safe mode in our sandbox environment, so canceling this patch for now. I'll update again once I determine what the issue is.

        Steve Loughran added a comment -

        I do something like this in my workflow: block until the number of workers is above the required number. For one thing, that number could be >1, for another, 15, for a third, 3000. It is also roughly the same action regardless of whether what you are waiting for is the NN or the JT; it's the number of workers a particular bit of the cluster has to help.

        Rather than extend safemode, could you have something that just blocks off the NN until its worker count is above what your specific script wants? After all, it is your script that has needs, not the NN.

        Todd Lipcon added a comment -

        Steve: see Dhruba's comment above. In the case of an entirely empty DFS it makes sense that the NN should stay in safe mode until at least one DN shows up. This will help with some newbie errors where the DFS appears to be running but in fact nothing works right because there are no DNs. It also helps with the issue Dhruba mentioned involving unneeded replication during startup, and is more principled than just picking a large number for dfs.safemode.extension.

        I've got the bug I mentioned last night fixed and will upload a new patch after I do a bit of "burn in" in our sandbox.

        Todd Lipcon added a comment -

        Updated patch against trunk

        Todd Lipcon added a comment -

        This probably needs additional documentation in the HDFS user guide. But running through Hudson to check the patch.

        Todd Lipcon added a comment -

        somehow hudson missed this in the patch queue... retriggering.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12434623/hdfs-528.txt
        against trunk revision 905873.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 8 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/220/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/220/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/220/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/220/console

        This message is automatically generated.

        dhruba borthakur added a comment -

        It appears that this functionality can be achieved by some code outside the namenode.

        1. Start the NN with dfs.safemode.threshold.pct set to 1.5, i.e. the NN will never exit safemode by itself.
        2. Write a script that periodically invokes "bin/hadoop dfsadmin -report" and counts the number of datanodes that have checked in with the NN.
        3. The script can explicitly exit safemode whenever it desires.

        This approach allows different policies for when to exit safemode to be implemented outside the NN.

        If you agree, then we can make this JIRA expose a new API from the NN that exposes the safeBlockCount and totalBlockCount from the NN.
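Steps 2 and 3 of the external approach could look roughly like the sketch below. It assumes the report contains a summary line of the form "Datanodes available: 3 (4 total, 1 dead)"; that format, the HADOOP_BIN variable and both function names are assumptions for illustration, and the pattern would need adjusting for releases that print a different summary. Step 1 (the threshold above 1.0) is set separately in the NN configuration.

```shell
# ASSUMPTION: HADOOP_BIN is a hypothetical variable naming the launcher.
HADOOP_BIN="${HADOOP_BIN:-bin/hadoop}"

# count_live_datanodes: reads `dfsadmin -report` text on stdin and prints the
# live count. ASSUMES a summary line such as
#   Datanodes available: 3 (4 total, 1 dead)
# Prints nothing if no such line is found.
count_live_datanodes() {
  sed -n 's/^Datanodes available: \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# leave_safemode_when_ready MIN_NODES MAX_POLLS INTERVAL_SECONDS
# Polls the report; once MIN_NODES are live, asks the NN to leave safemode.
leave_safemode_when_ready() {
  lsr_polls=0
  while [ "$lsr_polls" -lt "$2" ]; do
    lsr_live="$("$HADOOP_BIN" dfsadmin -report | count_live_datanodes)"
    if [ "${lsr_live:-0}" -ge "$1" ]; then
      "$HADOOP_BIN" dfsadmin -safemode leave
      return $?
    fi
    lsr_polls=$((lsr_polls + 1))
    sleep "$3"
  done
  echo "only ${lsr_live:-0} datanodes after $2 polls" >&2
  return 1
}
```

A call such as `leave_safemode_when_ready 10 60 5` would poll every five seconds for up to five minutes. Any other exit policy (a percentage, a time window) only changes the condition inside the loop.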

        Todd Lipcon added a comment -

        I agree, but I don't think we should require operators to implement all of these special tools. The feature here is a pretty common request, I think. It's especially helpful for new users whose DN is broken - much better to have them get "SafemodeException: 0 datanodes reported" kind of thing compared to the bizarre looking "Couldn't find any datanode for block" kind of errors they get now (if we were to default this to 1).

        dhruba borthakur added a comment -

        If it is a question of "reporting error to the user", how about if we change the error message:

        if (#nodes in the cluster == 0)
        "There are no datanodes in the entire cluster"
        else
        "Couldn't find any datanode for block"

        Todd Lipcon added a comment -

        Hi Dhruba,

        It seems you agreed with the original premise of the issue, for the reason of avoiding replication storms early in NN startup. Certainly you can do this with an external tool, force it to always start in safemode, and manually (through the tool) kick it out of safemode when you're ready. But why not let Hadoop do this for us?

        The reason of the error message I see as a bonus on top of a feature which is generally useful, and which I don't think we should force every operator to write when it's so simple to integrate into the existing feature. FWIW this code has been shipping with CDH for 8 months with no issues.

        Konstantin Shvachko added a comment -

        SafeMode was designed to be agnostic of the data-node body on the cluster. There is no way for the NN to know how many data-nodes will report.
        I like Dhruba's idea of counting nodes in dfsadmin -report.
        Another way to verify that "DFS is writable" is to call getStats() and check that the number of under-replicated and missing blocks is 0.
        I don't mind introducing a dfsadmin command for that, we already have similar things in MiniDFSCluster.
        But I don't think this logic should be incorporated in SafeMode.
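
        The polling approach suggested here can be sketched as a small wait loop. This is only an illustration, not code from any patch on this issue: waitForDatanodes and the IntSupplier probe are hypothetical names, and a real probe would have to obtain the live datanode count itself (for example by counting nodes in the output of dfsadmin -report).

```java
import java.util.function.IntSupplier;

public class WaitForDatanodes {
    /**
     * Polls liveCount until it reports at least minNodes or the timeout expires.
     * Returns true if the minimum was reached, false on timeout or interruption.
     * liveCount is a hypothetical probe; this sketch does not talk to a namenode.
     */
    public static boolean waitForDatanodes(IntSupplier liveCount, int minNodes,
                                           long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (liveCount.getAsInt() >= minNodes) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in probe: pretends one more datanode has registered on each poll.
        int[] polls = {0};
        boolean ready = waitForDatanodes(() -> polls[0]++, 3, 5_000, 10);
        System.out.println(ready ? "writable" : "timed out");
    }
}
```

        A wrapper script would exit with a non-zero code when the method returns false, so that callers can tell a timeout from success.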

        Eli Collins added a comment -

        Why not just add a dfsadmin command (e.g. -waitwritable) that returns when the file system is writable? It seems like the number of datanodes is just a proxy for what the user really wants to know.

        Eli Collins added a comment -

        The config option (dfs.namenode.replqueue.threshold-pct) introduced in HDFS-1476 should address Dhruba's use case, though it doesn't tell you whether the FS is writable.

        dhruba borthakur added a comment -

        I am +1 on the config option proposed by Todd. I think it will help some newbie administrators in administering their cluster.

        On a related note, I would like to make the logic that makes the NN exit safemode pluggable. This is helpful when files are raided using HDFS RAID. I opened HDFS-1501.

        Eli Collins added a comment -

        +1

        We've found it to be a useful option. Agree that the logic should be pluggable.

        Eli Collins added a comment -

        Patch attached. It merges with trunk, and the code looks good to me. I don't see any places beyond registerDatanode and removeDatanode where checkMode needs to be called. I updated TestSafeMode to use the new MiniDFSCluster.Builder. I also diffed the TestSafeMode that this patch deletes against the TestSafeMode currently on trunk, to make sure we didn't miss anything that was added since the last patch was created.

        Eli Collins added a comment -

        Forgot to ask: Konstantin, do you object to the latest patch? I didn't completely follow your earlier comment, since safe mode is closely tied to whether datanodes have checked in (albeit at the block level). This option isn't cognizant of the total number of datanodes; it just specifies a lower bound so that the NN doesn't leave safemode prematurely (e.g. before any datanodes have checked in).
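
        The lower bound described above can be illustrated with a simplified model of the exit check. This is a sketch under assumptions, not the code from the patch: the class, method, and parameter names are hypothetical, and 0.999 is used only as an illustrative block threshold.

```java
public class SafeModeExitCheck {
    /**
     * Simplified model: the namenode stays in safemode while either too few
     * blocks have been reported or too few datanodes are alive.
     * All names here are hypothetical and chosen for illustration.
     */
    public static boolean mustStayInSafeMode(long totalBlocks, long reportedBlocks,
                                             double blockThresholdPct,
                                             int liveDatanodes, int minDatanodes) {
        long blocksNeeded = (long) (totalBlocks * blockThresholdPct);
        boolean blocksMissing = reportedBlocks < blocksNeeded;
        boolean tooFewDatanodes = liveDatanodes < minDatanodes;
        return blocksMissing || tooFewDatanodes;
    }

    public static void main(String[] args) {
        // Fresh cluster: zero blocks, so the block condition is met vacuously.
        System.out.println(mustStayInSafeMode(0, 0, 0.999, 0, 0)); // false: no lower bound, leaves at once
        System.out.println(mustStayInSafeMode(0, 0, 0.999, 0, 1)); // true: waits for one datanode
        System.out.println(mustStayInSafeMode(0, 0, 0.999, 1, 1)); // false: one datanode has checked in
    }
}
```

        With a lower bound of 0 the model reproduces the fresh-cluster behavior from the issue description: all zero blocks are accounted for before any datanode reports.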

        Eli Collins added a comment -

        Results on the latest patch. Running test-core.

             [exec] 
             [exec] -1 overall.  
             [exec] 
             [exec]     +1 @author.  The patch does not contain any @author tags.
             [exec] 
             [exec]     +1 tests included.  The patch appears to include 8 new or modified tests.
             [exec] 
             [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
             [exec] 
             [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
             [exec] 
             [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
             [exec] 
             [exec]     -1 release audit.  The applied patch generated 103 release audit warnings (more than the trunk's current 1 warnings).
             [exec] 
             [exec]     +1 system test framework.  The patch passed system test framework compile.
             [exec] 
        
        Eli Collins added a comment -

        If no one objects to the latest patch I'm going to commit it.

        dhruba borthakur added a comment -

        > If no one objects to the latest patch I'm going to commit it.

        +1, code looks good to me.

        Eli Collins added a comment -

        Thanks Dhruba. I've committed this.

        Tsz Wo Nicholas Sze added a comment -

        h528_20120731_b-1.patch: for branch-1.

        Tsz Wo Nicholas Sze added a comment -

        The original patch (hdfs-528-v4.txt) moved TestSafeMode from org.apache.hadoop.hdfs to org.apache.hadoop.hdfs.server.namenode. However, the test was moved back by HDFS-2817 "Combine the two TestSafeMode test suites". Thus I did not move the test in the branch-1 patch.

        Suresh Srinivas added a comment -

        +1 for the ported branch-1 patch.

        Quick question: in the original patch, I am not clear about the intent of calling safeMode.checkMode() when removing a datanode. When is safeMode not null at that point, and is the expectation that the namenode enters safemode again? Should there be a test for that?

        Tsz Wo Nicholas Sze added a comment -

        > ... In the original patch, I am not clear about the intent of calling safeMode.checkMode() when removing datanode. ...

        Here is one useful case: when safemode is in extension, removing a datanode will change it back to normal safemode.
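
        That case can be shown with a toy state model. It is a sketch under assumptions rather than the namenode's implementation: it tracks only the datanode lower bound (the block threshold is ignored), all names are hypothetical, and it assumes that nothing is re-evaluated once safemode has been left.

```java
public class SafeModeStateSketch {
    enum State { SAFE_MODE, EXTENSION, OFF }

    private State state = State.SAFE_MODE;
    private final int minDatanodes;
    private int liveDatanodes;

    public SafeModeStateSketch(int minDatanodes) {
        this.minDatanodes = minDatanodes;
    }

    public void registerDatanode() { liveDatanodes++; checkMode(); }

    public void removeDatanode() { liveDatanodes--; checkMode(); }

    /** Called when the extension period has run out without interruption. */
    public void extensionElapsed() {
        if (state == State.EXTENSION) {
            state = State.OFF;
        }
    }

    public String state() { return state.name(); }

    /** Re-evaluates the mode after every change in the live datanode count. */
    private void checkMode() {
        if (state == State.OFF) {
            return; // assumption: once safemode is left, this check no longer applies
        }
        state = (liveDatanodes < minDatanodes) ? State.SAFE_MODE : State.EXTENSION;
    }

    public static void main(String[] args) {
        SafeModeStateSketch nn = new SafeModeStateSketch(2);
        nn.registerDatanode();
        nn.registerDatanode();
        System.out.println(nn.state()); // EXTENSION: lower bound reached
        nn.removeDatanode();
        System.out.println(nn.state()); // SAFE_MODE: dropped below the bound during extension
    }
}
```

        Without the check in removeDatanode, the model would stay in extension and leave safemode with fewer datanodes than the configured minimum.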

        Tsz Wo Nicholas Sze added a comment -

        I have committed the branch-1 patch.

        Matt Foley added a comment -

        Merged to branch-1.1.


          People

          • Assignee:
            Todd Lipcon
            Reporter:
            Todd Lipcon
          • Votes:
            0
            Watchers:
            11
