|
[
Permlink
| « Hide
]
dhruba borthakur added a comment - 28/May/08 01:01 AM
One simple approach is for the client to look for the string "CORRUPT" and then exit with either 0 or 1.
Yes, that's my current work around, but was hoping to have fsck itself include that.
Attached patch scans for the string "is CORRUPT". Have updated the test cases, so that tomorrow in someone changes the output, tests reflect them.
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12382904/HADOOP-3452-1.patch against trunk revision 659405. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2518/testReport/ This message is automatically generated. Scanning for strings is very, very brittle, and not a good way to do system management. The fact that it is the only way to verify that rpm and yum are working are design errors on the part of those tools, not an example of good programming to follow.
replace "Status: " + (isHealthy() ? "HEALTHY" : "CORRUPT")) The client could then look for NameNodeFsck.STATUS_HEALTHY, a string that would (currently) be "Status: HEALTHY" when all was well. -it would be better for the servlet response to include an HTTP error code (500?) on corruption, though that leads to two problems Looking at NameNodeFsck, I think it would be good to remove all knowledge of the fact it is running under a servlet from its code -that is "view", not "model. This would give us the option of using alternate ways to invoke filesystem checks than http requests. I like the idea of using as compile time constant NameNodeFsck.STATUS_HEALTHY variable to determine that return status of fsck. Also, it would be nice to not depend on Http error error codes.
Do you have another mechanism in mind (other than using a servlet) to run fsck? -Error codes could be something added as an option. They are reliable, except that proxy servers like to interfere.
-moving to constants should be easy. It would be extra impressive if the result was hidden in a <span> with a specific name, so you could XPATH/regexp on the response and look for <span class="status" >SUCCESS</span> through some Xpath like /span[@class="status"] >Do you have another mechanism in mind (other than using a servlet) to run fsck? Possibly. In SVN we have the code to bring up any of the name/data/tracker/jobtracker nodes and execute operations on the filesystem, though many more tests are needed. adding an FSCK check component would round things off. Right now, I'd consider generating an httpclient request with all the parameters and do a regexp search on the result. Thanks Steve, Dhruba for looking into this. If we consider the current implmentation, fsck was implemented as servlet so that we do not use RPC. The client opens up URL connection and reads the output from FSCK servlet. We might not be able to set new headers/response code at the end of fsck executiong since the fsck in itself take a while to complete. And the status is not know prior to executing fsck. Could you brief me as to how we could get the response other than scanning this stream?
I am attaching a patch with comments about this for now. Have another small line change for flushing output. This is a temporary solution, until we plan a different approach to do this.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12383268/HADOOP-3452-2.patch against trunk revision 662634. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2544/testReport/ This message is automatically generated. -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12383268/HADOOP-3452-2.patch against trunk revision 662913. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2552/console This message is automatically generated. looks like hudson had problem, the failure is not related to the patch. Re-submitting.
+1 code looks good. Is there a reason why we match the string "is CORRUPT" and not just "CORRUPT"? If we are doing this on purpose, then the comment in the DFSck.java should mention that "is CORRUPT" is the precise string that is being used.
Thanks for reviewing this Dhruba. The reason I scan for "is CORRUPT" is because, fsck now also reports CORRUPT blocks. And also, if a filename contains CORRUPT, we do not want to report filesystem to be corrupt. As I mentioned, this is a temporary fix. I have updated the patch comments as you suggested.
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12383356/HADOOP-3452-3.patch against trunk revision 663079. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2570/testReport/ This message is automatically generated. Assigning to whomever submitted a patch so as to better manage committing things that are ready for prime time.
I just committed this. Thanks, Lohit
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||