|
[
Permlink
| « Hide
]
Tsz Wo (Nicholas), SZE added a comment - 19/Sep/08 08:50 PM
Forgot to attach a file?
sorry, I did not realized that readLine moved to LineReader in util, deleted old patch, submitted new.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12390539/LineReader.java.patch against trunk revision 697306. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3338/console This message is automatically generated. another snafu, sorry, reuploaded the patch
What do I need to do to make this patch available (for Hudson to test)? I've marked it "in Progress", re-attached the patch. Now I don't even have "Submit patch", the only available workflow action is "resolve issue". This web interface is a little srewy.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12390601/HADOOP-4226.patch against trunk revision 697306. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3344/testReport/ This message is automatically generated. I think that failed test wasn't related to my changes at all, probably diffed against failing snapshot. Here it is again with minor adjustments
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12390667/HADOOP-4226.patch against trunk revision 697306. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3345/testReport/ This message is automatically generated. OK, let me respond to Hudson:
1. Tests: since this is not new code, I don't feel obligated to provide more tests. There already exist tests for TextInputFormat that are already checking this code; these tests pass. 2. Two failed tests: I've re-run these tests on my box, they pass (except TestLeaseRecovery2 throws exception closing an already closed file, but it still passes) on my system. What's up with that? If you could point me to a "clean" revision, I'll be happy to generate a patch against that. Otherwise I think we're OK here. I might make one more little change to the code to make it a little cleaner, but otherwise I think the patch is ready. I urge commiters to look at it and let me know what they think. I'd hate my time to be wasted, so please comment.
If you're going to change the code again, a review of the final version is the best use of everyone's time. Also: it's only been two days. This is a heavily used piece of code with a lot of corner cases, so this refactoring is risky, low-priority, and non-trivial to review. Please try to be patient.
TestTextInputFormat is testing LineRecordReader.LineReader, not o.a.h.util.LineReader, to which this patch applies.
Thanks for your input.
OK. BTW, are you planning to switch LRR to use util.LineReader?
Only as an optimization; for this little piece of code I think it's OK. It is (or going to be) heavily used, as you say.
Sorry, fixed. It was not obvious from looking at the code.
Done. Beating the hell out of it.
nod Fair enough. I think it makes the code slightly harder to read for a nominal performance gain that the JIT compiler should cover, but it's not significant.
It's not obvious that it should work that way at all, but backwards compatibility is a big deal for this class.
Patched against current trunk.
I've modified tests in TestInputFormat to do additional beating on readLine. One of the tests validates the \r\r sequence. Buffer size varies and at least a couple of times we should have a read in between of the two \r's. I've fixed all other nits that you commented on.
It probably should, as it turns out \r is a newline on older macs and commodore, I think. forgot to change DEFAULT_BUFFER_SIZE to private
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12391039/HADOOP-4226.patch against trunk revision 699517. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 Eclipse classpath. The patch retains Eclipse classpath integrity. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3384/testReport/ This message is automatically generated.
With a little more documentation, I think this is good to commit.
Sorry about that: started deleting inconsequential patches and got carried away.
I'm not a java programmer, so I could use some advice here on where would such summary go: javadoc comments (/*...*/) or the policy is to keep them short and the longer comments should go somewhere else?
If people were to re-use this class with different newline separators, they might find it useful to be able to change CR and LF. Maybe they should not be static and be public, so they can simply be assigned to?
Though we have some standards
It's pretty specialized code... I'd much rather lock it down until we have a use case. Having non-final, public static fields is generally considered bad practice. It makes the code unverifiable in isolation and admits some very difficult bugs (and races if two classes assign different values to CR and LF). added comments, CR,LF are private final static.
+1 Looks good
I just committed this. Thanks, Yuri Integrated in Hadoop-trunk #620 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/620/
. Refactor and document LineReader to make it more readily understandable. Contributed by Yuri Pradkin. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||