|
[
Permlink
| « Hide
]
Enis Soztutar added a comment - 16/Nov/07 10:06 AM
Bringing up this old issue, I am really uncomfortable with the ids handled as strings. The attached patch is a proposal for the new XXXId classes that we may use. I intend to continue the progress once we have grounded the need and the design for this issue. Any thoughts ?
This 5k+ patch introduces three new classes TaskId, JobId and TaskInProgressId, and changes every occurrence of th e string counterparts to use objects.
local tests pass with some unstability for contrib. Lets see what the hudson say.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12371240/id_v1.patch against trunk revision r602002. @author +1. The patch does not contain any @author tags. javadoc -1. The javadoc tool appears to have generated messages. javac +1. The applied patch does not generate any new compiler warnings. findbugs -1. The patch appears to introduce 5 new Findbugs warnings. core tests -1. The patch failed core unit tests. contrib tests -1. The patch failed contrib unit tests. Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1296/testReport/ This message is automatically generated. Some comments:
This patch addresses Doug's comments.
Changes include :
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12371871/id_v2.patch against trunk revision r605220. @author +1. The patch does not contain any @author tags. javadoc -1. The javadoc tool appears to have generated messages. javac +1. The applied patch does not generate any new compiler warnings. findbugs -1. The patch appears to introduce 1 new Findbugs warnings. core tests +1. The patch passed core unit tests. contrib tests +1. The patch passed contrib unit tests. Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1382/testReport/ This message is automatically generated. Patch with fixed javadoc warnings and findbugs warnings.
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12371930/id_v3.patch against trunk revision r605457. @author +1. The patch does not contain any @author tags. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new compiler warnings. findbugs +1. The patch does not introduce any new Findbugs warnings. core tests +1. The patch passed core unit tests. contrib tests +1. The patch passed contrib unit tests. Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1392/testReport/ This message is automatically generated. Patch updated to meet current trunk. The functions are consistently named as
get{Job|Task|TaskInProgress}ID() with some minor exceptions, where the change would cause more trouble than the benefits. -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12372507/id_v4.patch against trunk revision . @author +1. The patch does not contain any @author tags. javadoc -1. The javadoc tool appears to have generated messages. javac +1. The applied patch does not generate any new compiler warnings. findbugs -1. The patch appears to cause Findbugs to fail. core tests -1. The patch failed core unit tests. contrib tests -1. The patch failed contrib unit tests. Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1483/testReport/ This message is automatically generated. Retrying hudson tests, which failed due to HADOOP-2240.
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12372507/id_v4.patch against trunk revision . @author +1. The patch does not contain any @author tags. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new compiler warnings. findbugs +1. The patch does not introduce any new Findbugs warnings. core tests +1. The patch passed core unit tests. contrib tests +1. The patch passed contrib unit tests. Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1491/testReport/ This message is automatically generated. Feature freeze is soon, and we do not want this huge patch to cause trouble. Moving this to be fixed at 0.17.
I could not find time to update the patch, moving this to 0.18.
Enis, could you please update the patch. This seems like a good time to get it in. Thanks!
I think so! I'll update the patch as soon as I can.
Patch updated to trunk, No major changes from the previous one.
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12380681/id_v5.patch against trunk revision 645773. @author +1. The patch does not contain any @author tags. tests included +1. The patch appears to include 12 new or modified tests. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new javac compiler warnings. release audit +1. The applied patch does not generate any new release audit warnings. findbugs +1. The patch does not introduce any new Findbugs warnings. core tests +1. The patch passed core unit tests. contrib tests +1. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2301/testReport/ This message is automatically generated. Committers, I kindly request that the patch is reviewed ASAP, so that we can include this with minumum overhead. Thanks, smile.
I'd propose that we change the names of the id classes toward a less confusing system:
JobId I think it is better not to import each of the values of enumerations, so that you always have the type name associated with the value. Thanks for all of the hard work on this Enis! It also changes the signature on some public methods without deprecation. Is there a way to detect those?
here is a tar ball generated by jdiff that lists the api changes by this patch. unfortunately, it really needs to be fixed or we will break a bunch of applications.
Owen, thanks for the jdiff, I haven't though of that.
I checked the differences in the public APIs :
The name sets {jobid, taskinprogressid, taskid} and {jobid, taskid, taskattemptid} is already confusing (sometimes even I get confused. ) I think the ultimate naming should be the latter, which makes more sense. However we have been using and exposing both naming schema. For example the task names start with tip_ or task_, we have JobClient#killTask(taskid) which accepts a task attempt id, and such, on the other hand JobHistory uses taskID-taskAttemptID, so the current situation is already inconsistent. Enis, I agree with Owen - I think it would be great if this patch can take the first step and turn the tide towards the right nomenclature. Thanks again for your patience...
+1 for Enis's take on changing public classes which aren't really part of the Map-Reduce's User Interface... we can change {Map|Reduce}TaskStatus etc. without any harm/fear.
I ran Sort benchmark on 500 nodes with the patch ( id_v5.patch) and see these results:
0.17.0 build: 85.3 mins Which matches what I've seen in the profiler on the JobTracker, which is that we spend a lot of time looking up maps from string to X, for lots of different X's.
A 5% improvement, wow !
Thanks to the help from Arun, I've updated the patch. This version is a first attempt to build the naming convention. As a side note, we keep tipId and taskId variable names, in some places. It will be good if they are also refactored, but I prefer to delay this in a later issue. Thanks for the benchmark, and the reviews. -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12381106/id_v6.patch against trunk revision 645773. @author +1. The patch does not contain any @author tags. tests included +1. The patch appears to include 12 new or modified tests. patch -1. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2343/console This message is automatically generated. I removed the spurious changes to the TaskRunner import statements that were conflicting.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12381151/id_v7.patch against trunk revision 645773. @author +1. The patch does not contain any @author tags. tests included +1. The patch appears to include 12 new or modified tests. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new javac compiler warnings. release audit +1. The applied patch does not generate any new release audit warnings. findbugs +1. The patch does not introduce any new Findbugs warnings. core tests -1. The patch failed core unit tests. contrib tests +1. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2346/testReport/ This message is automatically generated. Resubmitting the patch. The test failures had something to do with MapReduce
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12381151/id_v7.patch against trunk revision 645773. @author +1. The patch does not contain any @author tags. tests included +1. The patch appears to include 12 new or modified tests. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new javac compiler warnings. release audit +1. The applied patch does not generate any new release audit warnings. findbugs +1. The patch does not introduce any new Findbugs warnings. core tests -1. The patch failed core unit tests. contrib tests +1. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2349/testReport/ This message is automatically generated. Updated version, introduces static xxxID.getxxxIDsPattern(...) methods to obtain regex patterns which match ID strings. Patch fixes the failed tests, which hand-crafted these patterns.
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12381173/id_v8.patch against trunk revision 645773. @author +1. The patch does not contain any @author tags. tests included +1. The patch appears to include 15 new or modified tests. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new javac compiler warnings. release audit +1. The applied patch does not generate any new release audit warnings. findbugs +1. The patch does not introduce any new Findbugs warnings. core tests +1. The patch passed core unit tests. contrib tests +1. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2351/testReport/ This message is automatically generated. I just commited this. Thanks, Enis!
Changed the names of the classes in the release note, I will also fix and recommit CHANGES.txt.
Integrated in Hadoop-trunk #476 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/476/
For all the discussion about deprectatoin, etc, the patch for this issue has broken trunk! It no longer compiles on Hudson. The contrib eclipse plugin relied on an API that you considered non-Public (JobStatus and perhaps others). Given we have no well defined public API yet, I don't think you can assume that classes declared "public" aren't actually public. At a minimum, the release notes should indicated that methods were removed and not deprecated.
Build failure:
Can we get this fixed today? I opened
Integrated in Hadoop-trunk #479 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/479/
Integrated in Hadoop-trunk #581 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/581/
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||