Details

    • Type: Sub-task Sub-task
    • Status: Closed
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: documentation
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      There are lots of implementation classes in org.apache.hadoop.mapred which makes it difficult to see the user-level MapReduce API classes in the Javadoc. (See http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/package-summary.html for example.) By marking these implementation classes with the InterfaceAudience.Private annotation we can exclude them from user Javadoc (using HADOOP-6658).

      Later work will move the implementation classes into o.a.h.mapreduce.server and related packages (see MAPREDUCE-561), but applying the annotations is a good first step.

      1. M1623-1.patch
        124 kB
        Chris Douglas
      2. MAPREDUCE-1623.patch
        348 kB
        Arun C Murthy
      3. MAPREDUCE-1623.patch
        348 kB
        Tom White
      4. MAPREDUCE-1623.patch
        348 kB
        Tom White
      5. MAPREDUCE-1623.patch
        143 kB
        Tom White
      6. MAPREDUCE-1623.patch
        142 kB
        Tom White
      7. MAPREDUCE-1623.patch
        126 kB
        Tom White
      8. MAPREDUCE-1623.patch
        220 kB
        Tom White
      9. MAPREDUCE-1623.patch
        113 kB
        Tom White
      10. MAPREDUCE-1623.patch
        113 kB
        Tom White
      11. MAPREDUCE-1623.patch
        37 kB
        Tom White
      12. MAPREDUCE-1623.patch
        31 kB
        Tom White
      13. MAPREDUCE-1623.patch
        25 kB
        Tom White

        Issue Links

          Activity

          Hide
          Tom White added a comment -

          A patch that marks internal MapReduce implementation classes in the mapred package as Private Unstable, and protocol interfaces as Private Stable. Feedback welcome.

          Show
          Tom White added a comment - A patch that marks internal MapReduce implementation classes in the mapred package as Private Unstable, and protocol interfaces as Private Stable. Feedback welcome.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12439732/MAPREDUCE-1623.patch
          against trunk revision 927187.

          +1 @author. The patch does not contain any @author tags.

          +0 tests included. The patch appears to be a documentation patch that doesn't require tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/59/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/59/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/59/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/59/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12439732/MAPREDUCE-1623.patch against trunk revision 927187. +1 @author. The patch does not contain any @author tags. +0 tests included. The patch appears to be a documentation patch that doesn't require tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/59/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/59/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/59/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/59/console This message is automatically generated.
          Hide
          Tom White added a comment -

          New patch which adds some more annotations, and includes a few package-info.java files with the @Private annotation for whole packages which are private (these will be excluded from javadoc and JDiff once MAPREDUCE-1650 is done).

          Show
          Tom White added a comment - New patch which adds some more annotations, and includes a few package-info.java files with the @Private annotation for whole packages which are private (these will be excluded from javadoc and JDiff once MAPREDUCE-1650 is done).
          Hide
          Amareshwari Sriramadasu added a comment -

          After going through the package-summary and the attached patch, I have some questions/comments.

          1. I think org.apache.hadoop.mapred.IsolationRunner is public. Shall we add an unstable annotation for the same?
          2. Is org.apache.hadoop.mapred.SkipBadRecords private or public? We should add the appropriate annotation.
          3. I think org.apache.hadoop.mapred.TaskLogAppender is private. Shall we add a private annotation for the same?
          4. Shall we annotate org.apache.hadoop.mapred.Utils.java as private ?
          Show
          Amareshwari Sriramadasu added a comment - After going through the package-summary and the attached patch, I have some questions/comments. I think org.apache.hadoop.mapred.IsolationRunner is public. Shall we add an unstable annotation for the same? Is org.apache.hadoop.mapred.SkipBadRecords private or public? We should add the appropriate annotation. I think org.apache.hadoop.mapred.TaskLogAppender is private. Shall we add a private annotation for the same? Shall we annotate org.apache.hadoop.mapred.Utils.java as private ?
          Hide
          Tom White added a comment -

          Thanks for reviewing the changes, Amareshwari.

          > I think org.apache.hadoop.mapred.IsolationRunner is public. Shall we add an unstable annotation for the same?

          Added.

          > Is org.apache.hadoop.mapred.SkipBadRecords private or public? We should add the appropriate annotation.

          I think this is a public API. I notice that it is not deprecated since there is no replacement in the new API. Is this an oversight that needs addressing?

          > I think org.apache.hadoop.mapred.TaskLogAppender is private. Shall we add a private annotation for the same?

          I wasn't sure about this one. I left it public since it is specified in log4j.properties, so to that extent it is publicly exposed. I've added an Unstable annotation to indicate that its name/interface may change. Does this sound reasonable?

          > Shall we annotate org.apache.hadoop.mapred.Utils.java as private ?

          I think these are intended to be public, no? OutputLogFilter is deprecated in favour of mapred.Utils.

          I've also added licenses to the new files.

          Show
          Tom White added a comment - Thanks for reviewing the changes, Amareshwari. > I think org.apache.hadoop.mapred.IsolationRunner is public. Shall we add an unstable annotation for the same? Added. > Is org.apache.hadoop.mapred.SkipBadRecords private or public? We should add the appropriate annotation. I think this is a public API. I notice that it is not deprecated since there is no replacement in the new API. Is this an oversight that needs addressing? > I think org.apache.hadoop.mapred.TaskLogAppender is private. Shall we add a private annotation for the same? I wasn't sure about this one. I left it public since it is specified in log4j.properties, so to that extent it is publicly exposed. I've added an Unstable annotation to indicate that its name/interface may change. Does this sound reasonable? > Shall we annotate org.apache.hadoop.mapred.Utils.java as private ? I think these are intended to be public, no? OutputLogFilter is deprecated in favour of mapred.Utils. I've also added licenses to the new files.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12440365/MAPREDUCE-1623.patch
          against trunk revision 929712.

          +1 @author. The patch does not contain any @author tags.

          +0 tests included. The patch appears to be a documentation patch that doesn't require tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/80/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/80/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/80/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/80/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12440365/MAPREDUCE-1623.patch against trunk revision 929712. +1 @author. The patch does not contain any @author tags. +0 tests included. The patch appears to be a documentation patch that doesn't require tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/80/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/80/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/80/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/80/console This message is automatically generated.
          Hide
          Tom White added a comment -

          Here's a new patch which covers all the core packages in MapReduce. The details are in the patch, but the following table gives a summary:

          Package org.apache.hadoop. Visibility & Stability Notes
          filecache deprecated  
          mapred deprecated or private unstable (for implementation classes)  
          mapred.lib deprecated  
          mapred.pipes public stable  
          mapred.tools public stable  
          mapreduce public evolving  
          mapreduce.filecache deprecated/private  
          mapreduce.jobhistory private unstable  
          mapreduce.protocol private stable  
          mapreduce.security private unstable/public evolving  
          mapreduce.security.token* private unstable  
          mapreduce.server.jobtracker private unstable  
          mapreduce.server.tasktracker private unstable  
          mapreduce.split private unstable  
          mapreduce.task* private unstable  
          mapreduce.tools public stable  
          mapreduce.util private unstable All the util classes are for the framework, so shouldn't be public
          util deprecated  

          With this patch, MAPREDUCE-1650, and MAPREDUCE-1625, the public Javadoc looks like this: http://people.apache.org/~tomwhite/MAPREDUCE-1623/api/

          Show
          Tom White added a comment - Here's a new patch which covers all the core packages in MapReduce. The details are in the patch, but the following table gives a summary: Package org.apache.hadoop. Visibility & Stability Notes filecache deprecated   mapred deprecated or private unstable (for implementation classes)   mapred.lib deprecated   mapred.pipes public stable   mapred.tools public stable   mapreduce public evolving   mapreduce.filecache deprecated/private   mapreduce.jobhistory private unstable   mapreduce.protocol private stable   mapreduce.security private unstable/public evolving   mapreduce.security.token* private unstable   mapreduce.server.jobtracker private unstable   mapreduce.server.tasktracker private unstable   mapreduce.split private unstable   mapreduce.task* private unstable   mapreduce.tools public stable   mapreduce.util private unstable All the util classes are for the framework, so shouldn't be public util deprecated   With this patch, MAPREDUCE-1650 , and MAPREDUCE-1625 , the public Javadoc looks like this: http://people.apache.org/~tomwhite/MAPREDUCE-1623/api/
          Hide
          Tom White added a comment -

          The classes in the mapreduce.lib packages should be marked public evolving too.

          Show
          Tom White added a comment - The classes in the mapreduce.lib packages should be marked public evolving too.
          Hide
          Chris Douglas added a comment -
          • The contents of the jobhistory package can probably be public evolving.
          • Should the o.a.h.util package also be marked as Private?
          • Should the mapreduce.filecache package also be marked as Private, if javadoc were returned to o.a.h.filecache.DistributedCache?
          • The warnings in javadoc are redundant with the annotations, particulary given MAPREDUCE-1650 ("This class is internal only and not intended for users!!", <FRAMEWORK-USE-ONLY>, "This is NOT a public interface!", etc.). Attached an updated patch removing them.

          Thanks, Tom.

          Show
          Chris Douglas added a comment - The contents of the jobhistory package can probably be public evolving. Should the o.a.h.util package also be marked as Private? Should the mapreduce.filecache package also be marked as Private, if javadoc were returned to o.a.h.filecache.DistributedCache ? The warnings in javadoc are redundant with the annotations, particulary given MAPREDUCE-1650 ("This class is internal only and not intended for users!!", <FRAMEWORK-USE-ONLY> , "This is NOT a public interface!", etc.). Attached an updated patch removing them. Thanks, Tom.
          Hide
          Amareshwari Sriramadasu added a comment -

          The contents of the jobhistory package can probably be public evolving.

          Not really. All the classes in jobhistory package are not public, for example: JobHistory.java. I think this package should be private and the necessary client side functionality should be available to public from org.apache.hadoop.mapreduce package (does this need a separate jira?).

          Should the o.a.h.util package also be marked as Private? Should the mapreduce.filecache package also be marked as Private, if javadoc were returned to o.a.h.filecache.DistributedCache?

          I agree both o.a.h.util and mapreduce.filecache packages are private.

          Show
          Amareshwari Sriramadasu added a comment - The contents of the jobhistory package can probably be public evolving. Not really. All the classes in jobhistory package are not public, for example: JobHistory.java. I think this package should be private and the necessary client side functionality should be available to public from org.apache.hadoop.mapreduce package (does this need a separate jira?). Should the o.a.h.util package also be marked as Private? Should the mapreduce.filecache package also be marked as Private, if javadoc were returned to o.a.h.filecache.DistributedCache? I agree both o.a.h.util and mapreduce.filecache packages are private.
          Hide
          Chris Douglas added a comment -

          Not really. All the classes in jobhistory package are not public, for example: JobHistory.java. I think this package should be private and the necessary client side functionality should be available to public from org.apache.hadoop.mapreduce package (does this need a separate jira?).

          If users want to write tools processing the job history, couldn't they use the classes in this package?

          Show
          Chris Douglas added a comment - Not really. All the classes in jobhistory package are not public, for example: JobHistory.java. I think this package should be private and the necessary client side functionality should be available to public from org.apache.hadoop.mapreduce package (does this need a separate jira?). If users want to write tools processing the job history, couldn't they use the classes in this package?
          Hide
          Tom White added a comment -

          Thanks for the review, Chris.

          * The contents of the jobhistory package can probably be public evolving.

          Agree with Amareshwari that not all of the package is public, and that we probably need a public API to it, but it's not clear what it is yet. By marking as public evolving we can evolve it to make things private that should be private. The other way works too - mark it private unstable or evolving and make public the parts that need to be made public in a later release. I can go either way.

          * Should the o.a.h.util package also be marked as Private?

          Yes, given that the classes moved to o.a.h.mapreduce.util, which is private.

          * Should the mapreduce.filecache package also be marked as Private, if javadoc were returned to o.a.h.filecache.DistributedCache?

          I'm a bit confused about what happened here. In 0.20 there was only o.a.h.filecache.DistributedCache, which was actually in Core. It's been deprecated in trunk in favour of o.a.h.filecache.mapreduce.DistributedCache, but this has been deprecated too - in favour of methods on Job. So, yes, mapreduce.filecache package should be Private, and the deprecation message on o.a.h.filecache.DistributedCache should direct users to Job.

          * The warnings in javadoc are redundant with the annotations, particulary given MAPREDUCE-1650 ("This class is internal only and not intended for users!!", <FRAMEWORK-USE-ONLY>, "This is NOT a public interface!", etc.). Attached an updated patch removing them.

          Thanks for updating the patch with this. Is this the only change you made? (It's hard to see with such a big patch.)

          I'll create a new patch with the above changes.

          Show
          Tom White added a comment - Thanks for the review, Chris. * The contents of the jobhistory package can probably be public evolving. Agree with Amareshwari that not all of the package is public, and that we probably need a public API to it, but it's not clear what it is yet. By marking as public evolving we can evolve it to make things private that should be private. The other way works too - mark it private unstable or evolving and make public the parts that need to be made public in a later release. I can go either way. * Should the o.a.h.util package also be marked as Private? Yes, given that the classes moved to o.a.h.mapreduce.util , which is private. * Should the mapreduce.filecache package also be marked as Private, if javadoc were returned to o.a.h.filecache.DistributedCache ? I'm a bit confused about what happened here. In 0.20 there was only o.a.h.filecache.DistributedCache , which was actually in Core. It's been deprecated in trunk in favour of o.a.h.filecache.mapreduce.DistributedCache , but this has been deprecated too - in favour of methods on Job . So, yes, mapreduce.filecache package should be Private, and the deprecation message on o.a.h.filecache.DistributedCache should direct users to Job . * The warnings in javadoc are redundant with the annotations, particulary given MAPREDUCE-1650 ("This class is internal only and not intended for users!!", <FRAMEWORK-USE-ONLY> , "This is NOT a public interface!", etc.). Attached an updated patch removing them. Thanks for updating the patch with this. Is this the only change you made? (It's hard to see with such a big patch.) I'll create a new patch with the above changes.
          Hide
          Amareshwari Sriramadasu added a comment -

          Chris, I wanted to say that the whole package is not public, though some of the classes in the package are public. For example, JobHistory.java has framework code, which creates/deletes job history files, manages history files and etc.

          Show
          Amareshwari Sriramadasu added a comment - Chris, I wanted to say that the whole package is not public, though some of the classes in the package are public. For example, JobHistory.java has framework code, which creates/deletes job history files, manages history files and etc.
          Hide
          Chris Douglas added a comment -

          Agree with Amareshwari that not all of the package is public, and that we probably need a public API to it, but it's not clear what it is yet. By marking as public evolving we can evolve it to make things private that should be private. The other way works too - mark it private unstable or evolving and make public the parts that need to be made public in a later release. I can go either way.

          Chris, I wanted to say that the whole package is not public, though some of the classes in the package are public. For example, JobHistory.java has framework code, which creates/deletes job history files, manages history files and etc.

          OK, that's fair. There may be a few tool authors who would be interested in a stable API here, but I think Amareshwari is right, and whatever form that takes belongs with the rest of the client APIs.

          I'm a bit confused about what happened here. In 0.20 there was only o.a.h.filecache.DistributedCache, which was actually in Core. It's been deprecated in trunk in favour of o.a.h.filecache.mapreduce.DistributedCache, but this has been deprecated too - in favour of methods on Job. So, yes, mapreduce.filecache package should be Private, and the deprecation message on o.a.h.filecache.DistributedCache should direct users to Job.

          I found this confusing, also. It looks like most of the javadoc for mapreduce.filecache can be moved back to o.a.h.filecache.DistributedCache, then the former package can be marked as Private.

          Thanks for updating the patch with this. Is this the only change you made? (It's hard to see with such a big patch.)

          Yes, those should be the only changes.

          Show
          Chris Douglas added a comment - Agree with Amareshwari that not all of the package is public, and that we probably need a public API to it, but it's not clear what it is yet. By marking as public evolving we can evolve it to make things private that should be private. The other way works too - mark it private unstable or evolving and make public the parts that need to be made public in a later release. I can go either way. Chris, I wanted to say that the whole package is not public, though some of the classes in the package are public. For example, JobHistory.java has framework code, which creates/deletes job history files, manages history files and etc. OK, that's fair. There may be a few tool authors who would be interested in a stable API here, but I think Amareshwari is right, and whatever form that takes belongs with the rest of the client APIs. I'm a bit confused about what happened here. In 0.20 there was only o.a.h.filecache.DistributedCache, which was actually in Core. It's been deprecated in trunk in favour of o.a.h.filecache.mapreduce.DistributedCache, but this has been deprecated too - in favour of methods on Job. So, yes, mapreduce.filecache package should be Private, and the deprecation message on o.a.h.filecache.DistributedCache should direct users to Job. I found this confusing, also. It looks like most of the javadoc for mapreduce.filecache can be moved back to o.a.h.filecache.DistributedCache , then the former package can be marked as Private. Thanks for updating the patch with this. Is this the only change you made? (It's hard to see with such a big patch.) Yes, those should be the only changes.
          Hide
          Amareshwari Sriramadasu added a comment -

          I'm a bit confused about what happened here. In 0.20 there was only o.a.h.filecache.DistributedCache, which was actually in Core. It's been deprecated in trunk in favour of o.a.h.filecache.mapreduce.DistributedCache, but this has been deprecated too - in favour of methods on Job.

          Let me clear some confusion here. As Tom said, In 0.20 there was only o.a.h.filecache.DistributedCache, which was actually in Core. It's been deprecated in trunk in favour of o.a.h.filecache.mapreduce.DistributedCache, but this has been deprecated too - in favour of methods on Job. Here, o.a.h.mapreduce.filecache is changed to have only framework code and client facing apis are moved to Job.

          So, yes, mapreduce.filecache package should be Private, and the deprecation message on o.a.h.filecache.DistributedCache should direct users to Job.

          +1

          It looks like most of the javadoc for mapreduce.filecache can be moved back to o.a.h.filecache.DistributedCache.

          Chris, the use of new api for distributed cache is documented in forrest. But, I think we can move the javadoc for the use of old api to o.a.h.filecache.DistributedCache.

          Show
          Amareshwari Sriramadasu added a comment - I'm a bit confused about what happened here. In 0.20 there was only o.a.h.filecache.DistributedCache, which was actually in Core. It's been deprecated in trunk in favour of o.a.h.filecache.mapreduce.DistributedCache, but this has been deprecated too - in favour of methods on Job. Let me clear some confusion here. As Tom said, In 0.20 there was only o.a.h.filecache.DistributedCache, which was actually in Core. It's been deprecated in trunk in favour of o.a.h.filecache.mapreduce.DistributedCache, but this has been deprecated too - in favour of methods on Job. Here, o.a.h.mapreduce.filecache is changed to have only framework code and client facing apis are moved to Job. So, yes, mapreduce.filecache package should be Private, and the deprecation message on o.a.h.filecache.DistributedCache should direct users to Job. +1 It looks like most of the javadoc for mapreduce.filecache can be moved back to o.a.h.filecache.DistributedCache. Chris, the use of new api for distributed cache is documented in forrest. But, I think we can move the javadoc for the use of old api to o.a.h.filecache.DistributedCache.
          Hide
          Chris Douglas added a comment -

          Chris, the use of new api for distributed cache is documented in forrest. But, I think we can move the javadoc for the use of old api to o.a.h.filecache.DistributedCache.

          Sorry if I was unclear, but this is what I was advocating for.

          Show
          Chris Douglas added a comment - Chris, the use of new api for distributed cache is documented in forrest. But, I think we can move the javadoc for the use of old api to o.a.h.filecache.DistributedCache. Sorry if I was unclear, but this is what I was advocating for.
          Hide
          Tom White added a comment -

          New patch which

          • Leaves o.a.h.mapreduce.jobhistory as Private Unstable.
          • Makes o.a.h.util Private.
          • Makes mapreduce.filecache Private and moves the DistributedCache class javadoc back to o.a.h.filecache.DistributedCache. In doing this I noticed that the methods for o.a.h.filecache.DistributedCache have no javadoc, since they are inherited from o.a.h.mapreduce.filecache.DistributedCache which is now Private. I suggest sorting out the DistributedCache migration path in another JIRA, since this one is pretty unwieldly already.
          • Makes the mapreduce.lib packages Evolving.
          Show
          Tom White added a comment - New patch which Leaves o.a.h.mapreduce.jobhistory as Private Unstable. Makes o.a.h.util Private. Makes mapreduce.filecache Private and moves the DistributedCache class javadoc back to o.a.h.filecache.DistributedCache. In doing this I noticed that the methods for o.a.h.filecache.DistributedCache have no javadoc, since they are inherited from o.a.h.mapreduce.filecache.DistributedCache which is now Private. I suggest sorting out the DistributedCache migration path in another JIRA, since this one is pretty unwieldly already. Makes the mapreduce.lib packages Evolving.
          Hide
          Tom White added a comment -

          Attaching the right patch this time...

          Show
          Tom White added a comment - Attaching the right patch this time...
          Hide
          Tom White added a comment -

          Here's the javadoc and JDiff with 0.20 generated with this patch applied.

          Show
          Tom White added a comment - Here's the javadoc and JDiff with 0.20 generated with this patch applied.
          Hide
          Tom White added a comment -

          I've removed the Evolving annotation from the new API except for Job and Cluster (following some discussion at http://www.mail-archive.com/mapreduce-dev@hadoop.apache.org/msg01833.html). I've done the same for mapreduce.lib classes that were present in the old API. The ones that I've left as Evolving are WrappedMapper, WrappedReduce, BinaryPartitioner, and the new classes in mapreduce.lib.db.

          Owen - can you please take a look?

          Show
          Tom White added a comment - I've removed the Evolving annotation from the new API except for Job and Cluster (following some discussion at http://www.mail-archive.com/mapreduce-dev@hadoop.apache.org/msg01833.html ). I've done the same for mapreduce.lib classes that were present in the old API. The ones that I've left as Evolving are WrappedMapper, WrappedReduce, BinaryPartitioner, and the new classes in mapreduce.lib.db. Owen - can you please take a look?
          Hide
          Doug Cutting added a comment -

          > The warnings in javadoc are redundant with the annotations [ ... ]

          Indeed, but annotations are not very prominent. The "evolving" status should be nearly as prominent as "deprecated": we want folks to be aware when they use an evolving API that their application might break when they upgrade Hadoop. We don't want folks to be surprised by this or else we'll be dealing with upset users. Deprecations, by adding something to the first javadoc sentence, are visible in the package and method summary tables, while the annotations are visible in the detailed javadoc, and even then in a smaller typeface so they are easily missed. We might skip the bold that's used for deprecation, but adding something to the first sentence for evolving methods and classes will go a long way towards reducing upgrade surprises.

          Show
          Doug Cutting added a comment - > The warnings in javadoc are redundant with the annotations [ ... ] Indeed, but annotations are not very prominent. The "evolving" status should be nearly as prominent as "deprecated": we want folks to be aware when they use an evolving API that their application might break when they upgrade Hadoop. We don't want folks to be surprised by this or else we'll be dealing with upset users. Deprecations, by adding something to the first javadoc sentence, are visible in the package and method summary tables, while the annotations are visible in the detailed javadoc, and even then in a smaller typeface so they are easily missed. We might skip the bold that's used for deprecation, but adding something to the first sentence for evolving methods and classes will go a long way towards reducing upgrade surprises.
          Hide
          Tom White added a comment -

          This is HADOOP-6675.

          Show
          Tom White added a comment - This is HADOOP-6675 .
          Hide
          Owen O'Malley added a comment -

          o.a.h.mapreduce:

          public, evolving: Cluster, ClusterMetrics, Job, JobCounter, JobPriority, JobStatus, MarkableIterator, MarkableIteratorInterface,QueueAclsInfo, QueueInfo, QueueState, TaskCounter, TaskReport, TaskTrackerInfo, TaskType

          public,stable: Mapper, Reducer, Counter, CounterGroup, Counters, ID, InputFormat, InputSplit, JobContext, JobCounter, JobID, MapContext, Mapper, OutputCommitter, OutputFormat, Partitioner, RecordReader, RecordWriter, ReduceContext, Reducer, TaskAttemptContext, TaskAttemptID, TaskID, TaskInputOutputContext

          private: JobACL, JobSubmissionFiles, JobSubmitter, MRConfig, StatusReporter, TaskCompletionEvent, TaskReport

          o.a.h.mapreduce.map:
          public, stable: InverseMapper, TokenCounterMapper

          o.a..h.mapreduce.lib.output:
          public, stable: NullOutputFormat

          o.a.h.mapreduce.lib.partition:
          public, stable: BinaryPartitioner, HashPartitioner, InputSampler, TotalOrderPartitioner

          o.a.h.mapreduce.lib.reduce:
          public, stable: IntSumReducer, LongSumReducer

          o.a.h.mapred:
          public, stable: Counters, ClusterStatus, FileInputFormat, FileOutputCommitter, FileOutputFormat, FileSplit, ID, InputFormat, InputSplit, IsolationRunner, JobClient, JobConf, JobConfigurable, JobID, JobPriority, MapRunnable, MapRunner, Mapper, MultiFileInputFormat, MultiFileSplit, OutputCollector, OutputCommitter, OutputFormat, Partitioner, RecordReader, RecordWriter, Reducer, Reporter, RunningJob, SequenceFileAsBinaryInputFormat, SequenceFileAsBinaryOutputFormat,SequenceFileAsTextInputFormat, SequenceFileAsTextRecordReader, SequenceFileInputFilter, SequenceFileInputFormat, SequenceFileOutputFormat, SequenceFileRecordReader, TaskAttemptID, TaskID, TextInputFormat, TextOutputFormat

          o.a.h.mapred.lib:
          public, stable: BinaryPartitioner, Chain, ChainMapper, ChainReducer, CombineFileInputFormat, CombineFileRecordReader, CombineFileSplit, DelegatingInputFormat, DelegationMapper, FieldSelectionMapReduce, FilterOutputFormat, HashPartitioner, IdentityMapper, IdentityReducer, InputSampler, InverseMapper, KeyFieldBasedComparator, KeyFieldBasedPartitioner, LazyOutputFormat, LongSumReducer, MultipleInputs, MultipleOutputFormat, MultipleSequenceFileOutputFormat, MultipleTextOutputFormat, MultithreadedMapRunner, NLineInputFormat, NullOutputFormat, RegexMapper, TaggedInputSplit, TokenCountMapper, TotalOrderPartitioner

          Show
          Owen O'Malley added a comment - o.a.h.mapreduce: public, evolving: Cluster, ClusterMetrics, Job, JobCounter, JobPriority, JobStatus, MarkableIterator, MarkableIteratorInterface,QueueAclsInfo, QueueInfo, QueueState, TaskCounter, TaskReport, TaskTrackerInfo, TaskType public,stable: Mapper, Reducer, Counter, CounterGroup, Counters, ID, InputFormat, InputSplit, JobContext, JobCounter, JobID, MapContext, Mapper, OutputCommitter, OutputFormat, Partitioner, RecordReader, RecordWriter, ReduceContext, Reducer, TaskAttemptContext, TaskAttemptID, TaskID, TaskInputOutputContext private: JobACL, JobSubmissionFiles, JobSubmitter, MRConfig, StatusReporter, TaskCompletionEvent, TaskReport o.a.h.mapreduce.map: public, stable: InverseMapper, TokenCounterMapper o.a..h.mapreduce.lib.output: public, stable: NullOutputFormat o.a.h.mapreduce.lib.partition: public, stable: BinaryPartitioner, HashPartitioner, InputSampler, TotalOrderPartitioner o.a.h.mapreduce.lib.reduce: public, stable: IntSumReducer, LongSumReducer o.a.h.mapred: public, stable: Counters, ClusterStatus, FileInputFormat, FileOutputCommitter, FileOutputFormat, FileSplit, ID, InputFormat, InputSplit, IsolationRunner, JobClient, JobConf, JobConfigurable, JobID, JobPriority, MapRunnable, MapRunner, Mapper, MultiFileInputFormat, MultiFileSplit, OutputCollector, OutputCommitter, OutputFormat, Partitioner, RecordReader, RecordWriter, Reducer, Reporter, RunningJob, SequenceFileAsBinaryInputFormat, SequenceFileAsBinaryOutputFormat,SequenceFileAsTextInputFormat, SequenceFileAsTextRecordReader, SequenceFileInputFilter, SequenceFileInputFormat, SequenceFileOutputFormat, SequenceFileRecordReader, TaskAttemptID, TaskID, TextInputFormat, TextOutputFormat o.a.h.mapred.lib: public, stable: BinaryPartitioner, Chain, ChainMapper, ChainReducer, CombineFileInputFormat, CombineFileRecordReader, CombineFileSplit, DelegatingInputFormat, DelegationMapper, FieldSelectionMapReduce, FilterOutputFormat, HashPartitioner, IdentityMapper, IdentityReducer, InputSampler, InverseMapper, KeyFieldBasedComparator, KeyFieldBasedPartitioner, LazyOutputFormat, LongSumReducer, MultipleInputs, MultipleOutputFormat, MultipleSequenceFileOutputFormat, MultipleTextOutputFormat, MultithreadedMapRunner, NLineInputFormat, NullOutputFormat, RegexMapper, TaggedInputSplit, TokenCountMapper, TotalOrderPartitioner
          Hide
          Tom White added a comment -

          Owen,

          Thanks for the feedback. I've created a new patch that reflects these classifications. I've also got a few questions and comments below.

          o.a.h.mapreduce:

          Job has a public method that returns TaskCompletionEvents - so shouldn't this be public evolving? Similarly for TaskReport. Also, note that MapContext and ReduceContext are already public evolving (see MAPREDUCE-1012), so I have left them like that.

          o.a.h.mapreduce.lib.map:

          MultithreadedMapper and RegexMapper are also in this package. Are you saying these are not stable - so mark as public evolving? This question goes for the other mapreduce.lib packages.

          o.a.h.mapred:

          I think these should be public stable too:
          FileAlreadyExistsException, InvalidFileTypeException, InvalidInputException, InvalidJobConfException, JobContext, JobStatus, JobQueueInfo, KeyValueLineRecordReader, KeyValueTextInputFormat, MapFileOutputFormat, MapReduceBase, OutputLogFilter, SkipBadRecords, TaskAttemptContext, TaskReport, Utils

          o.a.h.mapred.lib:

          I think MultipleOutputs should be public stable too.

          Show
          Tom White added a comment - Owen, Thanks for the feedback. I've created a new patch that reflects these classifications. I've also got a few questions and comments below. o.a.h.mapreduce: Job has a public method that returns TaskCompletionEvents - so shouldn't this be public evolving? Similarly for TaskReport. Also, note that MapContext and ReduceContext are already public evolving (see MAPREDUCE-1012 ), so I have left them like that. o.a.h.mapreduce.lib.map: MultithreadedMapper and RegexMapper are also in this package. Are you saying these are not stable - so mark as public evolving? This question goes for the other mapreduce.lib packages. o.a.h.mapred: I think these should be public stable too: FileAlreadyExistsException, InvalidFileTypeException, InvalidInputException, InvalidJobConfException, JobContext, JobStatus, JobQueueInfo, KeyValueLineRecordReader, KeyValueTextInputFormat, MapFileOutputFormat, MapReduceBase, OutputLogFilter, SkipBadRecords, TaskAttemptContext, TaskReport, Utils o.a.h.mapred.lib: I think MultipleOutputs should be public stable too.
          Hide
          Arun C Murthy added a comment -

          I've not yet gone through the o.a.h.mapreduce.security and o.a.h.mapred packages, I'll do that later tonight, but here are early comments for the rest:

          src/java/org/apache/hadoop/mapreduce/MRConfig.java
          -> Stable?
          src/java/org/apache/hadoop/mapreduce/TaskType.java
          -> Public? Stable?
          src/java/org/apache/hadoop/mapreduce/StatusReporter.java
          -> Stable or Evolving?
          src/java/org/apache/hadoop/mapreduce/JobSubmitter.java
          -> Unstable?
          src/java/org/apache/hadoop/mapreduce/split/package-info.java
          -> Unstable?
          src/java/org/apache/hadoop/mapreduce/JobStatus.java
          -> Public?
          src/java/org/apache/hadoop/mapreduce/protocol/package-info.java
          -> Stable?
          src/java/org/apache/hadoop/mapreduce/task/package-info.java
          -> Unstable?
          src/java/org/apache/hadoop/mapreduce/task/reduce/package-info.java
          -> Unstable?
          src/java/org/apache/hadoop/mapreduce/task/reduce/ShuffleHeader.java
          -> Stable? (I'd like to keep it so, at least, thoughts?)
          src/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java
          -> Stable? Limited-Private (Pig) ?
          src/java/org/apache/hadoop/mapreduce/lib/db/OracleDataDrivenDBRecordReader.java
          -> private for extension?
          src/java/org/apache/hadoop/mapreduce/lib/db/MySQLDataDrivenDBRecordReader.java
          -> public for extension?
          src/java/org/apache/hadoop/mapreduce/lib/db/DateSplitter.java
          -> private?
          src/java/org/apache/hadoop/mapreduce/lib/db/OracleDataDrivenDBInputFormat.java
          -> public?
          src/java/org/apache/hadoop/mapreduce/lib/db/BooleanSplitter.java
          -> private?
          src/java/org/apache/hadoop/mapreduce/lib/db/BigDecimalSplitter.java
          -> private?
          src/java/org/apache/hadoop/mapreduce/lib/db/DBRecordReader.java
          -> public?
          src/java/org/apache/hadoop/mapreduce/lib/db/DataDrivenDBRecordReader.java
          -> public?
          src/java/org/apache/hadoop/mapreduce/lib/db/TextSplitter.java
          -> private?
          src/java/org/apache/hadoop/mapreduce/lib/db/DBSplitter.java
          -> private?
          src/java/org/apache/hadoop/mapreduce/lib/db/DBConfiguration.java
          -> private?
          src/java/org/apache/hadoop/mapreduce/lib/db/DBInputFormat.java
          -> public?
          src/java/org/apache/hadoop/mapreduce/lib/db/DBOutputFormat.java
          -> public?
          src/java/org/apache/hadoop/mapreduce/lib/db/DBWritable.java
          -> private?
          src/java/org/apache/hadoop/mapreduce/lib/db/DataDrivenDBInputFormat.java
          -> public?
          src/java/org/apache/hadoop/mapreduce/lib/db/IntegerSplitter.java
          -> private?
          src/java/org/apache/hadoop/mapreduce/lib/db/OracleDateSplitter.java
          -> private?
          src/java/org/apache/hadoop/mapreduce/lib/db/FloatSplitter.java
          -> private?
          src/java/org/apache/hadoop/mapreduce/lib/db/MySQLDBRecordReader.java
          -> public?
          src/java/org/apache/hadoop/mapreduce/lib/db/OracleDBRecordReader.java
          -> public?
          src/java/org/apache/hadoop/mapreduce/lib/reduce/WrappedReducer.java
          -> public?
          src/java/org/apache/hadoop/mapreduce/lib/partition/BinaryPartitioner.java
          -> public?
          src/java/org/apache/hadoop/mapreduce/lib/map/WrappedMapper.java
          -> public?
          src/java/org/apache/hadoop/mapreduce/server/jobtracker/State.java
          -> public?
          src/java/org/apache/hadoop/mapreduce/server/jobtracker/JTConfig.java
          -> private?
          src/java/org/apache/hadoop/mapreduce/server/jobtracker/TaskTracker.java
          -> limited-private for Schedulers
          src/java/org/apache/hadoop/mapreduce/server/tasktracker/TTConfig.java
          -> private?
          src/java/org/apache/hadoop/mapreduce/MarkableIterator.java
          -> private?
          src/java/org/apache/hadoop/mapreduce/TaskCounter.java
          -> public?
          src/java/org/apache/hadoop/mapreduce/QueueAclsInfo.java
          -> private?
          src/java/org/apache/hadoop/mapreduce/ClusterMetrics.java
          -> public
          src/java/org/apache/hadoop/mapreduce/MarkableIteratorInterface.java
          -> private
          src/java/org/apache/hadoop/mapreduce/Job.java
          -> public
          src/java/org/apache/hadoop/mapreduce/JobCounter.java
          -> public
          src/java/org/apache/hadoop/mapreduce/JobPriority.java
          -> public
          src/java/org/apache/hadoop/mapreduce/jobhistory/package-info.java
          -> private
          src/java/org/apache/hadoop/mapreduce/Cluster.java
          -> public
          src/java/org/apache/hadoop/mapreduce/QueueState.java
          -> private

          Show
          Arun C Murthy added a comment - I've not yet gone through the o.a.h.mapreduce.security and o.a.h.mapred packages, I'll do that later tonight, but here are early comments for the rest: src/java/org/apache/hadoop/mapreduce/MRConfig.java -> Stable? src/java/org/apache/hadoop/mapreduce/TaskType.java -> Public? Stable? src/java/org/apache/hadoop/mapreduce/StatusReporter.java -> Stable or Evolving? src/java/org/apache/hadoop/mapreduce/JobSubmitter.java -> Unstable? src/java/org/apache/hadoop/mapreduce/split/package-info.java -> Unstable? src/java/org/apache/hadoop/mapreduce/JobStatus.java -> Public? src/java/org/apache/hadoop/mapreduce/protocol/package-info.java -> Stable? src/java/org/apache/hadoop/mapreduce/task/package-info.java -> Unstable? src/java/org/apache/hadoop/mapreduce/task/reduce/package-info.java -> Unstable? src/java/org/apache/hadoop/mapreduce/task/reduce/ShuffleHeader.java -> Stable? (I'd like to keep it so, at least, thoughts?) src/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java -> Stable? Limited-Private (Pig) ? src/java/org/apache/hadoop/mapreduce/lib/db/OracleDataDrivenDBRecordReader.java -> private for extension? src/java/org/apache/hadoop/mapreduce/lib/db/MySQLDataDrivenDBRecordReader.java -> public for extension? src/java/org/apache/hadoop/mapreduce/lib/db/DateSplitter.java -> private? src/java/org/apache/hadoop/mapreduce/lib/db/OracleDataDrivenDBInputFormat.java -> public? src/java/org/apache/hadoop/mapreduce/lib/db/BooleanSplitter.java -> private? src/java/org/apache/hadoop/mapreduce/lib/db/BigDecimalSplitter.java -> private? src/java/org/apache/hadoop/mapreduce/lib/db/DBRecordReader.java -> public? src/java/org/apache/hadoop/mapreduce/lib/db/DataDrivenDBRecordReader.java -> public? src/java/org/apache/hadoop/mapreduce/lib/db/TextSplitter.java -> private? src/java/org/apache/hadoop/mapreduce/lib/db/DBSplitter.java -> private? src/java/org/apache/hadoop/mapreduce/lib/db/DBConfiguration.java -> private? src/java/org/apache/hadoop/mapreduce/lib/db/DBInputFormat.java -> public? src/java/org/apache/hadoop/mapreduce/lib/db/DBOutputFormat.java -> public? src/java/org/apache/hadoop/mapreduce/lib/db/DBWritable.java -> private? src/java/org/apache/hadoop/mapreduce/lib/db/DataDrivenDBInputFormat.java -> public? src/java/org/apache/hadoop/mapreduce/lib/db/IntegerSplitter.java -> private? src/java/org/apache/hadoop/mapreduce/lib/db/OracleDateSplitter.java -> private? src/java/org/apache/hadoop/mapreduce/lib/db/FloatSplitter.java -> private? src/java/org/apache/hadoop/mapreduce/lib/db/MySQLDBRecordReader.java -> public? src/java/org/apache/hadoop/mapreduce/lib/db/OracleDBRecordReader.java -> public? src/java/org/apache/hadoop/mapreduce/lib/reduce/WrappedReducer.java -> public? src/java/org/apache/hadoop/mapreduce/lib/partition/BinaryPartitioner.java -> public? src/java/org/apache/hadoop/mapreduce/lib/map/WrappedMapper.java -> public? src/java/org/apache/hadoop/mapreduce/server/jobtracker/State.java -> public? src/java/org/apache/hadoop/mapreduce/server/jobtracker/JTConfig.java -> private? src/java/org/apache/hadoop/mapreduce/server/jobtracker/TaskTracker.java -> limited-private for Schedulers src/java/org/apache/hadoop/mapreduce/server/tasktracker/TTConfig.java -> private? src/java/org/apache/hadoop/mapreduce/MarkableIterator.java -> private? src/java/org/apache/hadoop/mapreduce/TaskCounter.java -> public? src/java/org/apache/hadoop/mapreduce/QueueAclsInfo.java -> private? src/java/org/apache/hadoop/mapreduce/ClusterMetrics.java -> public src/java/org/apache/hadoop/mapreduce/MarkableIteratorInterface.java -> private src/java/org/apache/hadoop/mapreduce/Job.java -> public src/java/org/apache/hadoop/mapreduce/JobCounter.java -> public src/java/org/apache/hadoop/mapreduce/JobPriority.java -> public src/java/org/apache/hadoop/mapreduce/jobhistory/package-info.java -> private src/java/org/apache/hadoop/mapreduce/Cluster.java -> public src/java/org/apache/hadoop/mapreduce/QueueState.java -> private
          Hide
          Tom White added a comment -

          Thanks Arun - really useful getting all this feedback. I've attached an updated patch, see also my comments below.

          > src/java/org/apache/hadoop/mapreduce/MRConfig.java
          > -> Stable?
          It's private, so it matters less, but we should be able to refactor this if it's an internal class, no? So maybe evolving/unstable.

          > src/java/org/apache/hadoop/mapreduce/TaskType.java
          > -> Public? Stable?

          It is public. Not sure about whether it is stable yet.

          > src/java/org/apache/hadoop/mapreduce/StatusReporter.java
          > -> Stable or Evolving?

          Private, so it matters less.

          > src/java/org/apache/hadoop/mapreduce/JobSubmitter.java
          > -> Unstable?

          Changed.

          > src/java/org/apache/hadoop/mapreduce/split/package-info.java
          > -> Unstable?

          Changed.

          > src/java/org/apache/hadoop/mapreduce/JobStatus.java
          > -> Public?

          Yes.

          > src/java/org/apache/hadoop/mapreduce/protocol/package-info.java
          > -> Stable?

          Changed.

          > src/java/org/apache/hadoop/mapreduce/task/package-info.java
          > -> Unstable?

          Changed

          > src/java/org/apache/hadoop/mapreduce/task/reduce/package-info.java
          > -> Unstable?

          Changed

          > src/java/org/apache/hadoop/mapreduce/task/reduce/ShuffleHeader.java
          > -> Stable? (I'd like to keep it so, at least, thoughts?)

          Changed.

          > src/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java
          > -> Stable? Limited-Private (Pig) ?

          Changed to Evolving, Limited-Private (MR, Pig)

          > src/java/org/apache/hadoop/mapreduce/lib/db/OracleDataDrivenDBRecordReader.java
          > -> private for extension?
          > src/java/org/apache/hadoop/mapreduce/lib/db/MySQLDataDrivenDBRecordReader.java
          > -> public for extension?
          > src/java/org/apache/hadoop/mapreduce/lib/db/DateSplitter.java
          > -> private?
          > src/java/org/apache/hadoop/mapreduce/lib/db/OracleDataDrivenDBInputFormat.java
          > -> public?
          > src/java/org/apache/hadoop/mapreduce/lib/db/BooleanSplitter.java
          > -> private?
          > src/java/org/apache/hadoop/mapreduce/lib/db/BigDecimalSplitter.java
          > -> private?
          > src/java/org/apache/hadoop/mapreduce/lib/db/DBRecordReader.java
          > -> public?
          > src/java/org/apache/hadoop/mapreduce/lib/db/DataDrivenDBRecordReader.java
          > -> public?
          > src/java/org/apache/hadoop/mapreduce/lib/db/TextSplitter.java
          > -> private?
          > src/java/org/apache/hadoop/mapreduce/lib/db/DBSplitter.java
          > -> private?
          > src/java/org/apache/hadoop/mapreduce/lib/db/DBConfiguration.java
          > -> private?
          > src/java/org/apache/hadoop/mapreduce/lib/db/DBInputFormat.java
          > -> public?
          > src/java/org/apache/hadoop/mapreduce/lib/db/DBOutputFormat.java
          > -> public?
          > src/java/org/apache/hadoop/mapreduce/lib/db/DBWritable.java
          > -> private?

          Needs to be public since client programs can use it.

          > src/java/org/apache/hadoop/mapreduce/lib/db/DataDrivenDBInputFormat.java
          > -> public?
          > src/java/org/apache/hadoop/mapreduce/lib/db/IntegerSplitter.java
          > -> private?
          > src/java/org/apache/hadoop/mapreduce/lib/db/OracleDateSplitter.java
          > -> private?
          > src/java/org/apache/hadoop/mapreduce/lib/db/FloatSplitter.java
          > -> private?
          > src/java/org/apache/hadoop/mapreduce/lib/db/MySQLDBRecordReader.java
          > -> public?
          > src/java/org/apache/hadoop/mapreduce/lib/db/OracleDBRecordReader.java
          > -> public?

          I left all of the lib.db classes public evolving since they are used by various extensions and users. Does this sound reasonable?

          > src/java/org/apache/hadoop/mapreduce/lib/reduce/WrappedReducer.java
          > -> public?

          Yes.

          > src/java/org/apache/hadoop/mapreduce/lib/partition/BinaryPartitioner.java
          > -> public?

          Yes.

          > src/java/org/apache/hadoop/mapreduce/lib/map/WrappedMapper.java
          > -> public?

          Yes.

          > src/java/org/apache/hadoop/mapreduce/server/jobtracker/State.java
          > -> public?

          Yes.

          > src/java/org/apache/hadoop/mapreduce/server/jobtracker/JTConfig.java
          > -> private?

          Changed.

          > src/java/org/apache/hadoop/mapreduce/server/jobtracker/TaskTracker.java
          > -> limited-private for Schedulers

          Changed.

          > src/java/org/apache/hadoop/mapreduce/server/tasktracker/TTConfig.java
          > -> private?

          Changed.

          > src/java/org/apache/hadoop/mapreduce/MarkableIterator.java
          > -> private?

          Are MarkableIterator and MarkableIteratorInterface public user classes? The unit test for them (TestValueIterReset) implies that the user instantiates a MarkableIterator.

          > src/java/org/apache/hadoop/mapreduce/TaskCounter.java
          > -> public?

          Yes.

          > src/java/org/apache/hadoop/mapreduce/QueueAclsInfo.java
          > -> private?

          It's returned by a public method on Cluster.

          > src/java/org/apache/hadoop/mapreduce/ClusterMetrics.java
          > -> public

          Yes.

          > src/java/org/apache/hadoop/mapreduce/MarkableIteratorInterface.java
          > -> private

          See above.

          > src/java/org/apache/hadoop/mapreduce/Job.java
          > -> public

          Yes.

          > src/java/org/apache/hadoop/mapreduce/JobCounter.java
          > -> public

          Yes.

          > src/java/org/apache/hadoop/mapreduce/JobPriority.java
          > -> public

          Yes.

          > src/java/org/apache/hadoop/mapreduce/jobhistory/package-info.java
          > -> private

          Yes.

          > src/java/org/apache/hadoop/mapreduce/Cluster.java
          > -> public

          Yes.

          > src/java/org/apache/hadoop/mapreduce/QueueState.java
          > -> private

          Referenced from the public QueueInfo class.

          Show
          Tom White added a comment - Thanks Arun - really useful getting all this feedback. I've attached an updated patch, see also my comments below. > src/java/org/apache/hadoop/mapreduce/MRConfig.java > -> Stable? It's private, so it matters less, but we should be able to refactor this if it's an internal class, no? So maybe evolving/unstable. > src/java/org/apache/hadoop/mapreduce/TaskType.java > -> Public? Stable? It is public. Not sure about whether it is stable yet. > src/java/org/apache/hadoop/mapreduce/StatusReporter.java > -> Stable or Evolving? Private, so it matters less. > src/java/org/apache/hadoop/mapreduce/JobSubmitter.java > -> Unstable? Changed. > src/java/org/apache/hadoop/mapreduce/split/package-info.java > -> Unstable? Changed. > src/java/org/apache/hadoop/mapreduce/JobStatus.java > -> Public? Yes. > src/java/org/apache/hadoop/mapreduce/protocol/package-info.java > -> Stable? Changed. > src/java/org/apache/hadoop/mapreduce/task/package-info.java > -> Unstable? Changed > src/java/org/apache/hadoop/mapreduce/task/reduce/package-info.java > -> Unstable? Changed > src/java/org/apache/hadoop/mapreduce/task/reduce/ShuffleHeader.java > -> Stable? (I'd like to keep it so, at least, thoughts?) Changed. > src/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java > -> Stable? Limited-Private (Pig) ? Changed to Evolving, Limited-Private (MR, Pig) > src/java/org/apache/hadoop/mapreduce/lib/db/OracleDataDrivenDBRecordReader.java > -> private for extension? > src/java/org/apache/hadoop/mapreduce/lib/db/MySQLDataDrivenDBRecordReader.java > -> public for extension? > src/java/org/apache/hadoop/mapreduce/lib/db/DateSplitter.java > -> private? > src/java/org/apache/hadoop/mapreduce/lib/db/OracleDataDrivenDBInputFormat.java > -> public? > src/java/org/apache/hadoop/mapreduce/lib/db/BooleanSplitter.java > -> private? > src/java/org/apache/hadoop/mapreduce/lib/db/BigDecimalSplitter.java > -> private? > src/java/org/apache/hadoop/mapreduce/lib/db/DBRecordReader.java > -> public? > src/java/org/apache/hadoop/mapreduce/lib/db/DataDrivenDBRecordReader.java > -> public? > src/java/org/apache/hadoop/mapreduce/lib/db/TextSplitter.java > -> private? > src/java/org/apache/hadoop/mapreduce/lib/db/DBSplitter.java > -> private? > src/java/org/apache/hadoop/mapreduce/lib/db/DBConfiguration.java > -> private? > src/java/org/apache/hadoop/mapreduce/lib/db/DBInputFormat.java > -> public? > src/java/org/apache/hadoop/mapreduce/lib/db/DBOutputFormat.java > -> public? > src/java/org/apache/hadoop/mapreduce/lib/db/DBWritable.java > -> private? Needs to be public since client programs can use it. > src/java/org/apache/hadoop/mapreduce/lib/db/DataDrivenDBInputFormat.java > -> public? > src/java/org/apache/hadoop/mapreduce/lib/db/IntegerSplitter.java > -> private? > src/java/org/apache/hadoop/mapreduce/lib/db/OracleDateSplitter.java > -> private? > src/java/org/apache/hadoop/mapreduce/lib/db/FloatSplitter.java > -> private? > src/java/org/apache/hadoop/mapreduce/lib/db/MySQLDBRecordReader.java > -> public? > src/java/org/apache/hadoop/mapreduce/lib/db/OracleDBRecordReader.java > -> public? I left all of the lib.db classes public evolving since they are used by various extensions and users. Does this sound reasonable? > src/java/org/apache/hadoop/mapreduce/lib/reduce/WrappedReducer.java > -> public? Yes. > src/java/org/apache/hadoop/mapreduce/lib/partition/BinaryPartitioner.java > -> public? Yes. > src/java/org/apache/hadoop/mapreduce/lib/map/WrappedMapper.java > -> public? Yes. > src/java/org/apache/hadoop/mapreduce/server/jobtracker/State.java > -> public? Yes. > src/java/org/apache/hadoop/mapreduce/server/jobtracker/JTConfig.java > -> private? Changed. > src/java/org/apache/hadoop/mapreduce/server/jobtracker/TaskTracker.java > -> limited-private for Schedulers Changed. > src/java/org/apache/hadoop/mapreduce/server/tasktracker/TTConfig.java > -> private? Changed. > src/java/org/apache/hadoop/mapreduce/MarkableIterator.java > -> private? Are MarkableIterator and MarkableIteratorInterface public user classes? The unit test for them (TestValueIterReset) implies that the user instantiates a MarkableIterator. > src/java/org/apache/hadoop/mapreduce/TaskCounter.java > -> public? Yes. > src/java/org/apache/hadoop/mapreduce/QueueAclsInfo.java > -> private? It's returned by a public method on Cluster. > src/java/org/apache/hadoop/mapreduce/ClusterMetrics.java > -> public Yes. > src/java/org/apache/hadoop/mapreduce/MarkableIteratorInterface.java > -> private See above. > src/java/org/apache/hadoop/mapreduce/Job.java > -> public Yes. > src/java/org/apache/hadoop/mapreduce/JobCounter.java > -> public Yes. > src/java/org/apache/hadoop/mapreduce/JobPriority.java > -> public Yes. > src/java/org/apache/hadoop/mapreduce/jobhistory/package-info.java > -> private Yes. > src/java/org/apache/hadoop/mapreduce/Cluster.java > -> public Yes. > src/java/org/apache/hadoop/mapreduce/QueueState.java > -> private Referenced from the public QueueInfo class.
          Hide
          Arun C Murthy added a comment -

          Thanks for the update Tom, this is getting close!

          Some more comments:

          > src/java/org/apache/hadoop/mapreduce/TaskType.java
          > -> Public? Stable?
          It is public. Not sure about whether it is stable yet.

          I'd tend to lean towards making it Stable to ensure it's never broken, it probably deserves that status given how central it is to the framework. Thoughts?
          I don't see either 'audience' for this yet it the patch, maybe you missed it?

          Are MarkableIterator and MarkableIteratorInterface public user classes? The unit test for them (TestValueIterReset) implies that the user instantiates a MarkableIterator.

          +1, public

          I left all of the lib.db classes public evolving since they are used by various extensions and users. Does this sound reasonable?

          +1

          > src/java/org/apache/hadoop/mapreduce/Job.java
          > -> public

          Yes.

          > src/java/org/apache/hadoop/mapreduce/JobCounter.java
          > -> public

          Yes.

          > src/java/org/apache/hadoop/mapreduce/JobPriority.java
          > -> public

          Yes.

          Again, I don't see 'audience' as 'public', another human miss on your part? Or am I seeing the wrong patch?


          Comments on rest of the classes I missed in the first pass:

          src/java/org/apache/hadoop/mapreduce/security/token/package-info.java
          -> Unstable
          src/java/org/apache/hadoop/mapreduce/security/token/delegation/package-info.java
          -> Unstable
          src/java/org/apache/hadoop/mapreduce/TaskTrackerInfo.java
          -> Public, used in Cluster
          src/java/org/apache/hadoop/mapreduce/util/package-info.java
          -> Unstable
          src/java/org/apache/hadoop/mapreduce/TaskCompletionEvent.java
          -> Public, Job.getTaskCompletionEvent
          src/java/org/apache/hadoop/mapred/JobInProgress.java
          -> limited-private, for schedulers
          src/java/org/apache/hadoop/mapred/Task.java
          -> limited-private, for schedulers
          src/java/org/apache/hadoop/mapred/JobEndNotifier.java
          -> public, maybe evolving since end-users use this directly
          src/java/org/apache/hadoop/mapred/LineRecordReader.java
          -> limited-private (mr, pig)
          src/java/org/apache/hadoop/mapred/JobProfile.java
          -> Odd, I thought this was 'public' - but I can't find any uses for it. The FairScheduler uses it, so 'limited-private' ?
          src/java/org/apache/hadoop/mapred/TaskCompletionEvent.java
          -> public, accessible via JobClient

          Show
          Arun C Murthy added a comment - Thanks for the update Tom, this is getting close! Some more comments: > src/java/org/apache/hadoop/mapreduce/TaskType.java > -> Public? Stable? It is public. Not sure about whether it is stable yet. I'd tend to lean towards making it Stable to ensure it's never broken, it probably deserves that status given how central it is to the framework. Thoughts? I don't see either 'audience' for this yet it the patch, maybe you missed it? Are MarkableIterator and MarkableIteratorInterface public user classes? The unit test for them (TestValueIterReset) implies that the user instantiates a MarkableIterator. +1, public I left all of the lib.db classes public evolving since they are used by various extensions and users. Does this sound reasonable? +1 > src/java/org/apache/hadoop/mapreduce/Job.java > -> public Yes. > src/java/org/apache/hadoop/mapreduce/JobCounter.java > -> public Yes. > src/java/org/apache/hadoop/mapreduce/JobPriority.java > -> public Yes. Again, I don't see 'audience' as 'public', another human miss on your part? Or am I seeing the wrong patch? Comments on rest of the classes I missed in the first pass: src/java/org/apache/hadoop/mapreduce/security/token/package-info.java -> Unstable src/java/org/apache/hadoop/mapreduce/security/token/delegation/package-info.java -> Unstable src/java/org/apache/hadoop/mapreduce/TaskTrackerInfo.java -> Public, used in Cluster src/java/org/apache/hadoop/mapreduce/util/package-info.java -> Unstable src/java/org/apache/hadoop/mapreduce/TaskCompletionEvent.java -> Public, Job.getTaskCompletionEvent src/java/org/apache/hadoop/mapred/JobInProgress.java -> limited-private, for schedulers src/java/org/apache/hadoop/mapred/Task.java -> limited-private, for schedulers src/java/org/apache/hadoop/mapred/JobEndNotifier.java -> public, maybe evolving since end-users use this directly src/java/org/apache/hadoop/mapred/LineRecordReader.java -> limited-private (mr, pig) src/java/org/apache/hadoop/mapred/JobProfile.java -> Odd, I thought this was 'public' - but I can't find any uses for it. The FairScheduler uses it, so 'limited-private' ? src/java/org/apache/hadoop/mapred/TaskCompletionEvent.java -> public, accessible via JobClient
          Hide
          Tom White added a comment -

          Here's another patch which addresses all of Arun's points (except the one below). I've also gone through and explicitly used Public and Stable annotations for classes where they were previously implicit, which is why the patch is now so large. I've also made the Parser classes in the join packages Evolving (after offline discussion with Chris).

          > src/java/org/apache/hadoop/mapred/JobEndNotifier.java
          > -> public, maybe evolving since end-users use this directly

          I think this is private since its methods are only invoked by internal classes. Job notification is set up by users using JobConf.setJobEndNotificationURI(). There doesn't seem to be a similar method on Job for the new API, so you would have to configure using the property directly.

          Show
          Tom White added a comment - Here's another patch which addresses all of Arun's points (except the one below). I've also gone through and explicitly used Public and Stable annotations for classes where they were previously implicit, which is why the patch is now so large. I've also made the Parser classes in the join packages Evolving (after offline discussion with Chris). > src/java/org/apache/hadoop/mapred/JobEndNotifier.java > -> public, maybe evolving since end-users use this directly I think this is private since its methods are only invoked by internal classes. Job notification is set up by users using JobConf.setJobEndNotificationURI(). There doesn't seem to be a similar method on Job for the new API, so you would have to configure using the property directly.
          Hide
          Tom White added a comment -

          Output from test-patch:

               [exec] -1 overall.  
               [exec] 
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec] 
               [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
               [exec]                         Please justify why no new tests are needed for this patch.
               [exec]                         Also please list what manual steps were performed to verify this patch.
               [exec] 
               [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
               [exec] 
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
               [exec] 
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
               [exec] 
               [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
          

          No tests are needed since this is a documentation patch.

          Show
          Tom White added a comment - Output from test-patch: [exec] -1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] -1 tests included. The patch doesn't appear to include any new or modified tests. [exec] Please justify why no new tests are needed for this patch. [exec] Also please list what manual steps were performed to verify this patch. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. No tests are needed since this is a documentation patch.
          Hide
          Tom White added a comment -

          Patch no longer applies - here's a new one.

          Show
          Tom White added a comment - Patch no longer applies - here's a new one.
          Hide
          Arun C Murthy added a comment -

          Final comments:

          src/java/org/apache/hadoop/mapreduce/lib/jobcontrol/ControlledJob.java
          src/java/org/apache/hadoop/mapreduce/lib/jobcontrol/JobControl.java
          Both should be Public, Evolving - I don't think they are ready to be labelled 'stable' yet.

          src/java/org/apache/hadoop/mapreduce/QueueInfo.java -> Evolving

          src/java/org/apache/hadoop/mapred/IsolationRunner.java -> Evolving since I'm not sure IsolationRunner even works anymore.

          Show
          Arun C Murthy added a comment - Final comments: src/java/org/apache/hadoop/mapreduce/lib/jobcontrol/ControlledJob.java src/java/org/apache/hadoop/mapreduce/lib/jobcontrol/JobControl.java Both should be Public, Evolving - I don't think they are ready to be labelled 'stable' yet. src/java/org/apache/hadoop/mapreduce/QueueInfo.java -> Evolving src/java/org/apache/hadoop/mapred/IsolationRunner.java -> Evolving since I'm not sure IsolationRunner even works anymore.
          Hide
          Arun C Murthy added a comment -

          Updated patch since the previous one didn't apply clean, I've incorporated my own (final) comments.

          Tom, if you are fine with the proposed changes I'll go ahead and commit.

          Show
          Arun C Murthy added a comment - Updated patch since the previous one didn't apply clean, I've incorporated my own (final) comments. Tom, if you are fine with the proposed changes I'll go ahead and commit.
          Hide
          Tom White added a comment -

          +1

          Thanks Arun!

          Show
          Tom White added a comment - +1 Thanks Arun!
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12445008/MAPREDUCE-1623.patch
          against trunk revision 944427.

          +1 @author. The patch does not contain any @author tags.

          +0 tests included. The patch appears to be a documentation patch that doesn't require tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/194/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/194/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/194/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/194/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12445008/MAPREDUCE-1623.patch against trunk revision 944427. +1 @author. The patch does not contain any @author tags. +0 tests included. The patch appears to be a documentation patch that doesn't require tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/194/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/194/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/194/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/194/console This message is automatically generated.
          Hide
          Arun C Murthy added a comment -

          I just committed this. Thanks Tom, this was a big one!

          Show
          Arun C Murthy added a comment - I just committed this. Thanks Tom, this was a big one!
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #323 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/323/)
          MAPREDUCE-1623. Apply audience and stability notations to Hadoop Map-Reduce. Contributed by Tom White.

          Show
          Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk #323 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/323/ ) MAPREDUCE-1623 . Apply audience and stability notations to Hadoop Map-Reduce. Contributed by Tom White.

            People

            • Assignee:
              Tom White
              Reporter:
              Tom White
            • Votes:
              1 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development