Hadoop Common
  1. Hadoop Common
  2. HADOOP-5064 Hadoop 1.0
  3. HADOOP-5073

Hadoop 1.0 Interface Classification - scope (visibility - public/private) and stability

    Details

    • Type: Sub-task Sub-task
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Annotation mechanism enables interface classification.

      Description

      This jira proposes an interface classification for hadoop interfaces.
      The discussion was started in email alias core-dev@hadoop.apache.org in Nov 2008.

      1. HADOOP-5073.patch
        6 kB
        Jakob Homan
      2. 5073_demo.png
        112 kB
        Jakob Homan
      3. c5073_20090825.patch
        1 kB
        Tsz Wo Nicholas Sze
      4. c5073_20090825.png
        35 kB
        Tsz Wo Nicholas Sze
      5. Nested.png
        144 kB
        Jakob Homan
      6. Picture 1.png
        27 kB
        Jakob Homan
      7. COMMON-5073.patch
        3 kB
        Jakob Homan
      8. COMMON-5073.patch
        3 kB
        Jakob Homan
      9. COMMON-5073.patch
        4 kB
        Jakob Homan

        Issue Links

          Activity

          Hide
          Sanjay Radia added a comment - - edited

          A proposal:

          Interface Taxonomy - Scope & Stability Classification

          The interface taxonomy classification provided here is for guidance to developers and users of interfaces.
          The classification guides a developer to declare the scope (or targeted audience or users) of an interface and also its stability.

          • Benefits to the user of an interface: Knows which interfaces to use or not use and their stability.
          • Benefits to the developer: to prevent accidental changes of interfaces and hence accidental impact on users or other components or system. This is particularly useful in large systems with many developers who may not all have a shared state/history of the project.

          This classification was derived from a taxonomy used inside Yahoo and
          from the OpenSolaris taxonomy (http://www.opensolaris.org/os/community/arc/policies/interface-taxonomy/#Advice)

          Interface have two main attributes: Scope and Stability

          • Scope - denotes the potential customers of the interface.
            For example many interfaces are merely internal or private interfaces of the implementation while others are public or external interfaces that applications or clients are expected to use. In posix, libc is an is an external or public interface, while large parts of the kernel are internal or private interfaces. In addition, some interfaces are targeted to some specific other subsystems. Identifying the scope helps define the customers or users of the interfaces and helps define the impact of breaking an interface. For example we may be willing to break the comaptibility of an interface whose scope is a small number of specific subsystems. One the other hand, one is unlikely to break a protocol interfaces that millions of internet users depend on.
            The following are useful scopes in order of increasing/wider visibility
            • project-private
              • the interface is for internal use within the project and should not be used by applications. It is subject to change at anytime without notice. Most interfaces of a project are project private.
            • limited-private
              • the interface is used by a specified set of projects or systems (typically closely related projects). Other projects or systems should not use the interface. Changes to the interface will be communicated/negotiated with the specified projects. For example, in the hadoop project, some interfaces are hdfs-mapReduce-private in that they are private to the hdfs and mapReduce projects.
            • company-private (This not applicable to opensource projects such as Hadoop. It is mentioned here for completeness.)
              • the interface can use used by other projects within a company.
            • public
              • the interface is for general use by any application.
          • Stability - denotes when changes can be made to the interface that break compatibility.
            • Stable
              • Can evolve while retaining compatibility for minor release boundaries.; can break compatibility only at major release (ie. at m.0).
            • Evolving
              • Evolving, but can break compatibility at minor release (i.e. m.x)
            • Unstable
              • This usually makes sense for only private interfaces.
              • However one may call this out for a supposedly public interface to highlight that it should not be used as an interface; for public interfaces, labeling it as Not-an-interface is probably more appropriate than "unstable".
                • Examples of publically visible interfaces that are unstable (ie not-an-interface): GUI, CLIs whose output format will change
            • Deprecated - should not be used, will be removed in the future.

          FAQ

          1. What is the harm in applications using a private interface that is stable? How is it different than a public stable interface?
            While a private interface marked as stable is targeted to change only at major releases, it may break at other times if the providers of that interface are willing to changes the internal users of that interface. Further, a public stable interface is less likely to break even at major releases (even though it is allowed to break compatibility) because the impact of the change is larger. If you use a private interface (regardless of its stability) you run the risk of incompatibility.
          2. Why bother declaring the stability of a private interface?
            • To communicate the intent to its internal users.
            • To provide guidelines to developers of the interface
            • The stability may capture other internal properties of the system
              • e.g In HDFS, NN-DN protocol stability can help implement as rolling upgrades
              • e.g. In HDFS, FSImage stabilty can help provide more flexible roll backs.
          3. How will the classification be recorded for hadoop APIs?
            • Each interface or class will have the scope and stability recorded using javadoc tags, annotation, or some other mechanim. What ever mechanism we choose, the classification must be visisble on the genrated java doc.
            • APIs of private scope will not be part of the "public javadoc generated by ant (ie by the ant target "javadoc"); they will only be generated for the developer javadoc (generated by ant target "javadoc-dev")
            • One can derive the scope of java classes and java interfaces by the scope of the package in which they are contained. Hence it is useful to declare the scope of each java package as public or private (along with the private scope variations).

          Proposed Classification for Hadoop Interfaces

          • Scope Public
            • Stable
              • FileSystem, MapReduce, Config, CLI (inlcuding output), parts of Mapred.lib, Job Logs API, instrumentation metrics. Audit logs
            • Evolving
              • TFile, parts of Mapred.lib, some instrumentation metrics, jmx interface (till it becomes stable),
              • Job logs and job history ( Some tools, scripts and chukwa use this to analyze job processing)
            • Not An interface
              • Web GUI
          • Scope Private
            • Limited-Private Evolving
              • RPC, Metrics (HDFS-MapReduce Private) - once stable, we can consider making these public-stable.
            • Project-Private Stable
              • Intra-HDFS and MR protocols (facilitates rolling upgrades down the road)
              • FSImage
                • Note this will enable old versions of HDFS to read newer fsImage and hence enable more flexible roll backs.
                • Q. Should this be Project-Private Evolving instead?
                • Regardless of the stability of FSImage, new versions of HDFS have to be able to transparently convert older versions and provide roll-back.
            • Project-Private Evolving
              • DFSClient (Q. should this be "project-private unstable"
            • Project-Private Unstable
              • System logs
              • All implementation classes and interfaces not otherwise classified are considered to be project-private stable.
          Show
          Sanjay Radia added a comment - - edited A proposal: Interface Taxonomy - Scope & Stability Classification The interface taxonomy classification provided here is for guidance to developers and users of interfaces. The classification guides a developer to declare the scope (or targeted audience or users) of an interface and also its stability. Benefits to the user of an interface : Knows which interfaces to use or not use and their stability. Benefits to the developer : to prevent accidental changes of interfaces and hence accidental impact on users or other components or system. This is particularly useful in large systems with many developers who may not all have a shared state/history of the project. This classification was derived from a taxonomy used inside Yahoo and from the OpenSolaris taxonomy ( http://www.opensolaris.org/os/community/arc/policies/interface-taxonomy/#Advice ) Interface have two main attributes: Scope and Stability Scope - denotes the potential customers of the interface . For example many interfaces are merely internal or private interfaces of the implementation while others are public or external interfaces that applications or clients are expected to use. In posix, libc is an is an external or public interface, while large parts of the kernel are internal or private interfaces. In addition, some interfaces are targeted to some specific other subsystems. Identifying the scope helps define the customers or users of the interfaces and helps define the impact of breaking an interface. For example we may be willing to break the comaptibility of an interface whose scope is a small number of specific subsystems. One the other hand, one is unlikely to break a protocol interfaces that millions of internet users depend on. The following are useful scopes in order of increasing/wider visibility project-private the interface is for internal use within the project and should not be used by applications. It is subject to change at anytime without notice. Most interfaces of a project are project private. limited-private the interface is used by a specified set of projects or systems (typically closely related projects). Other projects or systems should not use the interface. Changes to the interface will be communicated/negotiated with the specified projects. For example, in the hadoop project, some interfaces are hdfs-mapReduce-private in that they are private to the hdfs and mapReduce projects. company-private ( This not applicable to opensource projects such as Hadoop. It is mentioned here for completeness.) the interface can use used by other projects within a company. public the interface is for general use by any application. Stability - denotes when changes can be made to the interface that break compatibility . Stable Can evolve while retaining compatibility for minor release boundaries.; can break compatibility only at major release (ie. at m.0). Evolving Evolving, but can break compatibility at minor release (i.e. m.x) Unstable This usually makes sense for only private interfaces. However one may call this out for a supposedly public interface to highlight that it should not be used as an interface; for public interfaces, labeling it as Not-an-interface is probably more appropriate than "unstable". Examples of publically visible interfaces that are unstable (ie not-an-interface): GUI, CLIs whose output format will change Deprecated - should not be used, will be removed in the future. FAQ What is the harm in applications using a private interface that is stable? How is it different than a public stable interface? While a private interface marked as stable is targeted to change only at major releases, it may break at other times if the providers of that interface are willing to changes the internal users of that interface. Further, a public stable interface is less likely to break even at major releases (even though it is allowed to break compatibility) because the impact of the change is larger. If you use a private interface (regardless of its stability) you run the risk of incompatibility . Why bother declaring the stability of a private interface? To communicate the intent to its internal users. To provide guidelines to developers of the interface The stability may capture other internal properties of the system e.g In HDFS, NN-DN protocol stability can help implement as rolling upgrades e.g. In HDFS, FSImage stabilty can help provide more flexible roll backs. How will the classification be recorded for hadoop APIs? Each interface or class will have the scope and stability recorded using javadoc tags, annotation, or some other mechanim. What ever mechanism we choose, the classification must be visisble on the genrated java doc. APIs of private scope will not be part of the "public javadoc generated by ant (ie by the ant target "javadoc"); they will only be generated for the developer javadoc (generated by ant target "javadoc-dev") One can derive the scope of java classes and java interfaces by the scope of the package in which they are contained. Hence it is useful to declare the scope of each java package as public or private (along with the private scope variations). Proposed Classification for Hadoop Interfaces Scope Public Stable FileSystem, MapReduce, Config, CLI (inlcuding output), parts of Mapred.lib, Job Logs API, instrumentation metrics. Audit logs Evolving TFile, parts of Mapred.lib, some instrumentation metrics, jmx interface (till it becomes stable), Job logs and job history ( Some tools, scripts and chukwa use this to analyze job processing) Not An interface Web GUI Scope Private Limited-Private Evolving RPC, Metrics (HDFS-MapReduce Private) - once stable, we can consider making these public-stable. Project-Private Stable Intra-HDFS and MR protocols (facilitates rolling upgrades down the road) FSImage Note this will enable old versions of HDFS to read newer fsImage and hence enable more flexible roll backs. Q. Should this be Project-Private Evolving instead? Regardless of the stability of FSImage, new versions of HDFS have to be able to transparently convert older versions and provide roll-back. Project-Private Evolving DFSClient (Q. should this be "project-private unstable" Project-Private Unstable System logs All implementation classes and interfaces not otherwise classified are considered to be project-private stable.
          Hide
          Doug Cutting added a comment -

          This sounds fine.

          The implication on documentation should be something like:

          • Public/stable should be documented.
          • Public/evolving should documented, but must be well-marked as evolving.
          • Private, limited or otherwise, should not be included in public documentation, but only in internal documents.

          The marking for evolving could be as simple as adding "Evolving:" to the beginning of javadoc strings. Or perhaps we can add an annotation and doclet which adds "Evolving" in bold, as is done for deprecations. If evolving and/or deprecated features have Forrest documentation, then that too should include a marking.

          We'd also need to add end-user documentation of this policy, including exactly what "Evolving" and "Deprecated" mean. This should not include any discussion of the private classifications, only our compatibility promises for public stuff, and that anything undocumented has no compatibility promises.

          Does this make sense?

          Show
          Doug Cutting added a comment - This sounds fine. The implication on documentation should be something like: Public/stable should be documented. Public/evolving should documented, but must be well-marked as evolving. Private, limited or otherwise, should not be included in public documentation, but only in internal documents. The marking for evolving could be as simple as adding "Evolving:" to the beginning of javadoc strings. Or perhaps we can add an annotation and doclet which adds "Evolving" in bold , as is done for deprecations. If evolving and/or deprecated features have Forrest documentation, then that too should include a marking. We'd also need to add end-user documentation of this policy, including exactly what "Evolving" and "Deprecated" mean. This should not include any discussion of the private classifications, only our compatibility promises for public stuff, and that anything undocumented has no compatibility promises. Does this make sense?
          Hide
          steve_l added a comment -

          I am +1 in the different scopes, and in favour of using Java 5 annotations to mark this up, rather than relying purely on package scoping to hide things. In the open source world, there's nothing to stop anyone putting their classes in your package to get at internal stuff, leaving you to field the support calls with your product name in all the stack traces.

          I'd suggest that we have an @Stable marker to all interfaces that are considered stable; only things that are marked as stable are considered stable, everything else is inherently unstable/evolving

          One thing I would change, the Web GUI is an interface. if you can access it programmatically as fsck does then it has stability issues too.

          Show
          steve_l added a comment - I am +1 in the different scopes, and in favour of using Java 5 annotations to mark this up, rather than relying purely on package scoping to hide things. In the open source world, there's nothing to stop anyone putting their classes in your package to get at internal stuff, leaving you to field the support calls with your product name in all the stack traces. I'd suggest that we have an @Stable marker to all interfaces that are considered stable; only things that are marked as stable are considered stable, everything else is inherently unstable/evolving One thing I would change, the Web GUI is an interface. if you can access it programmatically as fsck does then it has stability issues too.
          Hide
          Doug Cutting added a comment -

          > only things that are marked as stable are considered stable, everything else is inherently unstable/evolving

          That's easier for developers to maintain, but I think more confusing for users, who might reasonably assume that documented features will be supported going forward unless marked otherwise. Documentation should be maintainable, but not at the expense of clear description. So I still prefer marking APIs that are unstable and leaving stable APIs as the unmarked case. We can use tools like jdiff to help us audit releases for new features that ought to be marked as unstable.

          Show
          Doug Cutting added a comment - > only things that are marked as stable are considered stable, everything else is inherently unstable/evolving That's easier for developers to maintain, but I think more confusing for users, who might reasonably assume that documented features will be supported going forward unless marked otherwise. Documentation should be maintainable, but not at the expense of clear description. So I still prefer marking APIs that are unstable and leaving stable APIs as the unmarked case. We can use tools like jdiff to help us audit releases for new features that ought to be marked as unstable.
          Hide
          Jakob Homan added a comment -

          Annotations have the advantage that they can be included in the javadoc, providing a resource for developers.

          To include an annotation in the javadoc, annotate the annotation with @documented and a runtime retention policy (this policy is required):

          // Example annotation that is included in javadoc
          @Retention(RetentionPolicy.RUNTIME)
          @Documented
          public @interface PublicScope {
          	public enum Scope {STABLE, EVOLVING};
          	
          	Scope scope() default Scope.STABLE;
          }
          

          and then annotate the class normally:

          @PublicScope(scope=PublicScope.Scope.EVOLVING)
          public class Tardis {
          	private int foo;
          

          The generated Javadoc then includes the annotation, right below the direct known subclasses, and is hyperlinked to the annotation definition, which can include clarification of what the different values mean.
          The actual annotating is pretty wordy (and my example wording is poor), so it'll be good to pick out a reasonable naming system.

          Another nice Annotation feature is the ability to specify its target: package, class, field, etc.

          Show
          Jakob Homan added a comment - Annotations have the advantage that they can be included in the javadoc, providing a resource for developers. To include an annotation in the javadoc, annotate the annotation with @documented and a runtime retention policy (this policy is required): // Example annotation that is included in javadoc @Retention(RetentionPolicy.RUNTIME) @Documented public @interface PublicScope { public enum Scope {STABLE, EVOLVING}; Scope scope() default Scope.STABLE; } and then annotate the class normally: @PublicScope(scope=PublicScope.Scope.EVOLVING) public class Tardis { private int foo; The generated Javadoc then includes the annotation, right below the direct known subclasses, and is hyperlinked to the annotation definition, which can include clarification of what the different values mean. The actual annotating is pretty wordy (and my example wording is poor), so it'll be good to pick out a reasonable naming system. Another nice Annotation feature is the ability to specify its target: package, class, field, etc.
          Hide
          Doug Cutting added a comment -

          > Annotations have the advantage that they can be included in the javadoc, providing a resource for developers.

          Annotations and/or taglets may be a great tools to help implement this. Perhaps we can limit generated documentation to only include things whose annotated scope is public. If we cannot, then public declarations that appear in the documentation that are not explicitly noted as evolving must be considered stable, since that's what users might reasonably assume.

          Show
          Doug Cutting added a comment - > Annotations have the advantage that they can be included in the javadoc, providing a resource for developers. Annotations and/or taglets may be a great tools to help implement this. Perhaps we can limit generated documentation to only include things whose annotated scope is public. If we cannot, then public declarations that appear in the documentation that are not explicitly noted as evolving must be considered stable, since that's what users might reasonably assume.
          Hide
          Jakob Homan added a comment -

          If we cannot, then public declarations that appear in the documentation that are not explicitly noted as evolving must be considered stable, since that's what users might reasonably assume.

          As long as we supply the annotation with a default value (such as stable) and ensure that each class and package that should have an annotation do have an annotation, this should happen pretty much automatically.

          Show
          Jakob Homan added a comment - If we cannot, then public declarations that appear in the documentation that are not explicitly noted as evolving must be considered stable, since that's what users might reasonably assume. As long as we supply the annotation with a default value (such as stable) and ensure that each class and package that should have an annotation do have an annotation, this should happen pretty much automatically.
          Hide
          Jakob Homan added a comment -

          Had an offline talk with Sanjay and went over these issues.

          One item that came up was the terminology. Scope may be a confusing term to use as it overloads a standard comp. sci term and may not be clear to new people approach our use of it. Since the goal of the scope classification is to indicate who should use the class, would audience be a more appropriate choice? Similarly with stability, it would probably be good to always refer to that as APIStability, lest our code gets tagged with "Unstable" all over and frightens away anyone looking it over to consider using Hadoop in production. Without proper context, stability and scope may not convey the appropriate meanings.

          I've been playing around with using annotations approach and in general they work nicely. Annotations such as (rough versions without proper documentation):

          package annotations;
          
          import java.lang.annotation.Documented;
          import java.lang.annotation.Inherited;
          import java.lang.annotation.Retention;
          import java.lang.annotation.RetentionPolicy;
          
          @Documented
          @Retention(RetentionPolicy.CLASS)
          @Inherited
          public @interface Audience {
          
            public enum AUDIENCE { 
             /**
             * the interface is for general use by any application.
             */
            Public,
            /**
             * the interface is used by a specified set of projects or systems (typically closely related projects).  
             */
            LimitedPrivate, 
            /**
             * Only for use within application, not for use by general developers
             */
            ProjectPrivate }
            AUDIENCE value();
          }

          and

          package annotations;
          
          import java.lang.annotation.Documented;
          import java.lang.annotation.Inherited;
          
          @Inherited
          @Documented
          public @interface APIStability {
            public enum Stability { Stable, Evolving, Unstable };
            
            Stability value();
          }

          allows for reasonable, very clear documentation of code:

          package eccleston;
          
          import static annotations.Audience.AUDIENCE.Public;
          import annotations.APIStability;
          import static annotations.APIStability.Stability.Evolving;
          import static annotations.APIStability.Stability.Stable;
          import annotations.Audience;
          
          /**
           * Time and Relative Dimensions in Space
           */
          @Audience(Public)
          @APIStability(Stable)
          public class TARDIS {
            public void materialize() { System.out.println("whoop! WHOop! WHOOP!"); }
            public void dematerialize() { System.out.println("WHOOP! WHOop! whoop!"); }
            
            @APIStability(Evolving) // We can mark individual methods as necessary
            public void blendIn() { System.out.println("Now I'm a police box! That'll work."); }
          }

          By using the @Documented meta-annotation, the audience and api-stability annotation appear in the javadoc. This is good:

          eccleston 
          Class Tardis
          
          java.lang.Object
            eccleston.TARDIS
          
          Direct Known Subclasses:
          Type2TARDIS
          
          @Audience(value=Public)
          @APIStability(value=Stable)
          
          public class TARDIS
          extends java.lang.Object
          Time and Relative Dimensions in Space
          

          Unfortunately, there are two issues:

          • Even though it's @Documented and @Inherited, the documentation doesn't get inherited, so even though subclasses of this type do have the @APIStability annotation (you can query for it using reflection), it doesn't show up in their javadoc. This makes it difficult to just mark a class as such-and-such stability or audience and assume that takes care of it down the line. This really seems wrong; certainly not the behavior I expected. I played with writing a doclet to do something like this, but the doclet interface is rather cheesy and doesn't allow you to modify the code information, just read it.
          • Eclipse doesn't seem to show the annotations in its javadoc view, which makes it more difficult for coders who rely on these facilities for first-line identification of code and classes they should and shouldn't use.

          I had hoped we could just use package-info.java to mark a class' audience and api-stability and have it propagate all the way down the line in the documentation, but that doesn't work. In addition to issue one above, the @Inherited meta-annotation can only be applied to classes. As a partial remedy to these issues, we could write a documentation tool to check if new code is using any unstable/private apis and emit a warning, as part of test-patch or such.

          Taglets show up in the javadoc quite boldly, but I can't find a way to propagate them; we'd have to explicitly mark each section we wanted to tag (or write a tool to do it).

          Since we can't just mark a top level class and have it show up all down the line, having default values may not be a good idea after all. It's probably best to explicitly set the audience/api-stability value when the annotation is applied.

          So, the annotation approach may work but it's not as in-your-face as I was hoping for helping coders to immediately see which resources they should or shouldn't use in their development environments.

          Show
          Jakob Homan added a comment - Had an offline talk with Sanjay and went over these issues. One item that came up was the terminology. Scope may be a confusing term to use as it overloads a standard comp. sci term and may not be clear to new people approach our use of it. Since the goal of the scope classification is to indicate who should use the class, would audience be a more appropriate choice? Similarly with stability, it would probably be good to always refer to that as APIStability, lest our code gets tagged with "Unstable" all over and frightens away anyone looking it over to consider using Hadoop in production. Without proper context, stability and scope may not convey the appropriate meanings. I've been playing around with using annotations approach and in general they work nicely. Annotations such as (rough versions without proper documentation): package annotations; import java.lang.annotation.Documented; import java.lang.annotation.Inherited; import java.lang.annotation.Retention; import java.lang.annotation.RetentionPolicy; @Documented @Retention(RetentionPolicy.CLASS) @Inherited public @ interface Audience { public enum AUDIENCE { /** * the interface is for general use by any application. */ Public, /** * the interface is used by a specified set of projects or systems (typically closely related projects). */ LimitedPrivate, /** * Only for use within application, not for use by general developers */ ProjectPrivate } AUDIENCE value(); } and package annotations; import java.lang.annotation.Documented; import java.lang.annotation.Inherited; @Inherited @Documented public @ interface APIStability { public enum Stability { Stable, Evolving, Unstable }; Stability value(); } allows for reasonable, very clear documentation of code: package eccleston; import static annotations.Audience.AUDIENCE.Public; import annotations.APIStability; import static annotations.APIStability.Stability.Evolving; import static annotations.APIStability.Stability.Stable; import annotations.Audience; /** * Time and Relative Dimensions in Space */ @Audience(Public) @APIStability(Stable) public class TARDIS { public void materialize() { System .out.println( "whoop! WHOop! WHOOP!" ); } public void dematerialize() { System .out.println( "WHOOP! WHOop! whoop!" ); } @APIStability(Evolving) // We can mark individual methods as necessary public void blendIn() { System .out.println( "Now I'm a police box! That'll work." ); } } By using the @Documented meta-annotation, the audience and api-stability annotation appear in the javadoc. This is good: eccleston Class Tardis java.lang.Object eccleston.TARDIS Direct Known Subclasses: Type2TARDIS @Audience(value=Public) @APIStability(value=Stable) public class TARDIS extends java.lang.Object Time and Relative Dimensions in Space Unfortunately, there are two issues: Even though it's @Documented and @Inherited, the documentation doesn't get inherited, so even though subclasses of this type do have the @APIStability annotation (you can query for it using reflection), it doesn't show up in their javadoc. This makes it difficult to just mark a class as such-and-such stability or audience and assume that takes care of it down the line. This really seems wrong; certainly not the behavior I expected. I played with writing a doclet to do something like this, but the doclet interface is rather cheesy and doesn't allow you to modify the code information, just read it. Eclipse doesn't seem to show the annotations in its javadoc view, which makes it more difficult for coders who rely on these facilities for first-line identification of code and classes they should and shouldn't use. I had hoped we could just use package-info.java to mark a class' audience and api-stability and have it propagate all the way down the line in the documentation, but that doesn't work. In addition to issue one above, the @Inherited meta-annotation can only be applied to classes. As a partial remedy to these issues, we could write a documentation tool to check if new code is using any unstable/private apis and emit a warning, as part of test-patch or such. Taglets show up in the javadoc quite boldly, but I can't find a way to propagate them; we'd have to explicitly mark each section we wanted to tag (or write a tool to do it). Since we can't just mark a top level class and have it show up all down the line, having default values may not be a good idea after all. It's probably best to explicitly set the audience/api-stability value when the annotation is applied. So, the annotation approach may work but it's not as in-your-face as I was hoping for helping coders to immediately see which resources they should or shouldn't use in their development environments.
          Hide
          Jiaqi Tan added a comment -

          I would like to propose that a particular subset of logging statements currently in Hadoop be made Public-Evolving, and eventually Public-Stable, for the purpose of lightweight tracing for studying Hadoop behavior. This has various applications, such as for diagnosing failures and visualizing behavior, and using log statements that are already being currently generated gives us a nice, low-overhead way for doing this.

          We have been using Hadoop logs for diagnosis for some time now, and while the log statements have been relatively stable, we have had to make minor parser modifications between Hadoop versions, and there is currently no guarantee that the statements we find useful will be here to stay. It would be nice if we can articulate some of the existing log statements that help us in diagnosis, and if those statements can be made part of a public interface. Then, parsing and analysis tools can be produced based on this stable interface.

          This also relates to Chukwa, as far as us implementing our log-based analysis and diagnosis tools as part of the Chukwa project.

          Show
          Jiaqi Tan added a comment - I would like to propose that a particular subset of logging statements currently in Hadoop be made Public-Evolving, and eventually Public-Stable, for the purpose of lightweight tracing for studying Hadoop behavior. This has various applications, such as for diagnosing failures and visualizing behavior, and using log statements that are already being currently generated gives us a nice, low-overhead way for doing this. We have been using Hadoop logs for diagnosis for some time now, and while the log statements have been relatively stable, we have had to make minor parser modifications between Hadoop versions, and there is currently no guarantee that the statements we find useful will be here to stay. It would be nice if we can articulate some of the existing log statements that help us in diagnosis, and if those statements can be made part of a public interface. Then, parsing and analysis tools can be produced based on this stable interface. This also relates to Chukwa, as far as us implementing our log-based analysis and diagnosis tools as part of the Chukwa project.
          Hide
          steve_l added a comment -

          @Jiaqi
          -that's a a separate problem, which is more logs ought to be designed for machine post-mortems, and hence marked as "please don't change". Which is something for a new bugrep. (and yes, I agree with you, stable log output is good)

          Show
          steve_l added a comment - @Jiaqi -that's a a separate problem, which is more logs ought to be designed for machine post-mortems, and hence marked as "please don't change". Which is something for a new bugrep. (and yes, I agree with you, stable log output is good)
          Hide
          Jiaqi Tan added a comment -

          @Steve
          I actually commented about this issue of logs here because logs are mentioned here as part of the interface definition, so I was hoping to get "debugging+profiling log statements" marked as part of the public interface that will eventually become stable. Sure, the actual statements to be marked should be part of a separate JIRA, but I wanted to get this defined in the interface as well.

          Show
          Jiaqi Tan added a comment - @Steve I actually commented about this issue of logs here because logs are mentioned here as part of the interface definition, so I was hoping to get "debugging+profiling log statements" marked as part of the public interface that will eventually become stable. Sure, the actual statements to be marked should be part of a separate JIRA, but I wanted to get this defined in the interface as well.
          Hide
          Sanjay Radia added a comment -

          Most of org.apache.hadoop.security should be

          Limited-Private Evolving

          • Security (HDFS-MapReduce Private)

          We are considering moving to using Jaas more extensively; if we do that some of these interfaces (esp UserGroupInformation) will change.

          Show
          Sanjay Radia added a comment - Most of org.apache.hadoop.security should be Limited-Private Evolving Security (HDFS-MapReduce Private) We are considering moving to using Jaas more extensively; if we do that some of these interfaces (esp UserGroupInformation) will change.
          Hide
          Jakob Homan added a comment -

          Did some work creating a test implementation of the above. Attached as a patch as a prototype of what such a system could look like, along with a screenshot of how the annotations end up looking in the Javadoc. Just a proof of concept, don't ding me over file locations and such.

          It turned out to be better to have separate annotations for each of the possible labels: IDE integration was much better and the values show up more clearly in the Javadoc and IDE integreated Javadoc viewer.

          The biggest problem is that with the project split, these annotations are not carried over and we still have the fact that @Inherited and @Documented don't actually work correctly. So, for instance, even though I annotated FileSystem, none of that carries over to DistributedFileSystem, since it's in a different jar. Therefore, there would be a significant labeling initial and maintenance effort to make sure everything was marked as it should be. Anyway, this is what the above scheme could look like if implemented.

          Show
          Jakob Homan added a comment - Did some work creating a test implementation of the above. Attached as a patch as a prototype of what such a system could look like, along with a screenshot of how the annotations end up looking in the Javadoc. Just a proof of concept, don't ding me over file locations and such. It turned out to be better to have separate annotations for each of the possible labels: IDE integration was much better and the values show up more clearly in the Javadoc and IDE integreated Javadoc viewer. The biggest problem is that with the project split, these annotations are not carried over and we still have the fact that @Inherited and @Documented don't actually work correctly. So, for instance, even though I annotated FileSystem, none of that carries over to DistributedFileSystem, since it's in a different jar. Therefore, there would be a significant labeling initial and maintenance effort to make sure everything was marked as it should be. Anyway, this is what the above scheme could look like if implemented.
          Hide
          Sanjay Radia added a comment -

          Doug said:
          >The implication on documentation should be something like:
          > * Public/stable should be documented.
          > * Public/evolving should documented, but must be well-marked as evolving.
          > * Private, limited or otherwise, should not be included in public documentation, but only in internal documents.

          Agreed, However private/stable and private/evolving also need to be document even though they
          are not visible in the public documentation, only in the developer documentation.
          Developers need to understand which private/internal interfaces are stable since their stability needs to be maintained across releases, For example the client protocols need to be stable even though they are
          private for wire compatibility.

          Show
          Sanjay Radia added a comment - Doug said: >The implication on documentation should be something like: > * Public/stable should be documented. > * Public/evolving should documented, but must be well-marked as evolving. > * Private, limited or otherwise, should not be included in public documentation, but only in internal documents. Agreed, However private/stable and private/evolving also need to be document even though they are not visible in the public documentation, only in the developer documentation. Developers need to understand which private/internal interfaces are stable since their stability needs to be maintained across releases, For example the client protocols need to be stable even though they are private for wire compatibility.
          Hide
          Bill Zeller added a comment -

          If all methods of a subclass are not explicitly documented, I worry that it will become difficult to program against the code directly (without looking at generated javadoc). For example, I imagine needing to trace a method's ancestors or class hierarchy to determine its classification. Would it be too difficult or bloated to add public/stable (etc) to every applicable method?

          Show
          Bill Zeller added a comment - If all methods of a subclass are not explicitly documented, I worry that it will become difficult to program against the code directly (without looking at generated javadoc). For example, I imagine needing to trace a method's ancestors or class hierarchy to determine its classification. Would it be too difficult or bloated to add public/stable (etc) to every applicable method?
          Hide
          Tsz Wo Nicholas Sze added a comment -

          > It turned out to be better to have separate annotations for each of the possible labels: IDE integration was much better and the values show up more clearly in the Javadoc and IDE integreated Javadoc viewer.
          Hi Jakob, how are the separated annotations better than a single annotation with a parameter?

          c5073_20090825.patch: I tried a little example. It seems that a single annotation works well.
          c5073_20090825.png: the javadoc screen shot.

          Show
          Tsz Wo Nicholas Sze added a comment - > It turned out to be better to have separate annotations for each of the possible labels: IDE integration was much better and the values show up more clearly in the Javadoc and IDE integreated Javadoc viewer. Hi Jakob, how are the separated annotations better than a single annotation with a parameter? c5073_20090825.patch: I tried a little example. It seems that a single annotation works well. c5073_20090825.png: the javadoc screen shot.
          Hide
          Jakob Homan added a comment -

          Hi Jakob, how are the separated annotations better than a single annotation with a parameter?

          With a single annotation and enums, the text added to an element looks like this:

          @Audience(Audience.Type.PUBLIC)
          public class HelloWorld {
          

          compared to separate annotations, which end up looking like:

          @Audience_Public
          public class HelloWorld {
          

          I consider the latter code easier to grok and visually cleaner than the former. The word Audience is not repeated, the unnecessary word Type is not included, the odd capitalization of the Public enum is not included. Overall, the appearance is that of information meant for humans rather than computers.

          It's possible to create a representation similar to the first example with a single annotation, as with the above TARDIS class example, or continuing with the current example:

          @Audience(PUBLIC)
          public class HelloWorld {
          

          However, this requires static imports of the Type enum, which my Eclipse (Ganymede) had trouble doing in a friendly manner. I ended up having to manually insert them into each file and even then ran into weird auto-complete behavior. Eclipse seems to not have a good understanding of how to deal with static imports. This is what I meant by IDE integration. Also, it was faster to type @A <ctrl-space>P<ctrl-space) and be done with it than it was to deal with separate enums. Keystrokes add up.

          As far as the generated javadoc, I think it's a wash. Both approaches are equally readable.

          All that being said, this is a minor point and I'm not going to spend many of my lobbying points on it.

          Show
          Jakob Homan added a comment - Hi Jakob, how are the separated annotations better than a single annotation with a parameter? With a single annotation and enums, the text added to an element looks like this: @Audience(Audience.Type.PUBLIC) public class HelloWorld { compared to separate annotations, which end up looking like: @Audience_Public public class HelloWorld { I consider the latter code easier to grok and visually cleaner than the former. The word Audience is not repeated, the unnecessary word Type is not included, the odd capitalization of the Public enum is not included. Overall, the appearance is that of information meant for humans rather than computers. It's possible to create a representation similar to the first example with a single annotation, as with the above TARDIS class example, or continuing with the current example: @Audience(PUBLIC) public class HelloWorld { However, this requires static imports of the Type enum, which my Eclipse (Ganymede) had trouble doing in a friendly manner. I ended up having to manually insert them into each file and even then ran into weird auto-complete behavior. Eclipse seems to not have a good understanding of how to deal with static imports. This is what I meant by IDE integration. Also, it was faster to type @A <ctrl-space>P<ctrl-space) and be done with it than it was to deal with separate enums. Keystrokes add up. As far as the generated javadoc, I think it's a wash. Both approaches are equally readable. All that being said, this is a minor point and I'm not going to spend many of my lobbying points on it.
          Hide
          Jakob Homan added a comment -

          @Bill
          Absolutely, it may be a problem. That's why I consider the implementation of @Documented and @Inherited that doesn't actually document the inherited annotations broken. I don't know what the right answer is to this: if we require every method to be thusly documented, maintaining this metainfo becomes a burden and will be ignored by developers. If we don't, end users may end up in the situation Bill describes. At the very least, we should have tooling to either apply/update the documentation, or to warn users if they are using APIs that they probably shouldn't be. I don't believe our jars even ship with the javadoc included.

          Show
          Jakob Homan added a comment - @Bill Absolutely, it may be a problem. That's why I consider the implementation of @Documented and @Inherited that doesn't actually document the inherited annotations broken. I don't know what the right answer is to this: if we require every method to be thusly documented, maintaining this metainfo becomes a burden and will be ignored by developers. If we don't, end users may end up in the situation Bill describes. At the very least, we should have tooling to either apply/update the documentation, or to warn users if they are using APIs that they probably shouldn't be. I don't believe our jars even ship with the javadoc included.
          Hide
          Suresh Srinivas added a comment -

          Another approach to take is to have nested interface as follows:

          public @interface Audience {
            public @interface Public {
            }
            
            public @interface Private {
            }
          }
          

          Code can be annotated then by using @Audience.Public, @Audience.Private

          This nicely groups all the annotations together under Audience, without needing to use the Audience_ prefix for grouping them together.

          Show
          Suresh Srinivas added a comment - Another approach to take is to have nested interface as follows: public @interface Audience { public @interface Public { } public @interface Private { } } Code can be annotated then by using @Audience.Public, @Audience.Private This nicely groups all the annotations together under Audience, without needing to use the Audience_ prefix for grouping them together.
          Hide
          Jakob Homan added a comment -

          Another approach to take is to have nested interface as follows:

          +1. Looks like the best of both worlds. Let's go with that.

          Show
          Jakob Homan added a comment - Another approach to take is to have nested interface as follows: +1. Looks like the best of both worlds. Let's go with that.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          > Another approach to take is to have nested interface as follows ...
          This is very creative. However, it does not seem that the nested annotation is designed for this purpose.

          In java.lang.annotation, this is an @Retention annotation, which is very similar to our @Audience annotation. @Retention is declared with a RetentionPolicy value, where RetentionPolicy is an enum. We should do the same.

          Show
          Tsz Wo Nicholas Sze added a comment - > Another approach to take is to have nested interface as follows ... This is very creative. However, it does not seem that the nested annotation is designed for this purpose. In java.lang.annotation, this is an @Retention annotation, which is very similar to our @Audience annotation. @Retention is declared with a RetentionPolicy value, where RetentionPolicy is an enum. We should do the same.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          Typo: "this is an" => "there is an"

          Show
          Tsz Wo Nicholas Sze added a comment - Typo: "this is an" => "there is an"
          Hide
          Owen O'Malley added a comment -

          While I agree with Nicholas that the value approach is more correct. I think that Suresh's approach leads to a more readable annotation. It also let's us do things like:

          public @interface Audience {
            public @interface Public {}
            public @interface LimitedPrivate {
              String[] target;
            }
            public @interface Private {}
          }
          

          where limited private things have to list their targets.

          Show
          Owen O'Malley added a comment - While I agree with Nicholas that the value approach is more correct. I think that Suresh's approach leads to a more readable annotation. It also let's us do things like: public @interface Audience { public @interface Public {} public @interface LimitedPrivate { String[] target; } public @interface Private {} } where limited private things have to list their targets.
          Hide
          Jakob Homan added a comment -

          This works fine. The javadoc is as readable as any of the others (attached image).

          @Documented 
          @Inherited
          public @interface Audience {
            @Documented
            public @interface Public {}
            
            @Documented
            public @interface LimitedPrivate {
              String target();
            }
            
            @Documented
            public @interface Private {}
          }
          I don't believe this is an abuse of Annotations, rather it's a creative use of Java's rather cumbersome implementation.  +1
          
          Show
          Jakob Homan added a comment - This works fine. The javadoc is as readable as any of the others (attached image). @Documented @Inherited public @ interface Audience { @Documented public @ interface Public {} @Documented public @ interface LimitedPrivate { String target(); } @Documented public @ interface Private {} } I don't believe this is an abuse of Annotations, rather it's a creative use of Java's rather cumbersome implementation. +1
          Hide
          Owen O'Malley added a comment -

          I think the target is better expressed as a String[] rather than a single string.

          Show
          Owen O'Malley added a comment - I think the target is better expressed as a String[] rather than a single string.
          Hide
          Jakob Homan added a comment -

          I think the target is better expressed as a String[] rather than a single string.

          That actually ends up working quite nicely because, surprisingly, Java takes array literals for this:

          @Audience.Public
          public class SampleClass {
            @Audience.LimitedPrivate(target = {"Pig", "Owl", "Zebra"})
            public class Foo {}
            
            @Audience.Private
            public class Bar {}
          }

          Again, the Javadoc is good (see attachment).

          Show
          Jakob Homan added a comment - I think the target is better expressed as a String[] rather than a single string. That actually ends up working quite nicely because, surprisingly, Java takes array literals for this: @Audience.Public public class SampleClass { @Audience.LimitedPrivate(target = { "Pig" , "Owl" , "Zebra" }) public class Foo {} @Audience.Private public class Bar {} } Again, the Javadoc is good (see attachment).
          Hide
          Tsz Wo Nicholas Sze added a comment -

          It is hard to get nested annotations correct. I still think that Public, LimitedPrivate and Private are values of Audience.

          If we are going to do nested annotations, I suggest to followings:

          • Audience indeed is an annotation container. So it should be a class with a private constructor but not itself an annotation. Otherwise, someone may use @Audience later.
          • We also have to repeat @Documented and @Inherited for each nested annotations.
          • From the doc, it says "In annotations with a single element, the element should be named value, ... ". So the "target" in LimitedPrivate should be renamed to "value".
          public final class Audience {
            private Audience() {}
          
            @Documented
            @Inherited
            public @interface Public {}
          
            @Documented
            @Inherited
            public @interface LimitedPrivate {
              String[] value();
            }
          
            @Documented
            @Inherited
            public @interface Private {}
          }
          
          Show
          Tsz Wo Nicholas Sze added a comment - It is hard to get nested annotations correct. I still think that Public, LimitedPrivate and Private are values of Audience. If we are going to do nested annotations, I suggest to followings: Audience indeed is an annotation container. So it should be a class with a private constructor but not itself an annotation. Otherwise, someone may use @Audience later. We also have to repeat @Documented and @Inherited for each nested annotations. From the doc , it says "In annotations with a single element, the element should be named value, ... ". So the "target" in LimitedPrivate should be renamed to "value". public final class Audience { private Audience() {} @Documented @Inherited public @ interface Public {} @Documented @Inherited public @ interface LimitedPrivate { String [] value(); } @Documented @Inherited public @ interface Private {} }
          Hide
          Jakob Homan added a comment -

          Yes, in the actual patch we would have to repeat Documented and Inherited on each. That doesn't seem to me to be a problem. My only concern with using value is that a term like target is more descriptive of what we're describing - who is the target of this LimitedPrivate section. But again, it's not a big point.
          As for using Audience as a class or annotation, I don't think class is the correct way to go, but I'm not going to argue the point any more.

          Show
          Jakob Homan added a comment - Yes, in the actual patch we would have to repeat Documented and Inherited on each. That doesn't seem to me to be a problem. My only concern with using value is that a term like target is more descriptive of what we're describing - who is the target of this LimitedPrivate section. But again, it's not a big point. As for using Audience as a class or annotation, I don't think class is the correct way to go, but I'm not going to argue the point any more.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          There is a problem of using multiple classes (either nested or non-nested classes). It allows multiple annotations for a single entry, e.g.

          @Audience.Public
          @Audience.Private
          public class HelloWorld {
          }
          

          It is even worst if the annotation is @Inherited because there is no way to remove the inherited annotations in the sub-classes. e.g.

          @Audience.Public
          public class Base {
          }
          
          @Audience.Private
          public class Sub extends Base {
          }
          

          Sub indeed is annotated with both @Audience.Public and @Audience.Private if the annotations are @Inherited and there is no way to remove @Audience.Public from Sub.

          Show
          Tsz Wo Nicholas Sze added a comment - There is a problem of using multiple classes (either nested or non-nested classes). It allows multiple annotations for a single entry, e.g. @Audience.Public @Audience.Private public class HelloWorld { } It is even worst if the annotation is @Inherited because there is no way to remove the inherited annotations in the sub-classes. e.g. @Audience.Public public class Base { } @Audience.Private public class Sub extends Base { } Sub indeed is annotated with both @Audience.Public and @Audience.Private if the annotations are @Inherited and there is no way to remove @Audience.Public from Sub.
          Hide
          Owen O'Malley added a comment -

          In my opinion, we either need to:
          1. Remove @Inherited
          2. Use the parameters to specify them.

          Clearly, Nicholas would like 2. Personally, I think we should do 1. Having the annotations flow down is problematic if it flows from public interfaces into the private implementations....

          Show
          Owen O'Malley added a comment - In my opinion, we either need to: 1. Remove @Inherited 2. Use the parameters to specify them. Clearly, Nicholas would like 2. Personally, I think we should do 1. Having the annotations flow down is problematic if it flows from public interfaces into the private implementations....
          Hide
          Tsz Wo Nicholas Sze added a comment -

          > Clearly, Nicholas would like 2. Personally, I think we should do 1. Having the annotations flow down is problematic if it flows from public interfaces into the private implementations....
          Yes, I prefer 2 but I am open to 1.

          BTW, should the value (or target) of LimitedPrivate be enum? Otherwise, someone may incorrectly annotate something with typos like the following:

          @Audience.LimitedPrivate({"Pi", "Owl", "Zebra"}) //Oops, it is supposed to be used by the Pig project,
                                                           //not the Pi programs.
          public class HelloWorld {
          }
          
          Show
          Tsz Wo Nicholas Sze added a comment - > Clearly, Nicholas would like 2. Personally, I think we should do 1. Having the annotations flow down is problematic if it flows from public interfaces into the private implementations.... Yes, I prefer 2 but I am open to 1. BTW, should the value (or target) of LimitedPrivate be enum? Otherwise, someone may incorrectly annotate something with typos like the following: @Audience.LimitedPrivate({ "Pi" , "Owl" , "Zebra" }) //Oops, it is supposed to be used by the Pig project, //not the Pi programs. public class HelloWorld { }
          Hide
          Sanjay Radia added a comment -

          +1 for jakob's last prposal with @inherited removed. A Audience.Public should not imply the same for its subclasses.
          Also I like Nicholas's enum idea for the LimitedPrivate.

          Thanks for driving this through the many iterations.

          Show
          Sanjay Radia added a comment - +1 for jakob's last prposal with @inherited removed. A Audience.Public should not imply the same for its subclasses. Also I like Nicholas's enum idea for the LimitedPrivate. Thanks for driving this through the many iterations.
          Hide
          Jakob Homan added a comment -

          Implemented the current state of agreement, incorporating some of Nicholas' comments.
          The current form ends up looking something like this:

          @Audience.LimitedPrivate({Project.HDFS, Project.AVRO})
          @ApiStability(Stability.Evolving)
          public abstract class Foo extends Configured implements Closeable {
          

          The ApiStability annotation seems to have flown under the radar in terms of debates of naming, etc. Using the Project enum puts us back to having to a rather bulky syntax, but at this point, I'd just as assume have people do static imports if they really want it to look clean.
          For initial project inclusion, all of the nine Hadoop subprojects are included, per an offline discussion with Owen. The annotations have been placed in a new package, Classification.

          Show
          Jakob Homan added a comment - Implemented the current state of agreement, incorporating some of Nicholas' comments. The current form ends up looking something like this: @Audience.LimitedPrivate({Project.HDFS, Project.AVRO}) @ApiStability(Stability.Evolving) public abstract class Foo extends Configured implements Closeable { The ApiStability annotation seems to have flown under the radar in terms of debates of naming, etc. Using the Project enum puts us back to having to a rather bulky syntax, but at this point, I'd just as assume have people do static imports if they really want it to look clean. For initial project inclusion, all of the nine Hadoop subprojects are included, per an offline discussion with Owen. The annotations have been placed in a new package, Classification.
          Hide
          Sanjay Radia added a comment -

          +1. Looks good.

          Show
          Sanjay Radia added a comment - +1. Looks good.
          Hide
          Owen O'Malley added a comment -

          I'd rather be more symmetric in our terminology. How about:

          @InterfaceAudience.LimitedPrivate({Project.HDFS, Project.AVRO})
          @InterfaceStability.Evolving
          public class Foo ...
          
          Show
          Owen O'Malley added a comment - I'd rather be more symmetric in our terminology. How about: @InterfaceAudience.LimitedPrivate({Project.HDFS, Project.AVRO}) @InterfaceStability.Evolving public class Foo ...
          Hide
          Sanjay Radia added a comment -

          +1 to Owen's suggestion.

          Show
          Sanjay Radia added a comment - +1 to Owen's suggestion.
          Hide
          Robert Chansler added a comment -

          +1 This has converged to a solid solution.

          Show
          Robert Chansler added a comment - +1 This has converged to a solid solution.
          Hide
          Jakob Homan added a comment -

          Updated per Owen's comments.

          Show
          Jakob Homan added a comment - Updated per Owen's comments.
          Hide
          Jakob Homan added a comment -

          No immediate comments, so marking patch available.

          Show
          Jakob Homan added a comment - No immediate comments, so marking patch available.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12418982/COMMON-5073.patch
          against trunk revision 812652.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/25/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/25/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/25/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/25/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12418982/COMMON-5073.patch against trunk revision 812652. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/25/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/25/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/25/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/25/console This message is automatically generated.
          Hide
          Suresh Srinivas added a comment -

          Could you please add more description to InterfaceStability, given that this is will be used for annotating the entire code base?

          Show
          Suresh Srinivas added a comment - Could you please add more description to InterfaceStability, given that this is will be used for annotating the entire code base?
          Hide
          Jakob Homan added a comment -

          Addressed Suresh's comment.

          Show
          Jakob Homan added a comment - Addressed Suresh's comment.
          Hide
          Suresh Srinivas added a comment -

          +1. Thanks for addressing the comment.

          Show
          Suresh Srinivas added a comment - +1. Thanks for addressing the comment.
          Hide
          Suresh Srinivas added a comment -

          I comment the change. Thanks Jakob.

          Show
          Suresh Srinivas added a comment - I comment the change. Thanks Jakob.
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #23 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/23/)
          . Add annotation mechanism for interface classification. Contributed by Jakob Homan.

          Show
          Hudson added a comment - Integrated in Hadoop-Common-trunk-Commit #23 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/23/ ) . Add annotation mechanism for interface classification. Contributed by Jakob Homan.
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk #90 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/90/)
          . Add annotation mechanism for interface classification. Contributed by Jakob Homan.

          Show
          Hudson added a comment - Integrated in Hadoop-Common-trunk #90 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/90/ ) . Add annotation mechanism for interface classification. Contributed by Jakob Homan.
          Hide
          Vinod Kumar Vavilapalli added a comment -

          Mapred/HDFS still cannot use this, as it doesn't look like the common jar after this patch is committed to the sub-projects. Any plans for committing the common jar to mapreduce/hdfs rightaway? Or wait till some other patch does this?

          Show
          Vinod Kumar Vavilapalli added a comment - Mapred/HDFS still cannot use this, as it doesn't look like the common jar after this patch is committed to the sub-projects. Any plans for committing the common jar to mapreduce/hdfs rightaway? Or wait till some other patch does this?
          Hide
          Sanjay Radia added a comment -

          Proposed Classification for Hadoop Interfaces (updated)

          • Scope Public
            • Deprecated
              • FileSystem, mapred
            • Stable
              • FileContext, MapReduce, Config, CLI (inlcuding output), Job Logs API, Audit logs
              • AbstractFileSystem (targeted for implementers of filesystems, not for application use)
              • parts of Mapred.lib (have we decided which parts?)
              • parts of instrumentation metrics (have we decided which parts?)
            • Evolving
              • TFile, jmx interface (till it becomes stable),
              • parts of Mapred.lib (have we decided which parts?)
              • parts of instrumentation metrics (have we decided which parts?)
              • Job logs and job history ( Some tools, scripts and chukwa use this to analyze job processing)
            • Not An interface
              • Web GUI
          • Scope Private
            • Limited-Private Evolving
              • RPC, Metrics API, (HDFS-MapReduce Private) - once stable, we can consider making these public-stable.PIU
              • Security APIs
            • Project-Private Stable
              • Intra-HDFS and MR protocols (facilitates rolling upgrades down the road)
              • Mapreduce's context object implementation class
              • FSImage
                • Note this will enable old versions of HDFS to read newer fsImage and hence enable more flexible roll backs.
                • Q. Should this be Project-Private Evolving instead?
                • Regardless of the stability of FSImage, new versions of HDFS have to be able to transparently convert older versions and provide roll-back.
            • Project-Private Evolving
              • DFSClient (Q. should this be "project-private unstable")
            • Project-Private Unstable
              • System logs
              • All implementation classes and interfaces not otherwise classified are considered to be project-private unstable.
          Show
          Sanjay Radia added a comment - Proposed Classification for Hadoop Interfaces (updated) Scope Public Deprecated FileSystem, mapred Stable FileContext, MapReduce, Config, CLI (inlcuding output), Job Logs API, Audit logs AbstractFileSystem (targeted for implementers of filesystems, not for application use) parts of Mapred.lib (have we decided which parts?) parts of instrumentation metrics (have we decided which parts?) Evolving TFile, jmx interface (till it becomes stable), parts of Mapred.lib (have we decided which parts?) parts of instrumentation metrics (have we decided which parts?) Job logs and job history ( Some tools, scripts and chukwa use this to analyze job processing) Not An interface Web GUI Scope Private Limited-Private Evolving RPC, Metrics API, (HDFS-MapReduce Private) - once stable, we can consider making these public-stable.PIU Security APIs Project-Private Stable Intra-HDFS and MR protocols (facilitates rolling upgrades down the road) Mapreduce's context object implementation class FSImage Note this will enable old versions of HDFS to read newer fsImage and hence enable more flexible roll backs. Q. Should this be Project-Private Evolving instead? Regardless of the stability of FSImage, new versions of HDFS have to be able to transparently convert older versions and provide roll-back. Project-Private Evolving DFSClient (Q. should this be "project-private unstable") Project-Private Unstable System logs All implementation classes and interfaces not otherwise classified are considered to be project-private unstable.
          Hide
          Robert Chansler added a comment -

          Editorial pass over all release notes prior to publication of 0.21.

          Show
          Robert Chansler added a comment - Editorial pass over all release notes prior to publication of 0.21.
          Hide
          Hudson added a comment -

          Integrated in Hadoop-trunk-Commit #3817 (See https://builds.apache.org/job/Hadoop-trunk-Commit/3817/)
          HADOOP-7391 Document Interface Classification from HADOOP-5073 (sanjay Radia) (Revision 1488069)

          Result = SUCCESS
          sradia : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488069
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/InterfaceClassification.apt.vm
          Show
          Hudson added a comment - Integrated in Hadoop-trunk-Commit #3817 (See https://builds.apache.org/job/Hadoop-trunk-Commit/3817/ ) HADOOP-7391 Document Interface Classification from HADOOP-5073 (sanjay Radia) (Revision 1488069) Result = SUCCESS sradia : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488069 Files : /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/InterfaceClassification.apt.vm
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Yarn-trunk #226 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/226/)
          HADOOP-7391 Document Interface Classification from HADOOP-5073 (sanjay Radia) (Revision 1488069)

          Result = SUCCESS
          sradia : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488069
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/InterfaceClassification.apt.vm
          Show
          Hudson added a comment - Integrated in Hadoop-Yarn-trunk #226 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/226/ ) HADOOP-7391 Document Interface Classification from HADOOP-5073 (sanjay Radia) (Revision 1488069) Result = SUCCESS sradia : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488069 Files : /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/InterfaceClassification.apt.vm
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #1416 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1416/)
          HADOOP-7391 Document Interface Classification from HADOOP-5073 (sanjay Radia) (Revision 1488069)

          Result = FAILURE
          sradia : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488069
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/InterfaceClassification.apt.vm
          Show
          Hudson added a comment - Integrated in Hadoop-Hdfs-trunk #1416 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1416/ ) HADOOP-7391 Document Interface Classification from HADOOP-5073 (sanjay Radia) (Revision 1488069) Result = FAILURE sradia : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488069 Files : /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/InterfaceClassification.apt.vm
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #1442 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1442/)
          HADOOP-7391 Document Interface Classification from HADOOP-5073 (sanjay Radia) (Revision 1488069)

          Result = SUCCESS
          sradia : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488069
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/InterfaceClassification.apt.vm
          Show
          Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk #1442 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1442/ ) HADOOP-7391 Document Interface Classification from HADOOP-5073 (sanjay Radia) (Revision 1488069) Result = SUCCESS sradia : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488069 Files : /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/InterfaceClassification.apt.vm

            People

            • Assignee:
              Jakob Homan
              Reporter:
              Sanjay Radia
            • Votes:
              0 Vote for this issue
              Watchers:
              28 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development