Hadoop HDFS / HDFS-752

Add interface classification stable & scope to HDFS

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0, 0.22.0
    • Fix Version/s: 0.22.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      This jira adds interface classification for the classes in Hadoop HDFS, based on the mechanism described in HADOOP-5073.

      Attachments

      1. HDFS-752.rel21.patch
        108 kB
        Suresh Srinivas
      2. HDFS-752.patch
        101 kB
        Suresh Srinivas
      3. HDFS-752.1.patch
        113 kB
        Suresh Srinivas
      4. hdfs.interface.txt
        10 kB
        Suresh Srinivas


          Activity

          Suresh Srinivas added a comment -

          Attached file has details of interface labeling.

          How InterfaceAudience classification was done:

           1. None of the current classes in HDFS are expected to be used directly by applications or MapReduce. Hence all the classes are marked as private.

          How InterfaceStability classification was done:

           1. Classes used for inter-node communication, the FSImage, and the edits log are marked as stable to ensure backward/forward compatibility between nodes running different releases.
           2. The superclasses of Stable or Evolving classes, along with the return types and parameters of their public methods, are recursively marked Stable/Evolving.

          How labeling will be done:

          1. Classes that are private and unstable are not labelled.
          2. Classes that are marked as stable will need backward/forward compatibility tests.

          Open issues:

           1. A separate jira is needed to track writing tests for backward/forward compatibility of Stable classes.
           2. Labeling clarifies where a class can be used and its stability expectation, but there is no tool available for a user to check whether a class is being misused. Alternatively, instead of a tool, we could have separate jars (a jar for applications, a jar for internal projects, and a jar with all the classes?); applications would then use a jar that includes only the classes relevant to them, avoiding misuse. Whether to go with a tool or separate jars will be considered in a separate jira.
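
           For illustration, a minimal sketch of what this labeling looks like in source, using the annotations introduced by HADOOP-5073 (the class name below is hypothetical):

           package org.apache.hadoop.hdfs.protocol;

           import org.apache.hadoop.classification.InterfaceAudience;
           import org.apache.hadoop.classification.InterfaceStability;

           /**
            * Hypothetical class used for inter-node communication: the audience is
            * Private (not for applications or MapReduce), but it is Stable so that
            * nodes running different releases remain compatible. Per the recursive
            * rule above, the types returned by or passed to its public methods would
            * also need to be Stable (or Evolving).
            */
           @InterfaceAudience.Private
           @InterfaceStability.Stable
           public class ExampleWireMessage {
             // a private, unstable helper class would simply be left untagged
           }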
          Robert Chansler added a comment -

          This would be helpful to folks exploring the new APIs and would be appropriate for 0.21.

          Tom White added a comment -

           Is anyone working on this? It would be good to have for 0.21.0, but we could release without it.

          Suresh Srinivas added a comment -

           Tom, I will sync up with Sanjay on this and have a patch ready in the next two days...

          Suresh Srinivas added a comment -

           In my previous proposal, classes related to internal protocols, and the classes they use, were marked private stable, and the rest were marked private unstable. I want to change the protocol classes' classification to private evolving. Until Avro is used to ensure backward compatibility, these classes cannot be marked stable.

           Any class that is not tagged with an interface classification is private unstable. Given that, I am not planning to add interface tagging to such classes. Tom and Sanjay, let me know what you think.

          Tom White added a comment -

           In the Common JIRA we tagged every class that has public Java visibility, so I think it makes sense to do so here too.

          Suresh Srinivas added a comment -

           When you found a class that should not have been public, did you change it to private? Did you change contrib classes?

          Tom White added a comment -

           > When you found a class that should not have been public, did you change it to private?

          We didn't change Java visibility, we only added annotations (e.g. @InterfaceAudience.Private).

          > Did you change contrib classes?

          No.
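
           To make the distinction concrete, a minimal hypothetical sketch (the class name is invented): the Java modifier stays public, and only the annotation is added.

           import org.apache.hadoop.classification.InterfaceAudience;

           // Still public in the Java sense, so existing callers keep compiling;
           // the audience classification is conveyed entirely by the annotation.
           @InterfaceAudience.Private
           public class ExampleInternalUtil {
             // ...
           }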

          Suresh Srinivas added a comment -

          This patch adds interface classification to HDFS classes. Contrib and tests are excluded from this. Approach:

           • All protocol-related classes and implementations of FileSystem and AbstractFileSystem are tagged private and evolving.
           • All exceptions exposed in protocol classes are marked private and evolving. These classes should perhaps move to the protocol package.
           • All the other public classes are marked private.

           Tagged DatanodeDescriptor.BlockTargetPair as private and evolving, since it is used in BlockCommand. It should be moved to the protocol directory, to ensure all the classes used in protocols are in protocol packages. Will open a bug to track this.
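
           A rough sketch of the tagging pattern described above, assuming a hypothetical protocol-related class (the real patch annotates the actual HDFS protocol classes):

           package org.apache.hadoop.hdfs.protocol;

           import org.apache.hadoop.classification.InterfaceAudience;
           import org.apache.hadoop.classification.InterfaceStability;

           // Protocol-related classes (and the exceptions they expose) are tagged
           // Private and Evolving rather than Stable, since cross-version
           // compatibility is not yet guaranteed.
           @InterfaceAudience.Private
           @InterfaceStability.Evolving
           public class ExampleDatanodeCommand {
             // ...
           }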

           Tom, can you please review this?

          Tom White added a comment -

          +1 This looks fine to me.

          A few public classes seem to have been missed, including HdfsConstants, Upgradeable, FSDatasetInterface, Replica, FSDatasetMBean, FSClusterStats, FSInodeInfo, FSNamesystemMBean, DatanodeProtocol, InterDatanodeProtocol, NamenodeProtocol, NamenodeProtocols, NodeRegistration, GSet, LightWeightGSet. Do you want to add annotations to these? Thanks.

          Suresh Srinivas added a comment -

           Good catch... I had missed tagging public interfaces. The attached patch fixes all the interfaces. One addition to your list: StorageDirType.

          Tom White added a comment -

          +1

          Suresh Srinivas added a comment -

          Committed the patch to trunk.

          Suresh Srinivas added a comment -

          Patch for branch 0.21

          Suresh Srinivas added a comment -

           Tom, should this be committed to the 0.21 branch?

          Tom White added a comment -

          Yes, please.

          Suresh Srinivas added a comment -

           Committed the patch to the 0.21 branch.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #314 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/314/)

          Suresh Srinivas added a comment -

          Realized I had posted this comment to another jira. Here is the test I ran before committing the patch to trunk:

          Hudson is stuck. I ran testpatch; here is the result:
          [exec] -1 overall.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
          [exec] Please justify why no new tests are needed for this patch.
          [exec] Also please list what manual steps were performed to verify this patch.
          [exec]
          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
          [exec]
          [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

           This change just tags the code with interface classification, hence no tests are included. I also ran the unit tests and they passed without failures.


            People

            • Assignee:
              Suresh Srinivas
            • Reporter:
              Suresh Srinivas
            • Votes:
              0
            • Watchers:
              7
