Hadoop Common / HADOOP-13714

Tighten up our compatibility guidelines for Hadoop 3

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.7.3
    • Fix Version/s: 3.0.0-beta1, 3.1.0
    • Component/s: documentation
    • Labels:
      None

      Description

       Our current compatibility guidelines are incomplete and loose. For many categories, we do not have a policy. It would be nice to actually define those policies so our users know what to expect and developers know which releases to target with their changes.

      Attachments

      1. InterfaceClassification.pdf
        122 kB
        Daniel Templeton
      2. HADOOP-13714.WIP-001.patch
        50 kB
        Daniel Templeton
      3. HADOOP-13714.008.patch
        67 kB
        Daniel Templeton
      4. HADOOP-13714.007.patch
        67 kB
        Daniel Templeton
      5. HADOOP-13714.006.patch
        67 kB
        Daniel Templeton
      6. HADOOP-13714.005.patch
        65 kB
        Daniel Templeton
      7. HADOOP-13714.004.patch
        64 kB
        Daniel Templeton
      8. HADOOP-13714.003.patch
        60 kB
        Daniel Templeton
      9. HADOOP-13714.002.patch
        56 kB
        Daniel Templeton
      10. HADOOP-13714.001.patch
        56 kB
        Daniel Templeton
      11. Compatibility.pdf
        199 kB
        Daniel Templeton

        Issue Links

          Activity

          kasha Karthik Kambatla added a comment - - edited

           Created this JIRA so we don't miss it, and assigning it to myself. I won't be able to get to this this month.

           Please feel free to pick this up if you are interested and have cycles to work on it.

          andrew.wang Andrew Wang added a comment -

           One major item, if we finish HADOOP-11656, would be some guarantees around not breaking the client classpath.

          stevel@apache.org Steve Loughran added a comment -

          what's our policy w.r.t log4j.properties and other configuration metadata?

          templedf Daniel Templeton added a comment -

          I'll take it.

          templedf Daniel Templeton added a comment -

          I've been reading up on the history of these guidelines and talking with Karthik Kambatla. Here's a rough idea of the changes I think we should make:

          • Explicitly call out in InterfaceClassification.md when compatible changes are allowed
          • Add an explicit definition of what constitutes a compatible change versus an incompatible change
          • Add audience and stability statements for all categories, e.g. all CLI tools are considered public stable
          • Tighten up language to be comprehensive and crystal clear
          • Be really clear about what we consider to be the domain of users/admins, application/plugin/extension developers, and project developers and how we expect our interfaces to be used

          These are things we may want to additionally address:

          • Support for security mechanisms/protocols, e.g. SSLv3
          • Port assignments
          • Log output--it's an interface!
          • Log4j settings
          • UI plugins
          • JHS/ATS data

          It would be really nice to support this effort by beefing up the JavaDocs such that they can serve the role of specifying semantic behavior. That's obviously out of scope for 3.0.0 given the size of the effort, but it's really hard to talk about semantic compatibility when users are reading source code to determine behavior.
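
           To illustrate what I mean, here's a made-up class (not a real Hadoop API) showing Javadoc carrying the semantic contract, so a caller never has to read the implementation to learn the behavior:

```java
// Hypothetical example only: the class and method are invented for
// illustration, not taken from Hadoop. The point is that the Javadoc, not
// the source code, states the semantics a caller may rely on.
import java.util.HashMap;
import java.util.Map;

public class RecordStore {

    private final Map<String, String> store = new HashMap<>();

    /**
     * Looks up a record by key.
     *
     * @param key the record key; must not be null
     * @return the stored value, or {@code null} if the key is absent;
     *         this method never throws for a missing key
     * @throws IllegalArgumentException if {@code key} is null
     */
    public String lookup(String key) {
        if (key == null) {
            throw new IllegalArgumentException("key must not be null");
        }
        return store.get(key);  // may be null; that is part of the documented contract
    }
}
```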

          I'll start work on drafting a patch, and I'll post it for review as soon as it's ready.

          stevel@apache.org Steve Loughran added a comment -

          pull in Allen Wittenauer for his opinions here.

          We do have some logs whose output is considered stable: those machine readable ones (HDFS Audit log &c). I'd like the explicit ones to be listed here with all others considered open. We can't control the logging or error messages from downstream code either.

           Exception class: we can change them/they may change under us. The biggest issue here is unchecked exceptions we sometimes forget to catch and wrap; once wrapped, well, that's a change.
           Exception text: I reserve the right to improve/change error messages to be more informative.

          It would be really nice to support this effort by beefing up the JavaDocs such that they can serve the role of specifying semantic behaviour.

           aah. Potentially the wrong place for this. If you look at the FS Spec, you can see that something a bit higher level works, ish. What that adds is some model of what the operations are manipulating. Without that, well, what semantics are you specifying? Vague English-language concepts?

          Consider also the tests to be our implicit specification: they are what the implementors/maintainers expect the implementation to do. Note there, all too often, we have strings containing expectations about the string values of exception messages.

          aw Allen Wittenauer added a comment - - edited

          First: HADOOP-11696 (and those are old stats!)

          Second: HADOOP-14333 (most recent, but there are other examples... I'm thinking of Tez calling to YARN's private universe in particular)

          Third: The countless "let's add some edge case feature that 90% of the universe won't use to fsck's default output" issues

          I'm becoming more and more of the opinion that the compatibility guidelines are useless. People who should know better regularly ignore them. People who are supposed to help enforce them regularly look the other way if it benefits them or their company. Core developers read the JavaDocs while the end users read everything else. There is a huge disconnect in our communication.

          To which I say: get rid of major releases. Instead, what would be minor releases now become majors, and micros become minors. Minors are only for security holes. (No really. Documentation updates are not allowed.) This effectively eliminates the need for the vast majority of the compatibility guidelines and would likely allow Hadoop to claim Semantic Versioning.

           (Oh, and on HDFS audit logs? Guess what? Those changed incompatibly at least three times in 2.x. So yeah, I have zero faith in that document that is supposed to protect end users.)

          templedf Daniel Templeton added a comment -

           Since this patch is likely to result in some discussion, here's an early patch to get things kicked off. In this patch I haven't made any substantive changes; I've only restructured the docs to be clearer. In doing so, though, I've made significant structural changes.

          The next step will be to carefully revisit the policies themselves.

          templedf Daniel Templeton added a comment -

          We do have some logs whose output is considered stable: those machine readable ones (HDFS Audit log &c). I'd like the explicit ones to be listed here with all others considered open. We can't control the logging or error messages from downstream code either.

          Totally agree.

          Exceptions are covered by the API policy, and exception messages are covered by the log and CLI policies. Do we need a separate statement about exceptions?

          If you look at the FS Spec, you can see that something a bit higher level works, ish.

           The FileSystem spec is indeed very clear, and it would be awesome to have something like that for all of Hadoop's interfaces, but we don't, and I'm not convinced that all of our interfaces can be specified that way. I also don't think an end user is going to read that doc. That's a doc for the developer community. End users read javadoc. Javadoc is also much easier to create than a formal specification. Yeah, it's not as formal, but it's a tradeoff.

          Consider also the tests to be our implicit specification

          That sounds great in theory. Our tests are far from comprehensive, though, and consider the end user. I, as a Hadoop developer, can't figure out what some of our tests are supposed to be doing. Do you really think an end user can? Do you think they're even likely to try? If I want to know what something does, I read the source code for it, not the tests. And there's the issue. Javadocs are cheap compared to writing formal specs or tests, they're the first thing end users are going to look at, and when there are no javadocs end users turn to reading the source code. Once they're reading the source code to determine semantics, we lose the ability to make even small changes without breaking things.

           Of course, this discussion is largely moot, because I don't see the community investing enough effort to create comprehensive tests, javadocs, or formal specs any time soon.

          To Allen Wittenauer's point, just writing a doc isn't enough. The developer community needs to understand what compatibility means and be committed to upholding it. Having a clear doc that covers all the bases (we can think of) is a step in the right direction. I don't think we can throw up our hands and declare cross-release compatibility a quaint fiction of the past. Adoption of a platform like Hadoop depends on end users having a reliable and predictable build/integration target.

          By the way, Allen Wittenauer, what did you mean by Semantic Versioning? Looking at http://semver.org/, which would appear to be the canonical source, I see

          Given a version number MAJOR.MINOR.PATCH, increment the:

          1. MAJOR version when you make incompatible API changes,
          2. MINOR version when you add functionality in a backwards-compatible manner, and
          3. PATCH version when you make backwards-compatible bug fixes.
          Additional labels for pre-release and build metadata are available as extensions to the MAJOR.MINOR.PATCH format.

          Isn't that exactly what we already do and what this JIRA is attempting to support? What am I missing?

          aw Allen Wittenauer added a comment -

          Isn't that exactly what we already do

           No. Hadoop has a regular habit of releasing minor version bumps that break backward compatibility. Lots of examples: HA-NN breaking single NN upgrade scripts with a NOP rather than a failure during finalize, audit log output changes, fsck output changes, breaking apart the hadoop-hdfs jar, NN UI, ... Lots to choose from.

          However: the vote on bumping the Java version from 6 to 7 will stand out as the moment the PMC gave clear indication of where compatibility really stands. That is, by almost all definitions, the exact moment when one wants to change major version numbers. It went from 0.20 -> 1.0 "version numbers are cheap" to "no way this should be 3.0 because our licensing to our customers depends upon the ASF release being 2.x".

          Again: downstream users treat every minor as a major due to our track record. Perception is reality. They are not wrong.

          what this JIRA is attempting to support?

          It might, but to quote you:

          The developer community needs to understand what compatibility means and be committed to upholding it.

           They don't care. If there is a conflict between the employer's goals and the ASF rule set, the employer's goals win. It has happened time and time again in Hadoop. V2.6.0 and V2.7.0 stand out as critical releases that really demonstrate this in action.

          templedf Daniel Templeton added a comment -

          Fair points. Many of the examples you cited are cases where the guidelines were vague or missing. Call me naïvely optimistic (because I am), but my hope is that if we can clean up those areas and make the guidelines a first-class citizen in the development process, we can make a stronger compatibility promise.

          They don't care.

           I prefer to think that they're not yet properly educated in the importance of compatibility. In the cases I've seen where something was rushed through because a vendor needed to keep a promise, the results were often suboptimal. The first step to reining that kind of behavior in is having clear and comprehensive policies. That's the point of this JIRA. The next step is making the policies front and center in the development process. There isn't a JIRA for that one yet.

          I don't think we're in disagreement.

          stevel@apache.org Steve Loughran added a comment -

           I care about compatibility; I think everyone does. It's just really hard to achieve.

           Regarding semantic versioning, the problem is "all changes may break things". Even if it's something as minor as changing a string in an exception message, something else could be looking for it and find that it breaks. Or subtle differences in performance and concurrency which look like they work, but have adverse consequences downstream. I am very bleak about semantic versioning being viable.

           Now, regarding the specs, the FS spec was written precisely because too much of the FS behaviour was hidden in the HDFS code and nobody had written down all that was happening, including what the exceptions were, and there was nothing clear as to what features were deliberate versus accidental (example: mkdirs -p /a/b/c being atomic. Deliberate? Or accidental side effect of a locking optimisation? And what happens if it is now changed?). It's incomplete (HADOOP-13327) and there's a tendency for new features in HDFS to consider it something unimportant, leading to issues like HADOOP-14365. I understand why (timetable pressure, test-centric dev doesn't focus on the specs), but it's frustrating.

          To be fair though: it does get read, it is broadly understood by people, which shows how Python makes a good syntax for specification.

           Where it is limited programmatically is that, as it isn't something you can use in theorem provers the way TLA+ can be, I can't use it in some specification of what a committer does, use it to prove that the MR committer V1 and V2 algorithms work, etc. I'm playing with the more rigorous approach in HADOOP-13786, but I know that once I have a TLA+ spec for an object store and its commit algorithm, nobody is going to review it. We just don't have enough people who play in that area to do the reviewing. Which means it wouldn't get the maintenance either. Python it is, then.

          templedf Daniel Templeton added a comment -

          Other than the HDFS audit log, what other output is intended to be machine readable? Metrics, but that's already called out.

          templedf Daniel Templeton added a comment -

          Looks like there are a bunch of audit loggers:

          • HDFS
          • Node manager
          • Resource manager
          • JHS
          • KMS

          Aside from the audit loggers and FSCK, anything else that's classified as machine readable? What about the HDFS OIV? It spits out XML, which is presumably machine readable.

          aw Allen Wittenauer added a comment -

          All of the output that is spit out by shell commands should be considered as being processed.

          templedf Daniel Templeton added a comment -

          Fair enough.

          stevel@apache.org Steve Loughran added a comment -

           Assume things like hadoop fs -ls hdfs://nn1:/temp will be parsed through some shell script, even if it's just grep and awk.

           This is where Windows PowerShell is actually slick: you can chain together more than just text and expect piped things to work.
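
           To make that concrete, here's a sketch of the kind of downstream consumer we have to assume exists (written in Java rather than awk for illustration; the eight-column listing layout and the "Found N items" header are assumptions for the example, not a statement of the guaranteed format):

```java
// Sketch of a downstream tool that scrapes "hadoop fs -ls" output.
// Any consumer like this silently breaks if columns are added, removed,
// or reordered, which is why CLI output has to be treated as an interface.
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class LsScraper {
    public static void main(String[] args) throws Exception {
        String path = args.length > 0 ? args[0] : "/";
        Process p = new ProcessBuilder("hadoop", "fs", "-ls", path).start();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                String[] cols = line.trim().split("\\s+");
                if (cols.length < 8) {
                    continue;  // skips the "Found N items" header line
                }
                String owner = cols[2];               // breaks if a column is inserted
                String file  = cols[cols.length - 1]; // breaks if a trailing column is added
                System.out.println(owner + "\t" + file);
            }
        }
        p.waitFor();
    }
}
```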

          aw Allen Wittenauer added a comment -

          FWIW:

          I've been watching a thread in -dev with interest. In it, someone has proposed putting a CLEARLY MARKED incompatible change into a patch release. Not a single person has said anything other than "looks good!". A bit ago, that issue was changed to target the micro release.

          templedf Daniel Templeton added a comment -

          Allen Wittenauer, which -dev list and which JIRA?

          templedf Daniel Templeton added a comment -

          Here's a roughly complete-ish patch for the compatibility guidelines. Feedback is welcome.

          stevel@apache.org Steve Loughran added a comment -

           Are you using RFC 2119 terms? If so: declare it, use SHOULD, MUST NOT, etc., and things like "developers are free to" MUST be converted to "developers MAY".

          but when using components from a different module Hadoop developers should behave follow the same guidelines as third-party developers: do not use Private or Limited Private (unless explicitly allowed)

           No, this is silly. The whole of hadoop-common is full of private code, YARN & co intermixed, things using HDFS config constants where there is no public one, etc. The whole meaning of "Private" is "internal use within the same Hadoop release", at least to me.

           Now, there is code marked LimitedPrivate("MapReduce"), but those tend to mean "any YARN application will need these". So I treat those not as "this is the only code from the other modules which MR MAY use", so much as "things you MUST use if you want your YARN app to work".

          1. Unless stated, any module in org.apache.hadoop MAY use anything in any other module. That doesn't mean that YARN SHOULD use HDFS internals, but they can if they want to.
           2. Native libraries. We need a story there, especially as you can't have >1 version on PATH, and on YARN you might upload an app with an older version of the Java side of the JNI libs. I'd like to see a definitive "don't delete/rename/change args on existing methods", so that a 2.7 Hadoop JAR can link to a 2.8 native lib. Or at least a "where possible".
          3. OS. What is the policy for un-supporting an OS? Minor? Major?
          andrew.wang Andrew Wang added a comment -

          Ping, next steps here? Beta1 is fast approaching.

          templedf Daniel Templeton added a comment -

          After a bit of a hiatus, I'm now resuming work on this JIRA. The plan is to recut what I have already done as a spec for Hadoop developers only. Once we get some agreement on that, I'll extract out a doc for downstream developers and one for end users. As long as we get the main dev spec done by beta1, we should be fine. The other docs will just be a reflection of that.

          chris.douglas Chris Douglas added a comment -

          The plan is to recut what I have already done as a spec for Hadoop developers only

          When you have a draft, please post it to the dev list in case folks aren't watching this JIRA.

          templedf Daniel Templeton added a comment -

          Chris Douglas, will do.

          Steve Loughran, re:

          no this is silly. The whole of hadoop-common is full of private code, yarn co intermixed, things using HDFS config constants where there is no public one, etc. The whole meaning "Private" is "internal use within the same hadoop release", at least to me.

          If there are places where private interfaces aren't being respected, we should consider making them limited private. We obviously can't just flip a switch and make everything compliant to a new compat spec overnight, but we can draw some lines in the sand and start trying to make things better.

          ajisakaa Akira Ajisaka added a comment -

          YARN-3254 wants to update the text information exposed via JMX.
          https://issues.apache.org/jira/browse/YARN-3254?focusedCommentId=16084684&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16084684

           Can we loosen the rule to allow modifying text fields when needed?

          andrew.wang Andrew Wang added a comment -

          Hi Daniel, is this going to make beta1?

          templedf Daniel Templeton added a comment -

          Akira Ajisaka, it looks to me like YARN-3254 only adds fields. I didn't see that it removed or modified an existing field. That's a compatible change.

          templedf Daniel Templeton added a comment -

          Funny you should ask.

          From my perspective, the developer spec is the only artifact that is required for beta1. The end user and downstream docs will be projections of the developer spec into friendlier and more focused documentation.

           This patch should be more or less complete and consistent. Please review it (cc: Karthik Kambatla, Chris Douglas, Steve Loughran) to make sure I'm not over- or understepping any boundaries, and to make sure it's clear and sufficient.

          One area that is clearly lacking is the section on build artifacts. We should include a list of JARs we expect to be consumed by end users.

          After I've gotten confirmation that I'm not entirely off base, I'll shoot an email to the dev list.

          templedf Daniel Templeton added a comment -

          Here's an updated patch that handles the build artifacts better and adds more links to the interface doc.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 19s Docker mode activated.
                Prechecks
          +1 @author 0m 0s The patch does not contain any @author tags.
                trunk Compile Tests
          +1 mvninstall 16m 39s trunk passed
          +1 mvnsite 1m 12s trunk passed
                Patch Compile Tests
          +1 mvnsite 1m 2s the patch passed
          -1 whitespace 0m 0s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
                Other Tests
          +1 asflicense 0m 15s The patch does not generate ASF License warnings.
          19m 56s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:71bbb86
          JIRA Issue HADOOP-13714
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12885269/HADOOP-13714.003.patch
          Optional Tests asflicense mvnsite
          uname Linux 6caf0f387300 3.13.0-117-generic #164-Ubuntu SMP Fri Apr 7 11:05:26 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / ef87d34
          whitespace https://builds.apache.org/job/PreCommit-HADOOP-Build/13160/artifact/patchprocess/whitespace-eol.txt
          modules C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/13160/console
          Powered by Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          stevel@apache.org Steve Loughran added a comment -

           I don't want to encourage things in, say, hadoop-common being marked as Public just because I want to use them in, say, hadoop-hdfs. And limited private {mapreduce}, well, we all know that means "any YARN app". FWIW, Scala's ability to limit scope to part of the package tree is nice, though it just forces me to put things into the org.apache.spark package tree when I need to.

          templedf Daniel Templeton added a comment -

          I hear ya.

          I don't want to encourage things in, say, hadoop-common being marked as Public just because I want to use them in, say hadoop-hdfs.

          If there are Private things in common that are needed from HDFS, then they should be extended to Limited Private (HDFS). If all of common ends up as Limited Private (HDFS, MapReduce, YARN), that's fine.

           And limited private {mapreduce}, well, we all know that means "any YARN app".

           Then those things need to be public. If we have interfaces that are consumed by the public but that we don't have labeled in a way that enforces compatibility, that's bad.

          templedf Daniel Templeton added a comment -

           I should also add that I don't expect everything to magically get better overnight. First we make some rules; then we fix the places where the code breaks the rules as we find them. Over time, the annotations should come to match the use.

          stevel@apache.org Steve Loughran added a comment -

           Ok, here's another way to look at it. What do you believe hadoop-common is for, if not a common library for the other bits of Hadoop?

          stevel@apache.org Steve Loughran added a comment -

          If we have interfaces that are consumed by the public but that we don't have labeled in a way that enforces compatibility, that's bad.

          I suggest you take a look at the DistributedShell example and its uses of hadoop code....

          templedf Daniel Templeton added a comment -

           Maybe I'm missing your point. Hadoop common is absolutely the common bits for use by Hadoop, and hence Limited Private (HDFS, MapReduce, YARN, Common) seems quite reasonable. Is there a reason it isn't?

          The point I'm trying to make is that the reason we want to set and uphold rules around audience and stability is so that we can have a clear and sustainable contract with the consumers of our interfaces. If we have things that are labeled Limited Private (MapReduce) that everybody just knows are really Public, then there's nothing to stop someone who doesn't just know from breaking those APIs and all the downstream consumers. If we make sure the Public interfaces are actually labeled as Public, then we can catch the breakage before it happens.
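
           For concreteness, this is the kind of labeling I mean, using the existing annotations from org.apache.hadoop.classification (the class and method here are hypothetical; only the annotations themselves are real):

```java
// Minimal sketch of audience/stability labeling. The names ExampleClientApi
// and internalHelper are invented for illustration.
import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;

// Usable by anyone downstream; changes must follow the Public/Stable policy.
@InterfaceAudience.Public
@InterfaceStability.Stable
public class ExampleClientApi {

    // Shared only with the named Hadoop projects, not with end users or
    // third-party applications; may evolve between minor releases.
    @InterfaceAudience.LimitedPrivate({"HDFS", "MapReduce", "YARN"})
    @InterfaceStability.Evolving
    public void internalHelper() {
        // ...
    }
}
```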

          templedf Daniel Templeton added a comment -

          I've updated the guidelines to specifically address metadata versioning. I've also called out compatible and incompatible changes to audience and change compatibility annotations in the interface classification doc. Let me know what else I need to fix, adjust, or include.

          miklos.szegedi@cloudera.com Miklos Szegedi added a comment -

           Daniel Templeton, thank you for working on this. I read the document and I have a few comments and questions.
           It would be helpful to have a shorter version or summary at the beginning for a general audience.
           It could make sense to add a glossary. It might not be obvious to everyone, for example, what "downstream" means.

          A package, class, or member variable or method that is not annotated SHALL be interpreted as implicitly Private and Unstable.

          In cases where no classifications are present, the protocols SHOULD be assumed to be Private and Stable.

           Please help me understand the difference between protocols and code, and why they are treated so differently. I would assume that code that is not marked as unstable should be stable.

          Each API has an API-specific version number. Any incompatible changes MUST increment the API version number.

           I would also elaborate on whether old API versions should be supported or not, and in what circumstances a REST API version can be deprecated or removed.

          All audit log output SHALL be considered Public and Stable. Any change to the data format SHALL be considered an incompatible change.

          Since this is an audit log, I am assuming a security requirement may override this requirement. Let’s say a certain event was not logged before but it should be.

          The environment variables consumed by Hadoop and the environment variables made accessible to applications through YARN SHALL be considered Public and Evolving. The developer community SHOULD limit changes to major releases.

          Just a note. Hadoop configuration is public and stable. Do environment variables passed to the AM need to be evolving?

          The JVM requirements SHALL NOT change across minor releases within the same major release unless the JVM version in question becomes unsupported. The JVM version requirement MAY be different for different operating systems or even operating system releases.

           Regarding this, let's call out C compiler and runtime requirements. It might be frustrating not to be able to compile a new minor version of Hadoop with a security fix on an existing, tested configuration.

          stevel@apache.org Steve Loughran added a comment -

          I would assume a code that is not marked as unstable should be stable.

           Heh. It's private, so nominally "ours to play with". In reality, things like UGI have long been public APIs, so caution is needed changing even those things. If it was mandatory to mark unstable code as @Unstable, then all that would happen is a very large patch marking everything as such.

          protocol vs code

           We can set things up so that the Hadoop 2.8.0 JARs and the 2.8.2 JARs are internally consistent even if you can't mix hadoop-common 2.8.0 and hadoop-hdfs 2.8.2. What we can't do is expect everyone to upgrade to Hadoop 2.8.2 everywhere simultaneously. Hence: protocols are considered more sensitive to change.

          Audit logs

          Let’s say a certain event was not logged before but it should be.

          The goal here is that the format remains parseable, even if a new event is added. Maybe that should be explicit. Example: HDFS audit log.
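
           A sketch of what "remains parseable" can mean in practice: a consumer that reads key=value pairs and ignores keys it doesn't recognize survives an appended field or a new event type, but not renamed or reordered keys. (The field names below echo the HDFS audit log style but are illustrative, not the documented format.)

```java
// Sketch of a tolerant consumer of key=value style audit lines.
import java.util.HashMap;
import java.util.Map;

public class AuditLineParser {
    public static Map<String, String> parse(String line) {
        Map<String, String> fields = new HashMap<>();
        for (String token : line.trim().split("\\s+")) {
            int eq = token.indexOf('=');
            if (eq > 0) {
                // Unknown keys are simply carried along, so appending a new
                // field does not break this consumer; renaming or reordering
                // existing keys would.
                fields.put(token.substring(0, eq), token.substring(eq + 1));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "allowed=true ugi=alice ip=/10.0.0.1 cmd=open src=/data/x dst=null";
        System.out.println(parse(line).get("cmd"));  // prints "open"
    }
}
```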

           Regarding this, let's call out C compiler and runtime requirements.

          +1

          miklos.szegedi@cloudera.com Miklos Szegedi added a comment -

           Thank you, Steve Loughran, for the explanations.

          templedf Daniel Templeton added a comment -

I added explicit treatment for deprecation in general and clarified the audit log section. I need help with specifying the C compiler and runtime. Since there isn't just one C compiler that people could use, how can we set any limits? Does Hadoop require gcc? By runtime, do you mean libc et al.?

          hadoopqa Hadoop QA added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 16s Docker mode activated.
                Prechecks
          +1 @author 0m 0s The patch does not contain any @author tags.
                trunk Compile Tests
          +1 mvninstall 13m 49s trunk passed
          +1 mvnsite 1m 6s trunk passed
                Patch Compile Tests
          +1 mvnsite 0m 57s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
                Other Tests
          +1 asflicense 0m 15s The patch does not generate ASF License warnings.
          16m 49s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:71bbb86
          JIRA Issue HADOOP-13714
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12887123/HADOOP-13714.005.patch
          Optional Tests asflicense mvnsite
          uname Linux 645e54c9eb57 3.13.0-117-generic #164-Ubuntu SMP Fri Apr 7 11:05:26 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / b9465bb
          modules C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/13289/console
          Powered by Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          miklos.szegedi@cloudera.com Miklos Szegedi added a comment - - edited

Daniel Templeton, what I suggest is to call out that minor versions should not require new versions of native dependencies like cmake, gcc, g++, zlib, and ssl. I probably would not call them out specifically, since your document is general; that is up to you. I would mention native dependencies.

          rkanter Robert Kanter added a comment - - edited

Thanks for putting all this work into writing this up, Daniel Templeton. Here's some feedback and questions:

          1. In addition to compatibility of the protocols themselves, maintaining
            cross-version communications requires that the transports supported also be
            stable. The most likely source of transport changes stems from secure
            transports, such as SSL. Upgrading a service from SSLv2 to SSLv3 may break
            existing SSLv2 clients. Supported transports MUST continue to be supported
            across all minor releases within a major version.

            What should we do then if a severe security bug is found with some version of SSL? I don't think we'd want to keep that version of SSL, right? Perhaps some sort of exception should be mentioned for security issues?

          2. Some user applications built against Hadoop might all Hadoop JAR files
            (including Hadoop's library dependencies) to the application's classpath.

            I think a word is missing somewhere in here: "... might [add] all ..."

          3. Adding new dependencies or updating the versions of existing dependencies may
            interfere with those in applications' classpaths and hence their correct
            operation. Users are therefore discouraged from adopting this practice.

            Hadoop dependencies SHALL be considered
            [Private](./InterfaceClassification.html#Private) and
            [Unstable](./InterfaceClassification.html#Unstable).

While this is great for us, I'm not sure we can do that until we have third-party dependencies shaded, at least in the clients. For example, if Hive includes yarn-client to talk to YARN, yarn-client will pull in some transitive dependencies. Do we expect users and downstream projects to always exclude these? And if they do, yarn-client still depends on those dependencies, so it probably will fail without them. Perhaps we need to differentiate between the client and server parts of Hadoop? For instance, client dependencies could be Public Evolving but server dependencies could be Private Unstable.

          4. A Stable
            interface is expected to not change incompatibly within a major release and so
            if a safe development target.

            Typo: "... and so i[s] a safe ..."

          5. This may seem like a silly question, but both InterfaceAudience and InterfaceStability are currently marked as Public Evolving. Should we make them Public Stable? We're not going to change these incompatibly within a major release.
          stevel@apache.org Steve Loughran added a comment -
          • interface declaration stability. Do we plan to change them? We could have a @Public.RemovedInFuture annotation, perhaps.
          • transitive dependencies. We already talk a lot about this, but we cannot guarantee that things won't change. Maybe: the shaded client JAR is the only one with guarantees.
          • transitive protocol dependencies. Say ZK changed its wire format: then the ZK JAR would need an update and, even shaded, would imply the need to work with an updated ZK service. Same for S3 auth mechanisms, Kerberos, ...

          Other fun issues

          • when would OS version support be removed? That's a full OS (Windows) as well as variants (32-bit x86).
          • when would we cut a filesystem (s3 and s3n are going from hadoop-aws in 3.0)? Maybe: a major release, unless there's a migration path to a successor client (which can involve switching to an external implementation).
          templedf Daniel Templeton added a comment -

          Robert Kanter:

          1. I've adjusted that policy to only address the minimum required major version. I've also added a section at the bottom to address exceptions.
          2. Fixed.
          3. Fair point. I wrote that assuming that classpath isolation would make it into 3.0. I've updated it to Public Stable for exposed dependencies and included some notes about shading and what defines an incompatible change.
          4. Fixed.
          5. We should file a JIRA to update them. There are a large number of interfaces that need to be updated in that way.

          Steve Loughran:

          1. Is the question whether we plan to change the interface declaration stability annotations themselves? I would agree with Robert Kanter that they should be stable. If we end up wanting to remove one, we can deprecate it just like anything else.
          2. See if the new text makes sense to you.
          3. I added some text that says we will treat transitive protocols like internal protocols.
          4. OS is already addressed:

            The community SHOULD maintain the same minimum OS requirements (OS kernel versions) within a minor release. Currently GNU/Linux and Microsoft Windows are the OSes officially supported by the community, while Apache Hadoop is known to work reasonably well on other OSes such as Apple MacOSX, Solaris, etc.

          5. Added, though I'm not exactly clear what you mean by a successor client.
          templedf Daniel Templeton added a comment -

          Oh, I also added some text to address native dependencies.

          miklos.szegedi@cloudera.com Miklos Szegedi added a comment -

          Thank you, Daniel Templeton. I see a small typo: "container executer" in the latest version.

          templedf Daniel Templeton added a comment -

          Fixed typo.

          hadoopqa Hadoop QA added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 16s Docker mode activated.
                Prechecks
          +1 @author 0m 0s The patch does not contain any @author tags.
                trunk Compile Tests
          +1 mvninstall 13m 42s trunk passed
          +1 mvnsite 0m 57s trunk passed
                Patch Compile Tests
          +1 mvnsite 0m 49s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
                Other Tests
          +1 asflicense 0m 14s The patch does not generate ASF License warnings.
          16m 24s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:71bbb86
          JIRA Issue HADOOP-13714
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12887413/HADOOP-13714.007.patch
          Optional Tests asflicense mvnsite
          uname Linux 29b45af2a324 3.13.0-117-generic #164-Ubuntu SMP Fri Apr 7 11:05:26 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / fbe06b5
          modules C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/13301/console
          Powered by Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          rkanter Robert Kanter added a comment - - edited

          Another small typo:

          The components of Apache Hadoop way have dependencies that include...


          Should be "... Apache Hadoop [m]ay have ..."

          +1 LGTM after that

          templedf Daniel Templeton added a comment -

Fixed that very Dvorak typo.

          templedf Daniel Templeton added a comment -

Thanks, Robert Kanter. If there are no other comments or objections, I'll plan to commit this at the end of the day so we can close it out for beta 1. I'll file a couple of follow-up JIRAs to derive some user and downstream docs from this spec.

          hadoopqa Hadoop QA added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 15s Docker mode activated.
                Prechecks
          +1 @author 0m 0s The patch does not contain any @author tags.
                trunk Compile Tests
          +1 mvninstall 13m 32s trunk passed
          +1 mvnsite 0m 54s trunk passed
                Patch Compile Tests
          +1 mvnsite 0m 49s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
                Other Tests
          +1 asflicense 0m 14s The patch does not generate ASF License warnings.
          16m 9s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:71bbb86
          JIRA Issue HADOOP-13714
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12887418/HADOOP-13714.008.patch
          Optional Tests asflicense mvnsite
          uname Linux 9c4d9c9d6923 3.13.0-117-generic #164-Ubuntu SMP Fri Apr 7 11:05:26 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / fbe06b5
          modules C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/13303/console
          Powered by Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          templedf Daniel Templeton added a comment - - edited

          Thanks to all the reviewers for this JIRA! (Steve Loughran, Allen Wittenauer, Karthik Kambatla, Robert Kanter, Miklos Szegedi, Akira Ajisaka, Chris Douglas, ...) Committed to trunk. Filed HADOOP-14875 and HADOOP-14876 to finish the work of creating user docs from this spec.

          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #12892 (See https://builds.apache.org/job/Hadoop-trunk-Commit/12892/)
          HADOOP-13714. Tighten up our compatibility guidelines for Hadoop 3 (templedf: rev 7618fa9194b40454405f11a25bec4e2d79506912)

          • (edit) hadoop-common-project/hadoop-common/src/site/markdown/InterfaceClassification.md
          • (edit) hadoop-common-project/hadoop-common/src/site/markdown/Compatibility.md
          stevel@apache.org Steve Loughran added a comment -

          See if the new text makes sense to you.

Didn't get a chance to review these, given they came out on a Friday evening and I actually chose to spend this weekend offline until Monday morning.

Can I remind people that it's good to get some consensus from all the people reviewing something, and rushing something out on what is a Friday evening in some time zones is not the way to achieve this.

          chris.douglas Chris Douglas added a comment -

          Sorry, I didn't have time to review this in detail before it was committed. This section may be overly restrictive, as it is currently phrased:

          +### Native Dependencies
          +
          +Hadoop includes several native components, including compression, the
          +container executor binary, and various native integrations. These native
          +components introduce a set of native dependencies for Hadoop, both at compile
          +time and at runtime, such as cmake, gcc, zlib, etc. This set of native
          +dependencies is part of the Hadoop ABI.
           
           #### Policy
           
          -The behavior of API may be changed to fix incorrect behavior, such a change to be accompanied by updating existing buggy tests or adding tests in cases there were none prior to the change.
          +The minimum required versions of the native components on which Hadoop depends
          +at compile time and/or runtime SHALL be considered
          +[Stable](./InterfaceClassification.html#Stable). Changes to the minimum
          +required versions MUST NOT increase between minor releases within a major
          +version.
          

          If any native dependency or its toolchain has a security vulnerability, then we're going to upgrade it. We may need to replace parts of this toolchain if its license changes, if it becomes obsolete, or if we want to add a library that requires upgrading other native dependencies.

          chris.douglas Chris Douglas added a comment -

          Filed HADOOP-14897

            People

            • Assignee: templedf Daniel Templeton
            • Reporter: kasha Karthik Kambatla