Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.0
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: documentation
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change
    • Release Note:
      Hide
      <!-- markdown -->
      * The release notes now only contains JIRA issues with incompatible changes and actual release notes. The generated format has been changed from HTML to markdown.

      * The changelog is now automatically generated from data stored in JIRA rather than manually maintained. The format has been changed from pure text to markdown as well as containing more of the information that was previously stored in the release notes.

      * In order to generate the changes file, python must be installed.

      * New -Preleasedocs profile added to maven in order to trigger this functionality.
      Show
      <!-- markdown --> * The release notes now only contains JIRA issues with incompatible changes and actual release notes. The generated format has been changed from HTML to markdown. * The changelog is now automatically generated from data stored in JIRA rather than manually maintained. The format has been changed from pure text to markdown as well as containing more of the information that was previously stored in the release notes. * In order to generate the changes file, python must be installed. * New -Preleasedocs profile added to maven in order to trigger this functionality.

      Description

      The current way we generate these build artifacts is awful. Plus they are ugly and, in the case of release notes, very hard to pick out what is important.

      1. HADOOP-11731-00.patch
        12 kB
        Allen Wittenauer
      2. HADOOP-11731-01.patch
        22 kB
        Allen Wittenauer
      3. HADOOP-11731-03.patch
        23 kB
        Allen Wittenauer
      4. HADOOP-11731-04.patch
        27 kB
        Allen Wittenauer
      5. HADOOP-11731-05.patch
        28 kB
        Allen Wittenauer
      6. HADOOP-11731-06.patch
        28 kB
        Allen Wittenauer
      7. HADOOP-11731-07.patch
        28 kB
        Allen Wittenauer

        Issue Links

          Activity

          Hide
          aw Allen Wittenauer added a comment -

          HADOOP-11791 has a patch to include the old versions in the appropriate place. When it is run again, --index should kick off and provide the necessary links.

          Show
          aw Allen Wittenauer added a comment - HADOOP-11791 has a patch to include the old versions in the appropriate place. When it is run again, --index should kick off and provide the necessary links.
          Hide
          aw Allen Wittenauer added a comment -
          Show
          aw Allen Wittenauer added a comment - BTW: https://github.com/aw-altiscale/eco-release-metadata is usually updated weekly.
          Hide
          aw Allen Wittenauer added a comment -

          Hmm, so I work mostly on HDFS and Common, so would somewhat prefer not seeing YARN and MR mixed in. If you feel very strongly I can do 4-in-1, but I'm also willing to do the work to separate them out.

          Yes, I feel very strongly about this one. Separating out the release notes into subprojects may be one of the most end user unfriendly things we do (#1 is the ridiculous number of incompatible changes). Again, as I pointed out above, the ONLY people who truly, really care about these things being separated out are the committers.... and, frankly, the reasoning there is void, given they should know how to build their own set of notes if they really are too lazy to grep them out.. Everyone else wants to see one file that has all the big changes aka a single, consolidated release note file.

          Do you have a sense for the state of JIRA fix versions? If it's just a matter of running the script and fixing the dates, easy. If more gardening is necessary, not so easy.

          I know that up to 2.6.0 is correct. But lint mode should tell you a lot about people doing things like assigning multiple versions to a JIRA.

          Show
          aw Allen Wittenauer added a comment - Hmm, so I work mostly on HDFS and Common, so would somewhat prefer not seeing YARN and MR mixed in. If you feel very strongly I can do 4-in-1, but I'm also willing to do the work to separate them out. Yes, I feel very strongly about this one. Separating out the release notes into subprojects may be one of the most end user unfriendly things we do (#1 is the ridiculous number of incompatible changes). Again, as I pointed out above, the ONLY people who truly, really care about these things being separated out are the committers.... and, frankly, the reasoning there is void, given they should know how to build their own set of notes if they really are too lazy to grep them out.. Everyone else wants to see one file that has all the big changes aka a single, consolidated release note file. Do you have a sense for the state of JIRA fix versions? If it's just a matter of running the script and fixing the dates, easy. If more gardening is necessary, not so easy. I know that up to 2.6.0 is correct. But lint mode should tell you a lot about people doing things like assigning multiple versions to a JIRA.
          Hide
          andrew.wang Andrew Wang added a comment -

          Cool. I'm going to start by pulling in the changes from the branch to trunk/branch-2, and do whatever additional development is required there. Might ping you for reviews.

          You only need it in common with the proper flags....

          Hmm, so I work mostly on HDFS and Common, so would somewhat prefer not seeing YARN and MR mixed in. If you feel very strongly I can do 4-in-1, but I'm also willing to do the work to separate them out.

          It's not about backwards, it's about forwards....

          We've kept the docs for branch-1 releases and I hope 2.7.x would have similar legs, but fair enough. Do you have a sense for the state of JIRA fix versions? If it's just a matter of running the script and fixing the dates, easy. If more gardening is necessary, not so easy.

          Show
          andrew.wang Andrew Wang added a comment - Cool. I'm going to start by pulling in the changes from the branch to trunk/branch-2, and do whatever additional development is required there. Might ping you for reviews. You only need it in common with the proper flags.... Hmm, so I work mostly on HDFS and Common, so would somewhat prefer not seeing YARN and MR mixed in. If you feel very strongly I can do 4-in-1, but I'm also willing to do the work to separate them out. It's not about backwards, it's about forwards.... We've kept the docs for branch-1 releases and I hope 2.7.x would have similar legs, but fair enough. Do you have a sense for the state of JIRA fix versions? If it's just a matter of running the script and fixing the dates, easy. If more gardening is necessary, not so easy.
          Hide
          aw Allen Wittenauer added a comment -

          Fix the -Preleasedocs profile (looks like we need the per-project logic? I only see it in common's pom right now)

          You only need it in common with the proper flags. The generated website points to the index which contains all of them. Separating them out is only useful to maybe some of the PMC and committers. Everyone else treats Hadoop as one package. Pretending otherwise is dumb.

          You mentioned the HowToRelease wiki instructions being incorrect, but I didn't catch what exactly was off. It says to close the JIRAs as the last step, which still seems okay.

          There's some other stuff, but I don't remember off the top of my head. It mainly had to do with the date reported in the generated notes vs. the tar ball and keeping them in sync.

          Regarding the index, are you recommending we run the script for historical releases?

          Yes.

          Since we still host the docs for prior releases, seems like people could just look at the old CHANGES.txt there.

          It's not about backwards, it's about forwards. At some point, (probably Hadoop 2.12 or Hadoop 2.13, say autumn 2017 given current pace), there will be no more releases on the website that still have a CHANGES.txt file. Then what? If we generate the historical data, the answer is obvious.

          Show
          aw Allen Wittenauer added a comment - Fix the -Preleasedocs profile (looks like we need the per-project logic? I only see it in common's pom right now) You only need it in common with the proper flags. The generated website points to the index which contains all of them. Separating them out is only useful to maybe some of the PMC and committers. Everyone else treats Hadoop as one package. Pretending otherwise is dumb. You mentioned the HowToRelease wiki instructions being incorrect, but I didn't catch what exactly was off. It says to close the JIRAs as the last step, which still seems okay. There's some other stuff, but I don't remember off the top of my head. It mainly had to do with the date reported in the generated notes vs. the tar ball and keeping them in sync. Regarding the index, are you recommending we run the script for historical releases? Yes. Since we still host the docs for prior releases, seems like people could just look at the old CHANGES.txt there. It's not about backwards, it's about forwards. At some point, (probably Hadoop 2.12 or Hadoop 2.13, say autumn 2017 given current pace), there will be no more releases on the website that still have a CHANGES.txt file. Then what? If we generate the historical data, the answer is obvious.
          Hide
          andrew.wang Andrew Wang added a comment -

          Thanks Allen. So if I understand you correctly, the remaining work is something like as follows:

          • Pull aforementioned changes from HADOOP-12111 branch to trunk/branch-2
          • Fix the -Preleasedocs profile (looks like we need the per-project logic? I only see it in common's pom right now)
          • Fix the create-release script (HADOOP-11793) and update the instructions telling RMs to run the lint mode.
          • You mentioned the HowToRelease wiki instructions being incorrect, but I didn't catch what exactly was off. It says to close the JIRAs as the last step, which still seems okay.

          Regarding the index, are you recommending we run the script for historical releases? Since we still host the docs for prior releases, seems like people could just look at the old CHANGES.txt there. I ask because I know you did a lot of JIRA gardening when testing the tool, and it sounded like we never fully reconciled JIRA and manually entered CHANGES.txt state.

          Please add anything else I might have missed, I'm willing to take the brunt of the work (though your help would of course be appreciated too ).

          Show
          andrew.wang Andrew Wang added a comment - Thanks Allen. So if I understand you correctly, the remaining work is something like as follows: Pull aforementioned changes from HADOOP-12111 branch to trunk/branch-2 Fix the -Preleasedocs profile (looks like we need the per-project logic? I only see it in common's pom right now) Fix the create-release script ( HADOOP-11793 ) and update the instructions telling RMs to run the lint mode. You mentioned the HowToRelease wiki instructions being incorrect, but I didn't catch what exactly was off. It says to close the JIRAs as the last step, which still seems okay. Regarding the index, are you recommending we run the script for historical releases? Since we still host the docs for prior releases, seems like people could just look at the old CHANGES.txt there. I ask because I know you did a lot of JIRA gardening when testing the tool, and it sounded like we never fully reconciled JIRA and manually entered CHANGES.txt state. Please add anything else I might have missed, I'm willing to take the brunt of the work (though your help would of course be appreciated too ).
          Hide
          aw Allen Wittenauer added a comment -

          You'll need some modifications to pom.xml to use the version from yetus. I wrote a patch somewhere, but haven't tested it very much. Also keep in mind that releasedocmaker was built from the perspective that there would be multiple versions in the release directory so that it could build an index to all of them.

          Show
          aw Allen Wittenauer added a comment - You'll need some modifications to pom.xml to use the version from yetus. I wrote a patch somewhere, but haven't tested it very much. Also keep in mind that releasedocmaker was built from the perspective that there would be multiple versions in the release directory so that it could build an index to all of them.
          Hide
          andrew.wang Andrew Wang added a comment -

          Aha, thanks for the heads up. I just pulled in HADOOP-11797 and HADOOP-11813.

          I just did some test cherry-picks of everything that touches releasedocmaker in the HADOOP-12111 branch, and they came back clean. Cool if I just pull them down to trunk/branch-2?

          Show
          andrew.wang Andrew Wang added a comment - Aha, thanks for the heads up. I just pulled in HADOOP-11797 and HADOOP-11813 . I just did some test cherry-picks of everything that touches releasedocmaker in the HADOOP-12111 branch, and they came back clean. Cool if I just pull them down to trunk/branch-2?
          Hide
          aw Allen Wittenauer added a comment -

          I hope you realize there was more than just this JIRA for trunk's version of releasedocmaker (which is also several patches behind the one sitting in the Yetus branch)....

          Show
          aw Allen Wittenauer added a comment - I hope you realize there was more than just this JIRA for trunk's version of releasedocmaker (which is also several patches behind the one sitting in the Yetus branch)....
          Hide
          andrew.wang Andrew Wang added a comment -

          I just backported this to branch-2, just a minor conflict in site.xml I fixed up. Thanks again to Allen for working on this.

          Show
          andrew.wang Andrew Wang added a comment - I just backported this to branch-2, just a minor conflict in site.xml I fixed up. Thanks again to Allen for working on this.
          Hide
          cmccabe Colin P. McCabe added a comment -

          Steve said: I don't think CHANGES.TXT works that well. We may think it does, but that's because without the tooling to validate it, stuff doesn't get added and so it can omit a lot of work. Then there's the problem of merging across branches, and dealing with race conditions/commit conflict between other people's work and yours.

          Absolutely. CHANGES.txt does not work that well for many reasons. It's often incorrect, as Allen pointed out, since it's entirely manually created. It creates many spurious merge conflicts.

          Tsz Wo Nicholas Sze wrote: Again, I do not oppose using the new tool. However, we do need a transition period to see if it indeed works well.

          I agree. However, given that we are going to be maintaining branch-2 for at least another 2 years (we don't even have a 3.x release roadmap yet), it seems like that transition period should be in the 2.8 timeframe. It would also be nice to move to this system for point releases. Otherwise, we end up doing CHANGES.txt anyway for backports.

          Show
          cmccabe Colin P. McCabe added a comment - Steve said: I don't think CHANGES.TXT works that well. We may think it does, but that's because without the tooling to validate it, stuff doesn't get added and so it can omit a lot of work. Then there's the problem of merging across branches, and dealing with race conditions/commit conflict between other people's work and yours. Absolutely. CHANGES.txt does not work that well for many reasons. It's often incorrect, as Allen pointed out, since it's entirely manually created. It creates many spurious merge conflicts. Tsz Wo Nicholas Sze wrote: Again, I do not oppose using the new tool. However, we do need a transition period to see if it indeed works well. I agree. However, given that we are going to be maintaining branch-2 for at least another 2 years (we don't even have a 3.x release roadmap yet), it seems like that transition period should be in the 2.8 timeframe. It would also be nice to move to this system for point releases. Otherwise, we end up doing CHANGES.txt anyway for backports.
          Hide
          aw Allen Wittenauer added a comment -

          Why don't we use the git commit message instead of JIRA summary?

          This has already been somewhat covered on the mailing list, but:

          a) because we have to hit JIRA for the other information anyway.
          b) parsing the commit logs is trickier than one thinks.

          The git commit message is supposed to be the same as the entry in CHAGNES.txt.

          Looking at https://wiki.apache.org/hadoop/HowToCommit, that isn't true. In the vast majority of cases, however, I'd place money that JIRA summary==CHANGES.txt entry==git commit log entry. But typos happen and these are much easier to fix in the JIRA summary by leaps and bounds.

          I disagree since the automation may have bugs or may not work at all.

          Then our release notes have been broken for over 2 years, since it's effectively the same code but with different formatting.


          At this point, this JIRA is beating a horse that's been long dead. The code is committed. If there are issues with the code, then file new JIRAs against it. I'm pretty much going to ignore any new messages here and continue working on getting the rest of the release process updated to take advantage of it.

          Show
          aw Allen Wittenauer added a comment - Why don't we use the git commit message instead of JIRA summary? This has already been somewhat covered on the mailing list, but: a) because we have to hit JIRA for the other information anyway. b) parsing the commit logs is trickier than one thinks. The git commit message is supposed to be the same as the entry in CHAGNES.txt. Looking at https://wiki.apache.org/hadoop/HowToCommit , that isn't true. In the vast majority of cases, however, I'd place money that JIRA summary==CHANGES.txt entry==git commit log entry. But typos happen and these are much easier to fix in the JIRA summary by leaps and bounds. I disagree since the automation may have bugs or may not work at all. Then our release notes have been broken for over 2 years, since it's effectively the same code but with different formatting. At this point, this JIRA is beating a horse that's been long dead. The code is committed. If there are issues with the code, then file new JIRAs against it. I'm pretty much going to ignore any new messages here and continue working on getting the rest of the release process updated to take advantage of it.
          Hide
          szetszwo Tsz Wo Nicholas Sze added a comment -

          Why don't we use the git commit message instead of JIRA summary? The git commit message is supposed to be the same as the entry in CHAGNES.txt.

          BTW, I guess the tool won't handle the case that there are multiple contributors if it takes JIRA assignee as the contributor. We should also retrieve the contributor list from the commit message.

          > ... Any effort put into updating those files is wasted effort given that it's automated now.

          I disagree since the automation may have bugs or may not work at all.

          Show
          szetszwo Tsz Wo Nicholas Sze added a comment - Why don't we use the git commit message instead of JIRA summary? The git commit message is supposed to be the same as the entry in CHAGNES.txt. BTW, I guess the tool won't handle the case that there are multiple contributors if it takes JIRA assignee as the contributor. We should also retrieve the contributor list from the commit message. > ... Any effort put into updating those files is wasted effort given that it's automated now. I disagree since the automation may have bugs or may not work at all.
          Hide
          aw Allen Wittenauer added a comment -

          I've created HADOOP-11807 to add a lint mode.

          we do need a transition period to see if it indeed works well.

          ... which is a good reason to target trunk rather than branch-2. trunks' changes.txt files are very very wrong. Any effort put into updating those files is wasted effort given that it's automated now.

          Show
          aw Allen Wittenauer added a comment - I've created HADOOP-11807 to add a lint mode. we do need a transition period to see if it indeed works well. ... which is a good reason to target trunk rather than branch-2. trunks' changes.txt files are very very wrong. Any effort put into updating those files is wasted effort given that it's automated now.
          Hide
          szetszwo Tsz Wo Nicholas Sze added a comment -

          Again, I do not oppose using the new tool. However, we do need a transition period to see if it indeed works well.

          Show
          szetszwo Tsz Wo Nicholas Sze added a comment - Again, I do not oppose using the new tool. However, we do need a transition period to see if it indeed works well.
          Hide
          stevel@apache.org Steve Loughran added a comment -

          I don't think CHANGES.TXT works that well. We may think it does, but that's because without the tooling to validate it, stuff doesn't get added and so it can omit a lot of work. Then there's the problem of merging across branches, and dealing with race conditions/commit conflict between other people's work and yours.

          automated generation not only gives us the change list, it adds the option of generating a hyperlinked version where you can actually click through the JIRAs.

          I do recognise the concerns about JIRA naming and tagging —but that's a process thing that is correctable. If we can put the effort in to maintaining CHANGES.TXT, we can do it for JIRAs. Many use cases will be easier. For example commit something to trunk and then backport it, today, is:

          1. pull trunk
          2. apply patch to trunk
          3. edit changes.txt
          4. commit
          5. attempt to push
          6. if unlucky: fix CHANGES.TXT conflict & commit, push.
          7. close JIRA

          The backport workflow becomes

          1. switch to branch-2
          2. pull
          3. apply the code bits of the patch
          4. edit changes.txt
          5. commit
          6. go to trunk
          7. fix changes.txt
          8. push
          9. remember to edit JIRA

          Forward porting is easier: edit & commit branch-2, cherry pick the patch, push both

          Now think of a JIRA-only workflow

          1. apply patch to trunk
          2. commit
          3. close in JIRA with 3.0.0 version

          you'll only get commit conflict if someone patched the source files, so commits are more likely to go through.

          backporting

          1. cherry pick patch
          2. commit & push
          3. in JIRA, change fix version.

          It's backporting & cross version code where this stuff really excels.

          Show
          stevel@apache.org Steve Loughran added a comment - I don't think CHANGES.TXT works that well. We may think it does, but that's because without the tooling to validate it, stuff doesn't get added and so it can omit a lot of work. Then there's the problem of merging across branches, and dealing with race conditions/commit conflict between other people's work and yours. automated generation not only gives us the change list, it adds the option of generating a hyperlinked version where you can actually click through the JIRAs. I do recognise the concerns about JIRA naming and tagging —but that's a process thing that is correctable. If we can put the effort in to maintaining CHANGES.TXT, we can do it for JIRAs. Many use cases will be easier. For example commit something to trunk and then backport it, today, is: pull trunk apply patch to trunk edit changes.txt commit attempt to push if unlucky: fix CHANGES.TXT conflict & commit, push. close JIRA The backport workflow becomes switch to branch-2 pull apply the code bits of the patch edit changes.txt commit go to trunk fix changes.txt push remember to edit JIRA Forward porting is easier: edit & commit branch-2, cherry pick the patch, push both Now think of a JIRA-only workflow apply patch to trunk commit close in JIRA with 3.0.0 version you'll only get commit conflict if someone patched the source files, so commits are more likely to go through. backporting cherry pick patch commit & push in JIRA, change fix version. It's backporting & cross version code where this stuff really excels.
          Hide
          chris.douglas Chris Douglas added a comment -

          An example of JIRA information being inaccurate is that the assignee is missing...

          Some of these inconsistencies are easier to spot with the tool, and fix both JIRA and the release notes. Allen Wittenauer, would it be difficult to add a report for cases where the assignee is missing, incompatible changes don't have release notes, etc? I see it prints a warning for the latter, but a lint-style report could help RMs fixup JIRA as part of rolling a release.

          The single line summary at the top of a JIRA isn't what you are putting in CHANGES.txt?

          I sometimes add more detail in the commit message, but if committers get used to setting the summary and release notes this seems workable.

          Show
          chris.douglas Chris Douglas added a comment - An example of JIRA information being inaccurate is that the assignee is missing... Some of these inconsistencies are easier to spot with the tool, and fix both JIRA and the release notes. Allen Wittenauer , would it be difficult to add a report for cases where the assignee is missing, incompatible changes don't have release notes, etc? I see it prints a warning for the latter, but a lint-style report could help RMs fixup JIRA as part of rolling a release. The single line summary at the top of a JIRA isn't what you are putting in CHANGES.txt? I sometimes add more detail in the commit message, but if committers get used to setting the summary and release notes this seems workable.
          Hide
          aw Allen Wittenauer added a comment -

          We won't update it since CHANGES.txt and jira summary are not supposed to be the same.

          Wait, what? The single line summary at the top of a JIRA isn't what you are putting in CHANGES.txt? I think you might be the only person who isn't using the summary line for the changes.txt that who any sort of semi-regular commits.

          The entry in CHANGES.txt is a concise statement about the change committed.

          In other words, exactly what the JIRA summary is supposed to be as well, but used prior to commit.

          Show
          aw Allen Wittenauer added a comment - We won't update it since CHANGES.txt and jira summary are not supposed to be the same. Wait, what? The single line summary at the top of a JIRA isn't what you are putting in CHANGES.txt? I think you might be the only person who isn't using the summary line for the changes.txt that who any sort of semi-regular commits. The entry in CHANGES.txt is a concise statement about the change committed. In other words, exactly what the JIRA summary is supposed to be as well, but used prior to commit.
          Hide
          szetszwo Tsz Wo Nicholas Sze added a comment -

          > ... the changes.txt gets updated when it's incorrect vs the jira summary

          We won't update it since CHANGES.txt and jira summary are not supposed to be the same.

          > the generated results provide links to the jira, providing easy click history to see what has happened.

          The entry in CHANGES.txt is a concise statement about the change committed. It takes much more time to understand what was going to read the JIRA itself, especially, when there was a long discussion.

          Show
          szetszwo Tsz Wo Nicholas Sze added a comment - > ... the changes.txt gets updated when it's incorrect vs the jira summary We won't update it since CHANGES.txt and jira summary are not supposed to be the same. > the generated results provide links to the jira, providing easy click history to see what has happened. The entry in CHANGES.txt is a concise statement about the change committed. It takes much more time to understand what was going to read the JIRA itself, especially, when there was a long discussion.
          Hide
          aw Allen Wittenauer added a comment -

          Have you compared the current CHANGES.txt with the generated change log? I beg the diff is long.

          The summary lines no. The content that matters? Very much so. I spent about a man month repairing fix version information for all of branch-2 and trunk. To the point that I know that this statement is patently false:

          However, it worked well in the past for many releases.

          I'd argue it has failed tremendously in past releases.

          • the number of incompatible changes that have been committed as bug fixes or what not with zero release notes to warn the user that we've screwed them over.
          • CHANGES.txt is missing hundreds of commits in both branch-2 and trunk. The number of JIRAs that needed to have 2.7.0 added to the fixversion vs. autogenerated was 3.
          • Right now, branch-2 and 2.7.0 changes.txt files even disagree about what patches are in 2.7.0. That's not success at all.

          The concern about the summary line is sort of moot since:

          a) the reverse is also true: it's rare that the changes.txt gets updated when it's incorrect vs the jira summary
          b) the generated results provide links to the jira, providing easy click history to see what has happened.

          Show
          aw Allen Wittenauer added a comment - Have you compared the current CHANGES.txt with the generated change log? I beg the diff is long. The summary lines no. The content that matters? Very much so. I spent about a man month repairing fix version information for all of branch-2 and trunk. To the point that I know that this statement is patently false: However, it worked well in the past for many releases. I'd argue it has failed tremendously in past releases. the number of incompatible changes that have been committed as bug fixes or what not with zero release notes to warn the user that we've screwed them over. CHANGES.txt is missing hundreds of commits in both branch-2 and trunk. The number of JIRAs that needed to have 2.7.0 added to the fixversion vs. autogenerated was 3 . Right now, branch-2 and 2.7.0 changes.txt files even disagree about what patches are in 2.7.0. That's not success at all. The concern about the summary line is sort of moot since: a) the reverse is also true: it's rare that the changes.txt gets updated when it's incorrect vs the jira summary b) the generated results provide links to the jira, providing easy click history to see what has happened.
          Hide
          szetszwo Tsz Wo Nicholas Sze added a comment -

          An example of JIRA information being inaccurate is that the assignee is missing in this JIRA HADOOP-11731. From CHANGES.txt, we see that aw has worked on it.

          //hadoop-common-project/hadoop-common/CHANGES.txt 
              HADOOP-11731. Rework the changelog and releasenotes (aw)
          
          Show
          szetszwo Tsz Wo Nicholas Sze added a comment - An example of JIRA information being inaccurate is that the assignee is missing in this JIRA HADOOP-11731 . From CHANGES.txt, we see that aw has worked on it. //hadoop-common-project/hadoop-common/CHANGES.txt HADOOP-11731. Rework the changelog and releasenotes (aw)
          Hide
          szetszwo Tsz Wo Nicholas Sze added a comment -

          Generating change log from JIRA is a good idea. It bases on an assumption that each JIRA has an accurate summary (a.k.a. JIRA title) to reflect the committed change. Unfortunately, the assumption is invalid for many cases since we never enforce that the JIRA summary must be the same as the change log. Have you compared the current CHANGES.txt with the generated change log? I beg the diff is long.

          Besides, after a release R1 is out, someone may (accidentally or intentionally) modify the JIRA summary. Then, the entry for the same item in a later release R2 could be different from the one in R1.

          Yet another concern is that non-committers can add/edit JIRA summary but only committers could modify CHANGES.txt.

          I agree that manually editing CHANGES.txt is not a perfect solution. However, it worked well in the past for many releases. I suggest we keep the current dev workflow. Try using the new script provided here to generate the next release. If everything works well, we shell remove CHANGES.txt and revise the dev workflow. What do you think?

          Show
          szetszwo Tsz Wo Nicholas Sze added a comment - Generating change log from JIRA is a good idea. It bases on an assumption that each JIRA has an accurate summary (a.k.a. JIRA title) to reflect the committed change. Unfortunately, the assumption is invalid for many cases since we never enforce that the JIRA summary must be the same as the change log. Have you compared the current CHANGES.txt with the generated change log? I beg the diff is long. Besides, after a release R1 is out, someone may (accidentally or intentionally) modify the JIRA summary. Then, the entry for the same item in a later release R2 could be different from the one in R1. Yet another concern is that non-committers can add/edit JIRA summary but only committers could modify CHANGES.txt. I agree that manually editing CHANGES.txt is not a perfect solution. However, it worked well in the past for many releases. I suggest we keep the current dev workflow. Try using the new script provided here to generate the next release. If everything works well, we shell remove CHANGES.txt and revise the dev workflow. What do you think?
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #2101 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2101/)
          HADOOP-11731. Rework the changelog and releasenotes (aw) (aw: rev f383fd9b6caf4557613250c5c218b1a1b65a212b)

          • hadoop-common-project/hadoop-common/pom.xml
          • BUILDING.txt
          • dev-support/releasedocmaker.py
          • hadoop-common-project/hadoop-common/CHANGES.txt
          • hadoop-project/src/site/site.xml
          • dev-support/relnotes.py
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Mapreduce-trunk #2101 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2101/ ) HADOOP-11731 . Rework the changelog and releasenotes (aw) (aw: rev f383fd9b6caf4557613250c5c218b1a1b65a212b) hadoop-common-project/hadoop-common/pom.xml BUILDING.txt dev-support/releasedocmaker.py hadoop-common-project/hadoop-common/CHANGES.txt hadoop-project/src/site/site.xml dev-support/relnotes.py
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #151 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/151/)
          HADOOP-11731. Rework the changelog and releasenotes (aw) (aw: rev f383fd9b6caf4557613250c5c218b1a1b65a212b)

          • dev-support/relnotes.py
          • hadoop-project/src/site/site.xml
          • hadoop-common-project/hadoop-common/pom.xml
          • dev-support/releasedocmaker.py
          • hadoop-common-project/hadoop-common/CHANGES.txt
          • BUILDING.txt
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #151 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/151/ ) HADOOP-11731 . Rework the changelog and releasenotes (aw) (aw: rev f383fd9b6caf4557613250c5c218b1a1b65a212b) dev-support/relnotes.py hadoop-project/src/site/site.xml hadoop-common-project/hadoop-common/pom.xml dev-support/releasedocmaker.py hadoop-common-project/hadoop-common/CHANGES.txt BUILDING.txt
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #142 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/142/)
          HADOOP-11731. Rework the changelog and releasenotes (aw) (aw: rev f383fd9b6caf4557613250c5c218b1a1b65a212b)

          • BUILDING.txt
          • hadoop-common-project/hadoop-common/CHANGES.txt
          • dev-support/relnotes.py
          • dev-support/releasedocmaker.py
          • hadoop-common-project/hadoop-common/pom.xml
          • hadoop-project/src/site/site.xml
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #142 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/142/ ) HADOOP-11731 . Rework the changelog and releasenotes (aw) (aw: rev f383fd9b6caf4557613250c5c218b1a1b65a212b) BUILDING.txt hadoop-common-project/hadoop-common/CHANGES.txt dev-support/relnotes.py dev-support/releasedocmaker.py hadoop-common-project/hadoop-common/pom.xml hadoop-project/src/site/site.xml
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2083 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2083/)
          HADOOP-11731. Rework the changelog and releasenotes (aw) (aw: rev f383fd9b6caf4557613250c5c218b1a1b65a212b)

          • dev-support/relnotes.py
          • BUILDING.txt
          • hadoop-common-project/hadoop-common/pom.xml
          • hadoop-common-project/hadoop-common/CHANGES.txt
          • hadoop-project/src/site/site.xml
          • dev-support/releasedocmaker.py
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Hdfs-trunk #2083 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2083/ ) HADOOP-11731 . Rework the changelog and releasenotes (aw) (aw: rev f383fd9b6caf4557613250c5c218b1a1b65a212b) dev-support/relnotes.py BUILDING.txt hadoop-common-project/hadoop-common/pom.xml hadoop-common-project/hadoop-common/CHANGES.txt hadoop-project/src/site/site.xml dev-support/releasedocmaker.py
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #151 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/151/)
          HADOOP-11731. Rework the changelog and releasenotes (aw) (aw: rev f383fd9b6caf4557613250c5c218b1a1b65a212b)

          • hadoop-common-project/hadoop-common/CHANGES.txt
          • dev-support/relnotes.py
          • hadoop-common-project/hadoop-common/pom.xml
          • BUILDING.txt
          • hadoop-project/src/site/site.xml
          • dev-support/releasedocmaker.py
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #151 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/151/ ) HADOOP-11731 . Rework the changelog and releasenotes (aw) (aw: rev f383fd9b6caf4557613250c5c218b1a1b65a212b) hadoop-common-project/hadoop-common/CHANGES.txt dev-support/relnotes.py hadoop-common-project/hadoop-common/pom.xml BUILDING.txt hadoop-project/src/site/site.xml dev-support/releasedocmaker.py
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #885 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/885/)
          HADOOP-11731. Rework the changelog and releasenotes (aw) (aw: rev f383fd9b6caf4557613250c5c218b1a1b65a212b)

          • hadoop-common-project/hadoop-common/pom.xml
          • hadoop-project/src/site/site.xml
          • hadoop-common-project/hadoop-common/CHANGES.txt
          • BUILDING.txt
          • dev-support/relnotes.py
          • dev-support/releasedocmaker.py
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Yarn-trunk #885 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/885/ ) HADOOP-11731 . Rework the changelog and releasenotes (aw) (aw: rev f383fd9b6caf4557613250c5c218b1a1b65a212b) hadoop-common-project/hadoop-common/pom.xml hadoop-project/src/site/site.xml hadoop-common-project/hadoop-common/CHANGES.txt BUILDING.txt dev-support/relnotes.py dev-support/releasedocmaker.py
          Hide
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-trunk-Commit #7490 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7490/)
          HADOOP-11731. Rework the changelog and releasenotes (aw) (aw: rev f383fd9b6caf4557613250c5c218b1a1b65a212b)

          • hadoop-project/src/site/site.xml
          • dev-support/releasedocmaker.py
          • hadoop-common-project/hadoop-common/pom.xml
          • hadoop-common-project/hadoop-common/CHANGES.txt
          • BUILDING.txt
          • dev-support/relnotes.py
          Show
          hudson Hudson added a comment - SUCCESS: Integrated in Hadoop-trunk-Commit #7490 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7490/ ) HADOOP-11731 . Rework the changelog and releasenotes (aw) (aw: rev f383fd9b6caf4557613250c5c218b1a1b65a212b) hadoop-project/src/site/site.xml dev-support/releasedocmaker.py hadoop-common-project/hadoop-common/pom.xml hadoop-common-project/hadoop-common/CHANGES.txt BUILDING.txt dev-support/relnotes.py
          Hide
          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12708858/HADOOP-11731-07.patch
          against trunk revision 4d14816.

          +1 @author. The patch does not contain any @author tags.

          +0 tests included. The patch appears to be a documentation patch that doesn't require tests.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common:

          org.apache.hadoop.ipc.TestRPCWaitForProxy

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/6047//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/6047//console

          This message is automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12708858/HADOOP-11731-07.patch against trunk revision 4d14816. +1 @author . The patch does not contain any @author tags. +0 tests included . The patch appears to be a documentation patch that doesn't require tests. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. -1 core tests . The patch failed these unit tests in hadoop-common-project/hadoop-common: org.apache.hadoop.ipc.TestRPCWaitForProxy Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/6047//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/6047//console This message is automatically generated.
          Hide
          aw Allen Wittenauer added a comment -

          I've committed this to trunk.

          Thanks everyone.

          Show
          aw Allen Wittenauer added a comment - I've committed this to trunk. Thanks everyone.
          Hide
          aw Allen Wittenauer added a comment -

          Ha, sorry, we crossed each other.

          I'll see what it takes to fix the Jenkins job. I want to poke around Jenkins for HADOOP-11746 anyway.

          Thanks everyone.

          Show
          aw Allen Wittenauer added a comment - Ha, sorry, we crossed each other. I'll see what it takes to fix the Jenkins job. I want to poke around Jenkins for HADOOP-11746 anyway. Thanks everyone.
          Hide
          aw Allen Wittenauer added a comment -

          -07:

          • Chris Douglas recommended changing the 'Description' heading to 'Summary' to reflect what that field actually is in JIRA.

          Colin P. McCabe, et. al. if you have no other comments, I'll commit this (w/subtasks turned on) tomorrow morning given the +1. I'll follow it up with a message to *-dev and open other JIRAs to finish up moving to automated changes.

          Show
          aw Allen Wittenauer added a comment - -07: Chris Douglas recommended changing the 'Description' heading to 'Summary' to reflect what that field actually is in JIRA. Colin P. McCabe , et. al. if you have no other comments, I'll commit this (w/subtasks turned on) tomorrow morning given the +1. I'll follow it up with a message to *-dev and open other JIRAs to finish up moving to automated changes.
          Hide
          andrew.wang Andrew Wang added a comment -

          I think we should list the subtasks, since they're JIRAs too.

          I'll add that I looked at an earlier version of Allen's script and it looked basically good. If Colin's happy too, let's just get it in and fix the rest later.

          Allen, do you mind also updating the release Jenkins job (https://builds.apache.org/job/HADOOP2_Release_Artifacts_Builder/) so it also runs this script?

          Show
          andrew.wang Andrew Wang added a comment - I think we should list the subtasks, since they're JIRAs too. I'll add that I looked at an earlier version of Allen's script and it looked basically good. If Colin's happy too, let's just get it in and fix the rest later. Allen, do you mind also updating the release Jenkins job ( https://builds.apache.org/job/HADOOP2_Release_Artifacts_Builder/ ) so it also runs this script?
          Hide
          cmccabe Colin P. McCabe added a comment -

          I think we should support subtasks for now. If we later decide they aren't important we can make it optional in the script.

          I actually am completely neutral on whether we should have release notes for multiple releases in one file (or changes for CHANGES.txt). I think if we want to change the format for upcoming releases, we should discuss that on the mailing list first. However, that issue shouldn't hold up this patch. It's easy enough to create a file with release notes for multiple releases by cat'ing together files, or passing multiple revisions on the relnotes.py command line.

          This is certainly a big improvement on what came before, so +1.

          Show
          cmccabe Colin P. McCabe added a comment - I think we should support subtasks for now. If we later decide they aren't important we can make it optional in the script. I actually am completely neutral on whether we should have release notes for multiple releases in one file (or changes for CHANGES.txt). I think if we want to change the format for upcoming releases, we should discuss that on the mailing list first. However, that issue shouldn't hold up this patch. It's easy enough to create a file with release notes for multiple releases by cat'ing together files, or passing multiple revisions on the relnotes.py command line. This is certainly a big improvement on what came before, so +1.
          Hide
          aw Allen Wittenauer added a comment -

          So is this ready to go in?

          Which do we prefer, with all subtasks listed or without?

          Show
          aw Allen Wittenauer added a comment - So is this ready to go in? Which do we prefer, with all subtasks listed or without?
          Hide
          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12708303/HADOOP-11731-06.patch
          against trunk revision cce66ba.

          +1 @author. The patch does not contain any @author tags.

          +0 tests included. The patch appears to be a documentation patch that doesn't require tests.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common:

          org.apache.hadoop.ipc.TestRPCWaitForProxy

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/6028//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/6028//console

          This message is automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12708303/HADOOP-11731-06.patch against trunk revision cce66ba. +1 @author . The patch does not contain any @author tags. +0 tests included . The patch appears to be a documentation patch that doesn't require tests. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. -1 core tests . The patch failed these unit tests in hadoop-common-project/hadoop-common: org.apache.hadoop.ipc.TestRPCWaitForProxy Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/6028//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/6028//console This message is automatically generated.
          Hide
          aw Allen Wittenauer added a comment -

          -06:

          • subtasks section which can easily 2x-3x the changes file
          • removed the decrement code in versions
          Show
          aw Allen Wittenauer added a comment - -06: subtasks section which can easily 2x-3x the changes file removed the decrement code in versions
          Hide
          aw Allen Wittenauer added a comment -

          If I turn on subtasks, then the missing JIRAs from branch-2.7 vs. the generated report is only HADOOP-6857 (and if you read the JIRA, you can see why). Meanwhile, branch-2.7's changes.txt is missing 259 JIRAs, again including subtasks.

          So yea, this is pretty much what I expected: the changes.txt file is completely out of sync with JIRA, with JIRA generally being more accurate.

          Show
          aw Allen Wittenauer added a comment - If I turn on subtasks, then the missing JIRAs from branch-2.7 vs. the generated report is only HADOOP-6857 (and if you read the JIRA, you can see why). Meanwhile, branch-2.7's changes.txt is missing 259 JIRAs, again including subtasks. So yea, this is pretty much what I expected: the changes.txt file is completely out of sync with JIRA, with JIRA generally being more accurate.
          Hide
          aw Allen Wittenauer added a comment -

          Actually, I compared against the 2.7 changes.txt entries listed in branch-2. A quick manual check shows that people aren't updating branch-2 when they change the branch-2.7 changes.txt. Neat! I'll run another analysis here in a bit. But I doubt I'm going to find much different.

          Show
          aw Allen Wittenauer added a comment - Actually, I compared against the 2.7 changes.txt entries listed in branch-2. A quick manual check shows that people aren't updating branch-2 when they change the branch-2.7 changes.txt. Neat! I'll run another analysis here in a bit. But I doubt I'm going to find much different.
          Hide
          aw Allen Wittenauer added a comment -

          So, I went through 2.7. The new code doesn't list subtasks, so of course they weren't in there. The other missing bits were mislabeled JIRAs. Interestingly, the autocode was much more correct than the HDFS and YARN changes.txt files after those are taken into account. (By quite a bit, actually.)

          cc: Vinod Kumar Vavilapalli, just in case he cares about the questionable changes.txt files in the 2.7.0 branch.

          Show
          aw Allen Wittenauer added a comment - So, I went through 2.7. The new code doesn't list subtasks, so of course they weren't in there. The other missing bits were mislabeled JIRAs. Interestingly, the autocode was much more correct than the HDFS and YARN changes.txt files after those are taken into account. (By quite a bit, actually.) cc: Vinod Kumar Vavilapalli , just in case he cares about the questionable changes.txt files in the 2.7.0 branch.
          Hide
          aw Allen Wittenauer added a comment - - edited

          That isn't quite how the current relnotes.py code works. When you provide a previousVer or let it autodecrement to figure out what the previous release was, it puts at the top of the file "Changes since release [previous version]:" ... and that's it. It doesn't do anything fancy like print a separate section for every release it can calculate.

          This code base doesn't even do that. It jettisons the whole "since release x" thing. I take the view that our changes files is way too large on a per release basis vs. what we did pre-1.x. Putting multiple releases in a single file is too much information at one time.

          So if that line is removed and we don't need to know what the previous release was, there's no reason to even try to parse the version number, except for maybe sorting purposes. It really is mostly useless code.

          (FWIW: This does mean that x.0.0 releases present some special challenges for both the old and new code. If 2.8.0 is declared dead, then when 3.0.0 is built, it needs to get passed both 3.0.0 and 2.8.0 to merge all the changes together. Removing 2.8.0 from JIRA doesn't work because branch-2 does have those changes committed. The only other way to do this is to set the fix version to include both, similarly to how one would do it for multiple, concurrent releases. Luckily, this should only happen a few times, at the most, and are clearly an edge case.)

          Yes, I've spent too much time studying and thinking about these issues over the past few months. lol

          Show
          aw Allen Wittenauer added a comment - - edited That isn't quite how the current relnotes.py code works. When you provide a previousVer or let it autodecrement to figure out what the previous release was, it puts at the top of the file "Changes since release [previous version] :" ... and that's it. It doesn't do anything fancy like print a separate section for every release it can calculate. This code base doesn't even do that. It jettisons the whole "since release x" thing. I take the view that our changes files is way too large on a per release basis vs. what we did pre-1.x. Putting multiple releases in a single file is too much information at one time. So if that line is removed and we don't need to know what the previous release was, there's no reason to even try to parse the version number, except for maybe sorting purposes. It really is mostly useless code. (FWIW: This does mean that x.0.0 releases present some special challenges for both the old and new code. If 2.8.0 is declared dead, then when 3.0.0 is built, it needs to get passed both 3.0.0 and 2.8.0 to merge all the changes together. Removing 2.8.0 from JIRA doesn't work because branch-2 does have those changes committed. The only other way to do this is to set the fix version to include both, similarly to how one would do it for multiple, concurrent releases. Luckily, this should only happen a few times, at the most, and are clearly an edge case.) Yes, I've spent too much time studying and thinking about these issues over the past few months. lol
          Hide
          cmccabe Colin P. McCabe added a comment -

          I agree that recently we haven't done as many point releases as we should (or as many releases of any kind, for that matter). But we have done them quite often in the past... see 2.5.1, 2.5.2, 2.4.1. Code to parse point releases doesn't seem like dead code to me. On the other hand, code to parse "branch-1-win" and "trunk-win" does seem like deadcode since those are no longer being developed (and really, we never do Apache releases from feature branches anyway).

          I think people would be upset if we removed CHANGES.txt and there were big discrepancies between the existing CHANGES.txt and what this script generated. So I guess we will have to hold off on the removal until we can verify that. But I think committing the script makes sense, so I am +1 on that.

          Show
          cmccabe Colin P. McCabe added a comment - I agree that recently we haven't done as many point releases as we should (or as many releases of any kind, for that matter). But we have done them quite often in the past... see 2.5.1, 2.5.2, 2.4.1. Code to parse point releases doesn't seem like dead code to me. On the other hand, code to parse "branch-1-win" and "trunk-win" does seem like deadcode since those are no longer being developed (and really, we never do Apache releases from feature branches anyway). I think people would be upset if we removed CHANGES.txt and there were big discrepancies between the existing CHANGES.txt and what this script generated. So I guess we will have to hold off on the removal until we can verify that. But I think committing the script makes sense, so I am +1 on that.
          Hide
          aw Allen Wittenauer added a comment -

          Do you think it would be helpful to have a log message about the versions that can't be parsed?

          Not really. In fact, that's mostly dead code now. It was previously used to try and guess what the previous release was. Needless to say, it was a broken concept since we rarely do point releases. When I'm less distracted with other things, maybe I'll remove it.

          Did you confirm that the CHANGES.txt generated for 2.7 is the same as the current one in branch-2.7?

          I haven't, mainly because I don't spend any time in branch-2. I'm not sure I'd trust CHANGES.TXT in any branch really, given the number of mistakes I found in trunk's file and some of the other changes.txt files I did look at. (3-way comparison, jira vs. git vs. file, and even then I'm sure I didn't find everything wrong).

          Show
          aw Allen Wittenauer added a comment - Do you think it would be helpful to have a log message about the versions that can't be parsed? Not really. In fact, that's mostly dead code now. It was previously used to try and guess what the previous release was. Needless to say, it was a broken concept since we rarely do point releases. When I'm less distracted with other things, maybe I'll remove it. Did you confirm that the CHANGES.txt generated for 2.7 is the same as the current one in branch-2.7? I haven't, mainly because I don't spend any time in branch-2. I'm not sure I'd trust CHANGES.TXT in any branch really, given the number of mistakes I found in trunk's file and some of the other changes.txt files I did look at. (3-way comparison, jira vs. git vs. file, and even then I'm sure I didn't find everything wrong).
          Hide
          cmccabe Colin P. McCabe added a comment -

          Thanks, Allen. This looks good.

          Nope, otherwise releasedocmaker.py --version trunk-win would fail.

          OK. I guess we can revisit this later if it becomes an issue. Do you think it would be helpful to have a log message about the versions that can't be parsed?

          Did you confirm that the CHANGES.txt generated for 2.7 is the same as the current one in branch-2.7? +1 once that is confirmed

          We may have to fiddle with some things in JIRA if anything was incorrectly resolved (with the wrong version or something)

          Show
          cmccabe Colin P. McCabe added a comment - Thanks, Allen. This looks good. Nope, otherwise releasedocmaker.py --version trunk-win would fail. OK. I guess we can revisit this later if it becomes an issue. Do you think it would be helpful to have a log message about the versions that can't be parsed? Did you confirm that the CHANGES.txt generated for 2.7 is the same as the current one in branch-2.7? +1 once that is confirmed We may have to fiddle with some things in JIRA if anything was incorrectly resolved (with the wrong version or something)
          Hide
          hadoopqa Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12707952/HADOOP-11731-05.patch
          against trunk revision 3836ad6.

          +1 @author. The patch does not contain any @author tags.

          +0 tests included. The patch appears to be a documentation patch that doesn't require tests.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/6019//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/6019//console

          This message is automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12707952/HADOOP-11731-05.patch against trunk revision 3836ad6. +1 @author . The patch does not contain any @author tags. +0 tests included . The patch appears to be a documentation patch that doesn't require tests. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-common-project/hadoop-common. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/6019//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/6019//console This message is automatically generated.
          Hide
          aw Allen Wittenauer added a comment - - edited

          -05:

          • fix Colin's issues
          • reverse sort the index so newer on top
          • fix a few issues where some char weren't properly escaped which caused doxia to blow up on some of the older releases of Hadoop
          • change clean -Preleasedocs to remove the entire directory not just the files
          • rebase after HADOOP-11553 got committed
          Show
          aw Allen Wittenauer added a comment - - edited -05: fix Colin's issues reverse sort the index so newer on top fix a few issues where some char weren't properly escaped which caused doxia to blow up on some of the older releases of Hadoop change clean -Preleasedocs to remove the entire directory not just the files rebase after HADOOP-11553 got committed
          Hide
          aw Allen Wittenauer added a comment -

          Should we throw an exception if we can't parse the version?

          Nope, otherwise releasedocmaker.py --version trunk-win would fail.

          Show
          aw Allen Wittenauer added a comment - Should we throw an exception if we can't parse the version? Nope, otherwise releasedocmaker.py --version trunk-win would fail.
          Hide
          cmccabe Colin P. McCabe added a comment -

          Thank you for tackling this, Allen. It looks good.

          1	#!/usr/bin/python
          

          Should be #!/usr/bin/env python?

          2	#   Licensed under the Apache License, Version 2.0 (the "License");
          3	#   you may not use this file except in compliance with the License.
          4	#   You may obtain a copy of the License at
          5	#
          6	#       http://www.apache.org/licenses/LICENSE-2.0
          7	#
          8	#   Unless required by applicable law or agreed to in writing, software
          9	#   distributed under the License is distributed on an "AS IS" BASIS,
          10	#   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
          11	#   See the License for the specific language governing permissions and
          12	#   limitations under the License.
          

          I realize that you are just copying this from the previous relnotes.py, but we should fix this to match our other license headers. If you look at determine-flaky-tests-hadoop.py, you can see its header is:

          # Licensed to the Apache Software Foundation (ASF) under one
          # or more contributor license agreements.  See the NOTICE file
          # distributed with this work for additional information
          # regarding copyright ownership.  The ASF licenses this file
          # to you under the Apache License, Version 2.0 (the
          # "License"); you may not use this file except in compliance
          # with the License.  You may obtain a copy of the License at
          #
          #     http://www.apache.org/licenses/LICENSE-2.0
          #
          # Unless required by applicable law or agreed to in writing, software
          # distributed under the License is distributed on an "AS IS" BASIS,
          # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
          # See the License for the specific language governing permissions and
          # limitations under the License.
          

          The text about the NOTICE file is missing from releasedocmaker.py.

          296	def main():
          297	  parser = OptionParser(usage="usage: %prog --version VERSION [--version VERSION2 ...]")
          298	  parser.add_option("-v", "--version", dest="versions",
          299	             action="append", type="string",
          300	             help="versions in JIRA to include in releasenotes", metavar="VERSION")
          301	  parser.add_option("-m","--master", dest="master", action="store_true",
          302	             help="only create the master files")
          303	  parser.add_option("-i","--index", dest="index", action="store_true",
          304	             help="build an index file")
          

          Can you add a note to the usage message about which files are generated by this script, and what their names will be? Also where the files will be generated

          80	    found = re.match('^((\d+)(\.\d+)*).*$', data)
          81	    if (found):
          82	      self.parts = [ int(p) for p in found.group(1).split('.') ]
          83	    else:
          84	      self.parts = []
          

          Should we throw an exception if we can't parse the version?

          28	def clean(str):
          29	  return clean(re.sub(namePattern, "", str))
          30	
          31	def formatComponents(str):
          32	  str = re.sub(namePattern, '', str).replace("'", "")
          33	  if str != "":
          34	    ret = str
          35	  else:
          36	    # some markdown parsers don't like empty tables
          37	    ret = "."
          38	  return clean(ret)
          39	
          40	def lessclean(str):
          41	  str=str.encode('utf-8')
          42	  str=str.replace("_","\_")
          43	  str=str.replace("\r","")
          44	  str=str.rstrip()
          45	  return str
          46	
          47	def clean(str):
          48	  str=lessclean(str)
          49	  str=str.replace("|","\|")
          50	  str=str.rstrip()
          

          I find this a bit confusing. Can we call the first function something other than "clean", to avoid having two different functions named "clean" that do different things? When would I use lessclean rather than clean? It seems like only the release notes get the lessclean treatment. It would be helpful to have a comments before the lessclean function explaining when it is useful.

          Show
          cmccabe Colin P. McCabe added a comment - Thank you for tackling this, Allen. It looks good. 1 #!/usr/bin/python Should be #!/usr/bin/env python ? 2 # Licensed under the Apache License, Version 2.0 (the "License" ); 3 # you may not use this file except in compliance with the License. 4 # You may obtain a copy of the License at 5 # 6 # http: //www.apache.org/licenses/LICENSE-2.0 7 # 8 # Unless required by applicable law or agreed to in writing, software 9 # distributed under the License is distributed on an "AS IS" BASIS, 10 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 11 # See the License for the specific language governing permissions and 12 # limitations under the License. I realize that you are just copying this from the previous relnotes.py , but we should fix this to match our other license headers. If you look at determine-flaky-tests-hadoop.py , you can see its header is: # Licensed to the Apache Software Foundation (ASF) under one # or more contributor license agreements. See the NOTICE file # distributed with this work for additional information # regarding copyright ownership. The ASF licenses this file # to you under the Apache License, Version 2.0 (the # "License" ); you may not use this file except in compliance # with the License. You may obtain a copy of the License at # # http: //www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. The text about the NOTICE file is missing from releasedocmaker.py . 296 def main(): 297 parser = OptionParser(usage= "usage: %prog --version VERSION [--version VERSION2 ...]" ) 298 parser.add_option( "-v" , "--version" , dest= "versions" , 299 action= "append" , type= "string" , 300 help= "versions in JIRA to include in releasenotes" , metavar= "VERSION" ) 301 parser.add_option( "-m" , "--master" , dest= "master" , action= "store_true" , 302 help= "only create the master files" ) 303 parser.add_option( "-i" , "--index" , dest= "index" , action= "store_true" , 304 help= "build an index file" ) Can you add a note to the usage message about which files are generated by this script, and what their names will be? Also where the files will be generated 80 found = re.match('^((\d+)(\.\d+)*).*$', data) 81 if (found): 82 self.parts = [ int (p) for p in found.group(1).split('.') ] 83 else : 84 self.parts = [] Should we throw an exception if we can't parse the version? 28 def clean(str): 29 return clean(re.sub(namePattern, "", str)) 30 31 def formatComponents(str): 32 str = re.sub(namePattern, '', str).replace( "'" , "") 33 if str != "": 34 ret = str 35 else : 36 # some markdown parsers don't like empty tables 37 ret = "." 38 return clean(ret) 39 40 def lessclean(str): 41 str=str.encode('utf-8') 42 str=str.replace( "_" , "\_" ) 43 str=str.replace( "\r" ,"") 44 str=str.rstrip() 45 return str 46 47 def clean(str): 48 str=lessclean(str) 49 str=str.replace( "|" , "\|" ) 50 str=str.rstrip() I find this a bit confusing. Can we call the first function something other than "clean", to avoid having two different functions named "clean" that do different things? When would I use lessclean rather than clean ? It seems like only the release notes get the lessclean treatment. It would be helpful to have a comments before the lessclean function explaining when it is useful.
          Hide
          hadoopqa Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12706981/HADOOP-11731-04.patch
          against trunk revision a16bfff.

          +1 @author. The patch does not contain any @author tags.

          +0 tests included. The patch appears to be a documentation patch that doesn't require tests.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5989//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5989//console

          This message is automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12706981/HADOOP-11731-04.patch against trunk revision a16bfff. +1 @author . The patch does not contain any @author tags. +0 tests included . The patch appears to be a documentation patch that doesn't require tests. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-common-project/hadoop-common. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5989//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5989//console This message is automatically generated.
          Hide
          aw Allen Wittenauer added a comment -

          Correct -04 patch this time.

          Show
          aw Allen Wittenauer added a comment - Correct -04 patch this time.
          Hide
          aw Allen Wittenauer added a comment -

          -04:

          • self generating indices
          • site index hook into that release index
          • -Preleasedocs (default is off) added to lower JIRA impact, offline still works (minus this), etc
          • building updated
          • release date detection for already released versions
          • some code cleanup as I learn more python
          Show
          aw Allen Wittenauer added a comment - -04: self generating indices site index hook into that release index -Preleasedocs (default is off) added to lower JIRA impact, offline still works (minus this), etc building updated release date detection for already released versions some code cleanup as I learn more python
          Hide
          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12706363/HADOOP-11731-03.patch
          against trunk revision 4335429.

          +1 @author. The patch does not contain any @author tags.

          +0 tests included. The patch appears to be a documentation patch that doesn't require tests.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The following test timeouts occurred in hadoop-common-project/hadoop-common:

          org.apache.hadoop.http.TestHttpServerLifecycle

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5981//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5981//console

          This message is automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12706363/HADOOP-11731-03.patch against trunk revision 4335429. +1 @author . The patch does not contain any @author tags. +0 tests included . The patch appears to be a documentation patch that doesn't require tests. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. -1 core tests . The following test timeouts occurred in hadoop-common-project/hadoop-common: org.apache.hadoop.http.TestHttpServerLifecycle Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5981//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5981//console This message is automatically generated.
          Hide
          aw Allen Wittenauer added a comment -

          Making HADOOP-11553 as a conflicting patch since they touch the same pom.xml in the same place.

          Show
          aw Allen Wittenauer added a comment - Making HADOOP-11553 as a conflicting patch since they touch the same pom.xml in the same place.
          Hide
          aw Allen Wittenauer added a comment -

          -03:

          • only create master versions during build
          • tie them into site.xml
          • move the creation to hadoop-common
          Show
          aw Allen Wittenauer added a comment - -03: only create master versions during build tie them into site.xml move the creation to hadoop-common
          Hide
          hadoopqa Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12706187/HADOOP-11731-01.patch
          against trunk revision e1feb4e.

          +1 @author. The patch does not contain any @author tags.

          +0 tests included. The patch appears to be a documentation patch that doesn't require tests.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in .

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5980//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5980//console

          This message is automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12706187/HADOOP-11731-01.patch against trunk revision e1feb4e. +1 @author . The patch does not contain any @author tags. +0 tests included . The patch appears to be a documentation patch that doesn't require tests. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5980//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5980//console This message is automatically generated.
          Hide
          aw Allen Wittenauer added a comment -

          -01:

          • Integrated into site documentation generation
          Show
          aw Allen Wittenauer added a comment - -01: Integrated into site documentation generation
          Hide
          aw Allen Wittenauer added a comment -

          -00:

          • here's a starting point.
          Show
          aw Allen Wittenauer added a comment - -00: here's a starting point.

            People

            • Assignee:
              aw Allen Wittenauer
              Reporter:
              aw Allen Wittenauer
            • Votes:
              0 Vote for this issue
              Watchers:
              12 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development