Hadoop Common
  1. Hadoop Common
  2. HADOOP-8248

Clarify bylaws about review-then-commit policy

    Details

    • Type: Task Task
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      As discussed on the mailing list (thread "Requirements for patch review" 4/4/2012) we should clarify the bylaws with respect to the review-then-commit policy. This JIRA is to agree on the proposed change.

      1. c8248_20120409.patch
        1 kB
        Tsz Wo Nicholas Sze
      2. proposed-bylaw-change.txt
        2 kB
        Todd Lipcon

        Activity

        Hide
        Todd Lipcon added a comment -

        Here is the proposed change as a patch. Copy-pasting inline for easy reading:

                 <li> <strong>Code Change</strong>
        
                    <p>A change made to a codebase of the project and committed
                    by a committer. This includes source code, documentation, website
                    content, etc.</p>
        
                    <p>Such changes require lazy consensus of active committers,
                    but with a minimum of one +1. The +1 must be provided by a committer
                    who is not a primary author of the patch.
                    The code can be committed after the first +1, unless
                    the code change represents a merge from a branch, in which case
                    three +1s are required.</p>
        
                    <p>In the case of trivial changes, a committer may commit after a
                    non-binding +1 vote from a contributor or the issue reporter. The
                    definition of what constitutes a trivial change is subject to the
                    best judgment of the committer, but should generally be limited
                    to changes such as: documentation fixes, spelling mistakes, log
                    message changes, or the addition of new test code.</p>
        
                    <p>In the case of feature branches, the creator of the branch
                    may elect to forego this review-then-commit policy on that branch,
                    with the understanding that three committers must provide a +1
                    vote before its eventual merge as described above.</p>
        
              </li>
        
        Show
        Todd Lipcon added a comment - Here is the proposed change as a patch. Copy-pasting inline for easy reading: <li> <strong>Code Change</strong> <p>A change made to a codebase of the project and committed by a committer. This includes source code, documentation, website content, etc.</p> <p>Such changes require lazy consensus of active committers, but with a minimum of one +1. The +1 must be provided by a committer who is not a primary author of the patch. The code can be committed after the first +1, unless the code change represents a merge from a branch, in which case three +1s are required.</p> <p>In the case of trivial changes, a committer may commit after a non-binding +1 vote from a contributor or the issue reporter. The definition of what constitutes a trivial change is subject to the best judgment of the committer, but should generally be limited to changes such as: documentation fixes, spelling mistakes, log message changes, or the addition of new test code.</p> <p>In the case of feature branches, the creator of the branch may elect to forego this review-then-commit policy on that branch, with the understanding that three committers must provide a +1 vote before its eventual merge as described above.</p> </li>
        Hide
        Tsz Wo Nicholas Sze added a comment -

        > As discussed on the mailing list (thread "Requirements for patch review" 4/4/2012) ...

        The discussion was just started. Are you in hurry to change the bylaws?

        Show
        Tsz Wo Nicholas Sze added a comment - > As discussed on the mailing list (thread "Requirements for patch review" 4/4/2012) ... The discussion was just started. Are you in hurry to change the bylaws?
        Hide
        Todd Lipcon added a comment -

        Nope, no hurry at all. Eli suggested I post a proposed amendment as a patch on JIRA, so that's what I did.

        FWIW, I don't see this as a change in bylaws - just a clarification of existing language. If you don't think the proposed patch is right, feel free to disagree - I just put this proposal up as something to start discussion. Happy to change my opinion.

        Show
        Todd Lipcon added a comment - Nope, no hurry at all. Eli suggested I post a proposed amendment as a patch on JIRA, so that's what I did. FWIW, I don't see this as a change in bylaws - just a clarification of existing language. If you don't think the proposed patch is right, feel free to disagree - I just put this proposal up as something to start discussion. Happy to change my opinion.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        Since thare are an on-going mailing discussion thead and a JIRA, it is better to also post my email reply here.


        I agree that the bylaws is not clear about this. For reviewing patches, my understanding is that any contributor, a committer or not, could review patches and the +1 counts. I have worked on Hadoop almost five years. This is what we are doing for a long time (if it is not from the beginning of the Hadoop project.) Could other people confirm this?

        From the HowToContribute wiki, it does advise committers to find another committer to review difficult patches: "Committers: for non-trivial changes, it is best to get another committer to review your patches before commit. ..." It seems saying that it is okay for non-committers reviewing simple and medium patches. Todd's amendments use different wording which seems implying a different requirement: the +1's from non-committers could be counted only for simple patches but not medium and difficult patches.

        I think we should keep allowing everyone to review patches. It slows down the development and is discouraging if non-committer's +1 does not count. I believe the judgement of the committer who commits the patch won't commit bad code. We have svn and we could revert patches if necessary. Lastly, if a committer keeps committing bad code, we could exercise "Committer Removal".

        BTW, does anyone know what other Apache projects do?

        Show
        Tsz Wo Nicholas Sze added a comment - Since thare are an on-going mailing discussion thead and a JIRA, it is better to also post my email reply here. I agree that the bylaws is not clear about this. For reviewing patches, my understanding is that any contributor, a committer or not, could review patches and the +1 counts. I have worked on Hadoop almost five years. This is what we are doing for a long time (if it is not from the beginning of the Hadoop project.) Could other people confirm this? From the HowToContribute wiki , it does advise committers to find another committer to review difficult patches: "Committers: for non-trivial changes, it is best to get another committer to review your patches before commit. ..." It seems saying that it is okay for non-committers reviewing simple and medium patches. Todd's amendments use different wording which seems implying a different requirement: the +1's from non-committers could be counted only for simple patches but not medium and difficult patches. I think we should keep allowing everyone to review patches. It slows down the development and is discouraging if non-committer's +1 does not count. I believe the judgement of the committer who commits the patch won't commit bad code. We have svn and we could revert patches if necessary. Lastly, if a committer keeps committing bad code, we could exercise "Committer Removal". BTW, does anyone know what other Apache projects do?
        Hide
        Doug Cutting added a comment -

        Different projects do different things. A lot of Apache projects are CTR; no explicit review is required before any commit. On Avro, if a patch doesn't get a review after a few days we'll often say something like, "I'll commit this tomorrow unless someone objects". So an opportunity for review before commit is provided, but a review is not required. But on Hadoop a +1 from a committer who is not the patch author has always been required for significant patches.

        Show
        Doug Cutting added a comment - Different projects do different things. A lot of Apache projects are CTR; no explicit review is required before any commit. On Avro, if a patch doesn't get a review after a few days we'll often say something like, "I'll commit this tomorrow unless someone objects". So an opportunity for review before commit is provided, but a review is not required. But on Hadoop a +1 from a committer who is not the patch author has always been required for significant patches.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        > ... But on Hadoop a +1 from a committer who is not the patch author has always been required for significant patches.

        Good to know that. I did not see anyone committing significant patches without a +1 from another committer in the past, except when the patch was a join work of multiple committers, but I did not know that it was a requirement.

        We should also clarify the following:

        • For a join work of multiple committers, all of the authors cannot review the patch for significant patches.
        • For merging from a branch, the three +1's cannot be cast from any of the committers who worked on the branch.

        Does it make sense? I will post proposed change.

        Show
        Tsz Wo Nicholas Sze added a comment - > ... But on Hadoop a +1 from a committer who is not the patch author has always been required for significant patches. Good to know that. I did not see anyone committing significant patches without a +1 from another committer in the past, except when the patch was a join work of multiple committers, but I did not know that it was a requirement. We should also clarify the following: For a join work of multiple committers, all of the authors cannot review the patch for significant patches. For merging from a branch, the three +1's cannot be cast from any of the committers who worked on the branch. Does it make sense? I will post proposed change.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        c8248_20120409.patch: my proposed change

        Index: main/author/src/documentation/content/xdocs/bylaws.xml
        ===================================================================
        --- main/author/src/documentation/content/xdocs/bylaws.xml	(revision 1311409)
        +++ main/author/src/documentation/content/xdocs/bylaws.xml	(working copy)
        @@ -244,10 +244,17 @@
                     content, etc.</p>
         
                     <p>Lazy consensus of active committers, but with a minimum of
        -            one +1. The code can be committed after the first +1, unless
        -            the code change represents a merge from a branch, in which case
        -            three +1s are required.</p></li>
        +            one +1. When the code change can be submitted by a patch, the code
        +            can be committed after the first +1.  The +1 can be cast from a
        +            reviewer who is not one of the authors of the patch. For
        +            significant patches, the reviewer must be a committer.  Otherwise,
        +            the reviewer can be any contributor.</p>
         
        +            <p>When code change is both large and significant, it must be either
        +            splitted into subtasks or first committed to a development branch.
        +            When the code change represents a merge from a branch, +1s from
        +            three committers who are not the authors are required.</p></li>
        +
                  <li> <strong>Release Plan</strong>
         
                     <p>Defines the timetable and actions for a release. The plan also
        
        Show
        Tsz Wo Nicholas Sze added a comment - c8248_20120409.patch: my proposed change Index: main/author/src/documentation/content/xdocs/bylaws.xml =================================================================== --- main/author/src/documentation/content/xdocs/bylaws.xml (revision 1311409) +++ main/author/src/documentation/content/xdocs/bylaws.xml (working copy) @@ -244,10 +244,17 @@ content, etc.</p> <p>Lazy consensus of active committers, but with a minimum of - one +1. The code can be committed after the first +1, unless - the code change represents a merge from a branch, in which case - three +1s are required.</p></li> + one +1. When the code change can be submitted by a patch, the code + can be committed after the first +1. The +1 can be cast from a + reviewer who is not one of the authors of the patch. For + significant patches, the reviewer must be a committer. Otherwise, + the reviewer can be any contributor.</p> + <p>When code change is both large and significant, it must be either + splitted into subtasks or first committed to a development branch. + When the code change represents a merge from a branch, +1s from + three committers who are not the authors are required.</p></li> + <li> <strong>Release Plan</strong> <p>Defines the timetable and actions for a release. The plan also
        Hide
        Todd Lipcon added a comment -

        For a join work of multiple committers, all of the authors cannot review the patch for significant patches.

        My thinking here is that it's fine if one committer does some minor fixup or adds test cases to a patch that another authored. For example, if I start a patch, but don't get time to finish the unit tests, and you help out by adding a test, I think it's OK for you to commit it assuming I +1 your addition. Put another way, any given "chunk" of the patch should be reviewed by a committer who didn't write it.

        I don't want to get too pedantic about it, though – IMO it's the spirit that's important. Code reviews are important for spotting mistakes, and it's hard to spot your own mistakes. So any piece of code should be +1ed at by an expert (ie committer) who didn't write that bit of code.

        For merging from a branch, the three +1's cannot be cast from any of the committers who worked on the branch.

        I disagree on this – my assumption is that all of the patches on the branch have been reviewed according to the above policy, so everything's been looked at by someone who didn't write it. In my mind, the +1s on the merge are basically a commitment to stand by the work to be merged and an assertion that you think it is good code, a good feature, etc. If the development on the branch looks shoddy/sketchy/whatever, then there's plenty of opportunity for other committers to -1 it.

        Perhaps we should add a 3-day minimum voting period for branch merges to trunk when that branch didn't follow the normal RTC guidelines?

        Show
        Todd Lipcon added a comment - For a join work of multiple committers, all of the authors cannot review the patch for significant patches. My thinking here is that it's fine if one committer does some minor fixup or adds test cases to a patch that another authored. For example, if I start a patch, but don't get time to finish the unit tests, and you help out by adding a test, I think it's OK for you to commit it assuming I +1 your addition. Put another way, any given "chunk" of the patch should be reviewed by a committer who didn't write it. I don't want to get too pedantic about it, though – IMO it's the spirit that's important. Code reviews are important for spotting mistakes, and it's hard to spot your own mistakes. So any piece of code should be +1ed at by an expert (ie committer) who didn't write that bit of code. For merging from a branch, the three +1's cannot be cast from any of the committers who worked on the branch. I disagree on this – my assumption is that all of the patches on the branch have been reviewed according to the above policy, so everything's been looked at by someone who didn't write it. In my mind, the +1s on the merge are basically a commitment to stand by the work to be merged and an assertion that you think it is good code, a good feature, etc. If the development on the branch looks shoddy/sketchy/whatever, then there's plenty of opportunity for other committers to -1 it. Perhaps we should add a 3-day minimum voting period for branch merges to trunk when that branch didn't follow the normal RTC guidelines?
        Hide
        Todd Lipcon added a comment -

        To add a little more: I think the requirement of 3 committer +1s from people who didn't work on the branch will make it really hard to ever merge branches. Looking, for example, at the recent HA branch merge, it listed the following people as patch contributors:

        Contributed by Todd Lipcon, Aaron T. Myers, Eli Collins, Uma Maheswara Rao G, Bikas Saha, Suresh Srinivas, Jitendra Nath Pandey, Hari Mankude, Brandon Li, Sanjay Radia, Mingjie Lai, and Gregory Chanan

        Finding 3 active committers who are not on that list and are knowledgeable about NN internals would have been very difficult. In fact of the committers who did +1 the merge, you're the only one who isn't in the above list

        Show
        Todd Lipcon added a comment - To add a little more: I think the requirement of 3 committer +1s from people who didn't work on the branch will make it really hard to ever merge branches. Looking, for example, at the recent HA branch merge, it listed the following people as patch contributors: Contributed by Todd Lipcon, Aaron T. Myers, Eli Collins, Uma Maheswara Rao G, Bikas Saha, Suresh Srinivas, Jitendra Nath Pandey, Hari Mankude, Brandon Li, Sanjay Radia, Mingjie Lai, and Gregory Chanan Finding 3 active committers who are not on that list and are knowledgeable about NN internals would have been very difficult. In fact of the committers who did +1 the merge, you're the only one who isn't in the above list
        Hide
        Jakob Homan added a comment -

        I disagree on this – my assumption is that all of the patches on the branch have been reviewed according to the above policy,

        Which above policy? Branches can use RTC, or whatever they decide upon. Therefore it is possible that the branch content has not actually been reviewed by another committer before merging.

        Show
        Jakob Homan added a comment - I disagree on this – my assumption is that all of the patches on the branch have been reviewed according to the above policy, Which above policy? Branches can use RTC, or whatever they decide upon. Therefore it is possible that the branch content has not actually been reviewed by another committer before merging.
        Hide
        Todd Lipcon added a comment -

        Which above policy? Branches can use RTC, or whatever they decide upon. Therefore it is possible that the branch content has not actually been reviewed by another committer before merging.

        Right, that's why I also added: Perhaps we should add a 3-day minimum voting period for branch merges to trunk when that branch didn't follow the normal RTC guidelines?

        Show
        Todd Lipcon added a comment - Which above policy? Branches can use RTC, or whatever they decide upon. Therefore it is possible that the branch content has not actually been reviewed by another committer before merging. Right, that's why I also added: Perhaps we should add a 3-day minimum voting period for branch merges to trunk when that branch didn't follow the normal RTC guidelines?
        Hide
        Tsz Wo Nicholas Sze added a comment -

        > My thinking here is that it's fine if one committer does some minor fixup or adds test cases to a patch that another authored. ...

        When the change is minor, I think the committer should not be counted as one of the authors. Otherwise, it looks like the committer is stealing credits from the contributor. How does it sound to you?

        > ... my assumption is that all of the patches on the branch have been reviewed according to the above policy, ...

        It is not true in some cases. In development branches, we might have commit-then-review; we might also commit warnings and fix them later; etc. The HA branch you brought up is one example.

        > ... Finding 3 active committers who are not on that list and are knowledgeable about NN internals ...

        In the HA example, although the list of contributor is slightly long but there are still many other committers available. I can easier think of three. If we drop "knowledgeable about NN internals", there are probably >20.

        BTW, I don't think "knowledgeable about NN internals" is a requirement according to the bylaws.

        For extreme cases that no other committer is available, it probably make sense to have a special discussion for requesting the PMC to relax the requirement for that merge.

        > Perhaps we should add a 3-day minimum voting period for branch merges to trunk when that branch didn't follow the normal RTC guidelines?

        Is it the case that it must be 7-day for any voting according to the "Voting Timeframes" in the bylaws?

        Show
        Tsz Wo Nicholas Sze added a comment - > My thinking here is that it's fine if one committer does some minor fixup or adds test cases to a patch that another authored. ... When the change is minor, I think the committer should not be counted as one of the authors. Otherwise, it looks like the committer is stealing credits from the contributor. How does it sound to you? > ... my assumption is that all of the patches on the branch have been reviewed according to the above policy, ... It is not true in some cases. In development branches, we might have commit-then-review; we might also commit warnings and fix them later; etc. The HA branch you brought up is one example. > ... Finding 3 active committers who are not on that list and are knowledgeable about NN internals ... In the HA example, although the list of contributor is slightly long but there are still many other committers available. I can easier think of three. If we drop "knowledgeable about NN internals", there are probably >20. BTW, I don't think "knowledgeable about NN internals" is a requirement according to the bylaws. For extreme cases that no other committer is available, it probably make sense to have a special discussion for requesting the PMC to relax the requirement for that merge. > Perhaps we should add a 3-day minimum voting period for branch merges to trunk when that branch didn't follow the normal RTC guidelines? Is it the case that it must be 7-day for any voting according to the "Voting Timeframes" in the bylaws?
        Hide
        Tsz Wo Nicholas Sze added a comment -

        > Which above policy? Branches can use RTC, or whatever they decide upon. ...

        I duplicated Jakob's point in my previous comment. We both agree that the assumption is invalid.

        Show
        Tsz Wo Nicholas Sze added a comment - > Which above policy? Branches can use RTC, or whatever they decide upon. ... I duplicated Jakob's point in my previous comment. We both agree that the assumption is invalid.
        Hide
        Robert Joseph Evans added a comment -

        > Perhaps we should add a 3-day minimum voting period for branch merges to trunk when that branch didn't follow the normal RTC guidelines?

        Is it the case that it must be 7-day for any voting according to the "Voting Timeframes" in the bylaws?

        To answer your question the bylaws state.

        Voting Timeframes

        Votes are open for a period of 7 days to allow all active voters time to consider the vote. Votes relating to code changes are not subject to a strict timetable but should be made as timely as possible.

        Show
        Robert Joseph Evans added a comment - > Perhaps we should add a 3-day minimum voting period for branch merges to trunk when that branch didn't follow the normal RTC guidelines? Is it the case that it must be 7-day for any voting according to the "Voting Timeframes" in the bylaws? To answer your question the bylaws state. Voting Timeframes Votes are open for a period of 7 days to allow all active voters time to consider the vote. Votes relating to code changes are not subject to a strict timetable but should be made as timely as possible.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        Thanks Robert. The last sentence was somehow in my blind spot.

        Then, there could be another extreme case that three committers create a branch, work on it extensively, +1 on their own and then merge instantaneously without other people reviewing. I think we want to prevent this case.

        Show
        Tsz Wo Nicholas Sze added a comment - Thanks Robert. The last sentence was somehow in my blind spot. Then, there could be another extreme case that three committers create a branch, work on it extensively, +1 on their own and then merge instantaneously without other people reviewing. I think we want to prevent this case.
        Hide
        eric baldeschwieler added a comment -

        IMO the HEP (Hadoop Enhancement Process) discussion we had a while back was focused on a set of core principles I'd like us to consider again. Simple fixes and enhancements are easy, but for a major platform like Hadoop, I think we should have a community process to make sure that big changes are good changes. Before a big change goes into Hadoop, we want to know that it is complete, meaning:

        • There is an understanding of the use cases / needs the change is addressing
        • There is agreement that the change makes hadoop better
        • The work is done: Full design / Real Tests & executed test plan / Docs
        • It meets our API compatibility goals

        For big changes I'd like to see us focus on defining a process that makes sure the above points are discussed and everyone agrees that the work is complete. I feel like the bylaw changes discussed here are focused on procedure.

        A HELP proposal (Hadoop Enhancement Lightweight Process):

        • When a proposal seems complex, folks can request that a HELP is filed...
        • A doc is created in the subversion tree .../HELP/<JIRA-ID> of a companion JIRA
          (often the preexisting JIRA the issue was raised on)
        • Development can take place on a branch, via patches, whatever, but it is not committed until...
        • The HELP document is complete (there will be a HELP template with appropriate questions)
        • The help document outlines:
        • the need for the change
        • the use cases
        • An external spec of the change (EG man page / java doc)
        • An examination of API/compatibility issues
        • a vote on common-dev occurs blessing the HELP DOC
        • .../HELP/<JIRA-ID>.designdoc & .../HELP/<JIRA-ID>.testplan are complete
        • a vote on common-dev blesses the code, the design doc, the test plan, that the work is complete...
        • Beyond the votes on general to guarantee wide acceptance of the change all work can happen on the JIRA

        What do people think?

        Show
        eric baldeschwieler added a comment - IMO the HEP (Hadoop Enhancement Process) discussion we had a while back was focused on a set of core principles I'd like us to consider again. Simple fixes and enhancements are easy, but for a major platform like Hadoop, I think we should have a community process to make sure that big changes are good changes. Before a big change goes into Hadoop, we want to know that it is complete, meaning: There is an understanding of the use cases / needs the change is addressing There is agreement that the change makes hadoop better The work is done: Full design / Real Tests & executed test plan / Docs It meets our API compatibility goals For big changes I'd like to see us focus on defining a process that makes sure the above points are discussed and everyone agrees that the work is complete. I feel like the bylaw changes discussed here are focused on procedure. A HELP proposal (Hadoop Enhancement Lightweight Process): When a proposal seems complex, folks can request that a HELP is filed... A doc is created in the subversion tree .../HELP/<JIRA-ID> of a companion JIRA (often the preexisting JIRA the issue was raised on) Development can take place on a branch, via patches, whatever, but it is not committed until... The HELP document is complete (there will be a HELP template with appropriate questions) The help document outlines: the need for the change the use cases An external spec of the change (EG man page / java doc) An examination of API/compatibility issues a vote on common-dev occurs blessing the HELP DOC .../HELP/<JIRA-ID>.designdoc & .../HELP/<JIRA-ID>.testplan are complete a vote on common-dev blesses the code, the design doc, the test plan, that the work is complete... Beyond the votes on general to guarantee wide acceptance of the change all work can happen on the JIRA What do people think?
        Hide
        Owen O'Malley added a comment -

        Please take this entire conversation over to the common-dev list. Having process discussions on a jira will just get them missed.

        Show
        Owen O'Malley added a comment - Please take this entire conversation over to the common-dev list. Having process discussions on a jira will just get them missed.

          People

          • Assignee:
            Unassigned
            Reporter:
            Todd Lipcon
          • Votes:
            0 Vote for this issue
            Watchers:
            18 Start watching this issue

            Dates

            • Created:
              Updated:

              Development