Details

    • Type: Improvement Improvement
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 0.15.0
    • Component/s: None
    • Labels:
      None

      Description

      This JIRA exists mainly to get a conversation started. We've talked about it before, and it's a tricky issue. That said, some of the Pig code is super, super gnarly. We need some sort of path that will let it eventually be fix-able.

      I posit: any file that hasn't been touched for over 6 months is eligible for a whitespace patch.

      1. checkstyle.xml
        3 kB
        Gianmarco De Francisci Morales

        Activity

        Jonathan Coveney created issue -
        Hide
        Gianmarco De Francisci Morales added a comment -

        I agree to fix whitespace, and the policy you suggest seems reasonable.
        However, this conversation belongs to the more general topic of style.
        Are we willing to define and enforce a code style policy for Pig?
        Other projects are using checkstyle to do so, so that new patches comply.
        I am fine with being stricter on style, but only for things that can be automated within an IDE (whitespace is one of those).

        Show
        Gianmarco De Francisci Morales added a comment - I agree to fix whitespace, and the policy you suggest seems reasonable. However, this conversation belongs to the more general topic of style. Are we willing to define and enforce a code style policy for Pig? Other projects are using checkstyle to do so, so that new patches comply. I am fine with being stricter on style, but only for things that can be automated within an IDE (whitespace is one of those).
        Hide
        Cheolsoo Park added a comment -

        I don't think that we should enforce a coding style, but we should definitely enforce an indent style:

        • No tabs.
        • No trailing white spaces.
        • 4 spaces to indent the code.

        In addition to Jon's suggestion for fixing up old files, we should also allow white space changes to a limited extend in any patch. For example, if I am touching a method in a file, I should be able to fix white spaces in that area of code that I'm touching regardless how long that file has not been modified. Fixing an entire file could be disruptive, but fixing parts of a file should be acceptable, no?

        Show
        Cheolsoo Park added a comment - I don't think that we should enforce a coding style, but we should definitely enforce an indent style: No tabs. No trailing white spaces. 4 spaces to indent the code. In addition to Jon's suggestion for fixing up old files, we should also allow white space changes to a limited extend in any patch. For example, if I am touching a method in a file, I should be able to fix white spaces in that area of code that I'm touching regardless how long that file has not been modified. Fixing an entire file could be disruptive, but fixing parts of a file should be acceptable, no?
        Hide
        Gianmarco De Francisci Morales added a comment -

        Agreed, let's start with this.
        Here a first stab for a checkstyle with FileTabCharacter, Indentation, and RegexpSingleLine for trailing white space.
        I threw in a few other things which I think are useful as well:
        Checking all files have Apache header, checking imports and using ==/!+ with Strings.

        Show
        Gianmarco De Francisci Morales added a comment - Agreed, let's start with this. Here a first stab for a checkstyle with FileTabCharacter, Indentation, and RegexpSingleLine for trailing white space. I threw in a few other things which I think are useful as well: Checking all files have Apache header, checking imports and using ==/!+ with Strings.
        Gianmarco De Francisci Morales made changes -
        Field Original Value New Value
        Attachment checkstyle.xml [ 12551147 ]
        Hide
        Julien Le Dem added a comment -

        I support a whitespace policy as defined by Cheolsoo above : No tabs, no trailing whitespace, 4 spaces. As RB as a "hide whitespace changes" feature we should just use it more.

        Show
        Julien Le Dem added a comment - I support a whitespace policy as defined by Cheolsoo above : No tabs, no trailing whitespace, 4 spaces. As RB as a "hide whitespace changes" feature we should just use it more.
        Hide
        Jonathan Coveney added a comment -

        I agree that we should make sure that all new code passes a checkstyle (and we should have a checkstyle.xml and we should enforce it). But when will be able to update old files?

        Show
        Jonathan Coveney added a comment - I agree that we should make sure that all new code passes a checkstyle (and we should have a checkstyle.xml and we should enforce it). But when will be able to update old files?
        Hide
        Prashant Kommireddi added a comment -
        Show
        Prashant Kommireddi added a comment - Here's a formatter that hbase project uses http://svn.apache.org/repos/asf/hbase/trunk/dev-support/hbase_eclipse_formatter.xml
        Hide
        Daniel Dai added a comment -

        Agree with Cheolsoo, we shall start with a limited scope. A giant patch would will make code tracing (svn blame) harder. Adding to the list:

        • Use Unix new line style
        Show
        Daniel Dai added a comment - Agree with Cheolsoo, we shall start with a limited scope. A giant patch would will make code tracing (svn blame) harder. Adding to the list: Use Unix new line style
        Hide
        Jonathan Coveney added a comment -

        2 questions:

        1. Is there a way to run checkstyle just on the files you changed?
        2. Is there a way to make an IDE (or just the commandline) fix the indents for you? I can do it by hand but some if it is really bad and would be tedious to fix.

        I like the idea that if you touch a file for other reasons, then we can automate the flow to try and get the whitespace formatting good.

        Show
        Jonathan Coveney added a comment - 2 questions: 1. Is there a way to run checkstyle just on the files you changed? 2. Is there a way to make an IDE (or just the commandline) fix the indents for you? I can do it by hand but some if it is really bad and would be tedious to fix. I like the idea that if you touch a file for other reasons, then we can automate the flow to try and get the whitespace formatting good.
        Hide
        Gianmarco De Francisci Morales added a comment -

        1. I am assuming we will integrate this with Ant. The checkstyle ant target wants a list of file to run on. So if you are able to identify the list of files you changed automatically, then yes. How do you define changed?
        2. Yes, as long as there is a compatible formatter profile (i.e. it is not too complex). For our use case it should be fine. I know it can be done with Eclipse. For other IDEs I guess it can be done as well but don't know how.

        We also need integration with Jenkins for automatic builds.

        Show
        Gianmarco De Francisci Morales added a comment - 1. I am assuming we will integrate this with Ant. The checkstyle ant target wants a list of file to run on. So if you are able to identify the list of files you changed automatically, then yes. How do you define changed? 2. Yes, as long as there is a compatible formatter profile (i.e. it is not too complex). For our use case it should be fine. I know it can be done with Eclipse. For other IDEs I guess it can be done as well but don't know how. We also need integration with Jenkins for automatic builds.
        Hide
        Jonathan Coveney added a comment -

        re 1. I defined changed in the git sense (or the svn sense). Files that you touched and will be in the diff. Obviously we need a way to exclude files (or ignore checksyle altogether)

        Show
        Jonathan Coveney added a comment - re 1. I defined changed in the git sense (or the svn sense). Files that you touched and will be in the diff. Obviously we need a way to exclude files (or ignore checksyle altogether)
        Hide
        Alan Gates added a comment -

        A couple of notes:

        We already have coding standards, see https://cwiki.apache.org/confluence/display/PIG/HowToContribute the section on Making Changes specifies that we use Sun's coding standards with 4 space indention.

        we should also allow white space changes to a limited extend in any patch. For example, if I am touching a method in a file, I should be able to fix white spaces in that area of code that I'm touching regardless how long that file has not been modified. Fixing an entire file could be disruptive, but fixing parts of a file should be acceptable, no?

        I disagree. This makes it really hard to understand real vs white space changes when reviewing the patch. Now the reviewer has to re-read the entire method and figure out what actually changed. I'm all for cleaning up white space, but let's do it in separate patches.

        Show
        Alan Gates added a comment - A couple of notes: We already have coding standards, see https://cwiki.apache.org/confluence/display/PIG/HowToContribute the section on Making Changes specifies that we use Sun's coding standards with 4 space indention. we should also allow white space changes to a limited extend in any patch. For example, if I am touching a method in a file, I should be able to fix white spaces in that area of code that I'm touching regardless how long that file has not been modified. Fixing an entire file could be disruptive, but fixing parts of a file should be acceptable, no? I disagree. This makes it really hard to understand real vs white space changes when reviewing the patch. Now the reviewer has to re-read the entire method and figure out what actually changed. I'm all for cleaning up white space, but let's do it in separate patches.
        Hide
        Cheolsoo Park added a comment -

        I appreciate your comments Alan.

        1) I am wondering if we could use the RB more often since it has an option to hide white space changes. It might make sense to ask contributors to upload their patches when they include white space changes.

        2) I think that there are certain times when including both white space and code changes makes more sense. Here is an example: https://reviews.apache.org/r/7734/diff/1/#4.16. Jon is removing a try-catch block and making some code changes at the same time. If indentations are not fixed, the net result looks really odd. In this case, wouldn't it better to include white space changes?

        Thanks!

        Show
        Cheolsoo Park added a comment - I appreciate your comments Alan. 1) I am wondering if we could use the RB more often since it has an option to hide white space changes. It might make sense to ask contributors to upload their patches when they include white space changes. 2) I think that there are certain times when including both white space and code changes makes more sense. Here is an example: https://reviews.apache.org/r/7734/diff/1/#4.16 . Jon is removing a try-catch block and making some code changes at the same time. If indentations are not fixed, the net result looks really odd. In this case, wouldn't it better to include white space changes? Thanks!
        Hide
        Alan Gates added a comment -

        I am wondering if we could use the RB more often since it has an option to hide white space changes. It might make sense to ask contributors to upload their patches when they include white space changes.

        Seems reasonable.

        I think that there are certain times when including both white space and code changes makes more sense. ...

        Sure, I am always in favor of common sense over hard and fast rules.

        Show
        Alan Gates added a comment - I am wondering if we could use the RB more often since it has an option to hide white space changes. It might make sense to ask contributors to upload their patches when they include white space changes. Seems reasonable. I think that there are certain times when including both white space and code changes makes more sense. ... Sure, I am always in favor of common sense over hard and fast rules.
        Daniel Dai made changes -
        Fix Version/s 0.13.0 [ 12324971 ]
        Fix Version/s 0.12.0 [ 12323380 ]
        Aniket Mokashi made changes -
        Fix Version/s 0.14.0 [ 12326954 ]
        Fix Version/s 0.13.0 [ 12324971 ]
        Daniel Dai made changes -
        Fix Version/s 0.15.0 [ 12328760 ]
        Fix Version/s 0.14.0 [ 12326954 ]

          People

          • Assignee:
            Unassigned
            Reporter:
            Jonathan Coveney
          • Votes:
            0 Vote for this issue
            Watchers:
            8 Start watching this issue

            Dates

            • Created:
              Updated:

              Development