Uploaded image for project: 'Hadoop Common'
  1. Hadoop Common
  2. HADOOP-11906

test-patch.sh should use 'file' command for patch determinism

    Details

    • Type: Test
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: None
    • Labels:
      None

      Description

      test-patch.sh currently restricts patches to the extension .patch. It might be useful to also check if the file command says it is a diff. This would allow us to determine if files that end in .txt are actually patches.

      1. HADOOP-11906.1.patch
        6 kB
        Sean Busbey
      2. HADOOP-11906.2.patch
        5 kB
        Sean Busbey

        Activity

        Hide
        aw Allen Wittenauer added a comment -

        Some notes here. If a file doesn't end in .patch, then we should assume the person is a new or confused contributor and give them some pointers for help:

        • Part of HADOOP-11906 should probably be a mild scolding pointer to the naming standard for patches
        • The branch detection rules should be disabled
        Show
        aw Allen Wittenauer added a comment - Some notes here. If a file doesn't end in .patch, then we should assume the person is a new or confused contributor and give them some pointers for help: Part of HADOOP-11906 should probably be a mild scolding pointer to the naming standard for patches The branch detection rules should be disabled
        Hide
        aw Allen Wittenauer added a comment -

        After doing a bit of playing, two things:

        • file tends to output 'diff' on most platforms I've tested on
        • git format-patch files show up as ascii text due to the header.

        So we might need to use two checks, file and a grep for diff. It's also worthwhile noting that it's possible to generate a patch that smart-apply-patch can take but will confuse the hell out of test-patch.sh. For example:

        using git diff:

        diff --git a/start-build-env.sh b/start-build-env.sh
        old mode 100644
        new mode 100755
        

        start-build-env.sh is not in the changed files list so shellcheck does not get triggered.

        Show
        aw Allen Wittenauer added a comment - After doing a bit of playing, two things: file tends to output 'diff' on most platforms I've tested on git format-patch files show up as ascii text due to the header. So we might need to use two checks, file and a grep for diff. It's also worthwhile noting that it's possible to generate a patch that smart-apply-patch can take but will confuse the hell out of test-patch.sh. For example: using git diff: diff --git a/start-build-env.sh b/start-build-env.sh old mode 100644 new mode 100755 start-build-env.sh is not in the changed files list so shellcheck does not get triggered.
        Hide
        aw Allen Wittenauer added a comment -

        Filed HADOOP-11914 to cover the mode change patch.

        Show
        aw Allen Wittenauer added a comment - Filed HADOOP-11914 to cover the mode change patch.
        Hide
        cmccabe Colin P. McCabe added a comment -

        GNU file (aka libmagic) had some security vulnerabilities. A little googling turns up CVE-2014-2270 and CVE-2012-1571. I'd be wary of running it on untrusted input. Perhaps we could use something like the new BSD file implementation? http://marc.info/?l=openbsd-cvs&m=142989267412968&w=2

        Show
        cmccabe Colin P. McCabe added a comment - GNU file (aka libmagic) had some security vulnerabilities. A little googling turns up CVE-2014-2270 and CVE-2012-1571. I'd be wary of running it on untrusted input. Perhaps we could use something like the new BSD file implementation? http://marc.info/?l=openbsd-cvs&m=142989267412968&w=2
        Hide
        busbey Sean Busbey added a comment -

        We're already executing arbitrary changes to the maven pom, which can easily call arbitrary shell commands. I'd say vulnerabilities in the GNU file command are obviated by the security concerns inherent in what we're already doing.

        Show
        busbey Sean Busbey added a comment - We're already executing arbitrary changes to the maven pom, which can easily call arbitrary shell commands. I'd say vulnerabilities in the GNU file command are obviated by the security concerns inherent in what we're already doing.
        Hide
        aw Allen Wittenauer added a comment -

        That. Once I get test-patch.sh launching itself inside a Docker container, we can have a better a much more nuanced discussion around what security should be in place (esp since we'll able to control what binaries are actually being used). But right now, it's kind of moot.

        Show
        aw Allen Wittenauer added a comment - That. Once I get test-patch.sh launching itself inside a Docker container, we can have a better a much more nuanced discussion around what security should be in place (esp since we'll able to control what binaries are actually being used). But right now, it's kind of moot.
        Hide
        busbey Sean Busbey added a comment -

        Let's hope 'file' on jenkins does better than the on on OS X 10.8

        $ file -version
        file-5.04
        magic file from /usr/share/file/magic
        $ find ../patches/ -type f -exec file {} \; | cut -d: -f2 | sort | uniq -c | sort -n -r
         103  ASCII English text
          82  diff output text
          53  exported SGML document text
          50  ASCII C++ program text
          37  ASCII Java program text
          27  RCS/CVS diff output text
          23  HTML document text
          23  ASCII text
          10  ASCII English text, with very long lines
           7  UTF-8 Unicode English text
           3  ASCII C++ program text, with very long lines
           2  data
           2  PDF document, version 1.4
           2  ASCII Java program text, with very long lines
           1  UTF-8 Unicode English text, with very long lines
           1  Git bundle
           1  ASCII text, with very long lines
           1  ASCII C++ program text, with CRLF, CR, LF line terminators
        
        Show
        busbey Sean Busbey added a comment - Let's hope 'file' on jenkins does better than the on on OS X 10.8 $ file -version file-5.04 magic file from /usr/share/file/magic $ find ../patches/ -type f -exec file {} \; | cut -d: -f2 | sort | uniq -c | sort -n -r 103 ASCII English text 82 diff output text 53 exported SGML document text 50 ASCII C++ program text 37 ASCII Java program text 27 RCS/CVS diff output text 23 HTML document text 23 ASCII text 10 ASCII English text, with very long lines 7 UTF-8 Unicode English text 3 ASCII C++ program text, with very long lines 2 data 2 PDF document, version 1.4 2 ASCII Java program text, with very long lines 1 UTF-8 Unicode English text, with very long lines 1 Git bundle 1 ASCII text, with very long lines 1 ASCII C++ program text, with CRLF, CR, LF line terminators
        Hide
        busbey Sean Busbey added a comment -

        For comparison, if I use the heuristic "does the first line look like a patch?" I get 417 out of 428:

        $ find ../patches/ -type f -exec head -n 1 {} \; | grep -E "^(From [a-z0-9]* Mon Sep 17 00:00:00 2001)|(diff .*)|(Index: .*)$" | wc -l
             417
        $ find ../patches/ -type f -exec head -n 1 {} \; | grep -v -E "^(From [a-z0-9]* Mon Sep 17 00:00:00 2001)|(diff .*)|(Index: .*)$" | wc -l
              11
        
        Show
        busbey Sean Busbey added a comment - For comparison, if I use the heuristic "does the first line look like a patch?" I get 417 out of 428: $ find ../patches/ -type f -exec head -n 1 {} \; | grep -E "^(From [a-z0-9]* Mon Sep 17 00:00:00 2001)|(diff .*)|(Index: .*)$" | wc -l 417 $ find ../patches/ -type f -exec head -n 1 {} \; | grep -v -E "^(From [a-z0-9]* Mon Sep 17 00:00:00 2001)|(diff .*)|(Index: .*)$" | wc -l 11
        Hide
        busbey Sean Busbey added a comment -

        -01

        • "why not both?" - use file and fall back to first-line parsing.
        • If we think it's a patch but it's named wrong, give a warning on the jira and point them at the HowToContribute guide
        Show
        busbey Sean Busbey added a comment - -01 "why not both?" - use file and fall back to first-line parsing. If we think it's a patch but it's named wrong, give a warning on the jira and point them at the HowToContribute guide
        Hide
        aw Allen Wittenauer added a comment -

        Fantastic. I threw it at all the patches that the current version had issues with and it appears to do the correct thing. One nit: I don't think it's worth the effort to make head replaceable since it doesn't have very many variants that don't understand our syntax.

        Show
        aw Allen Wittenauer added a comment - Fantastic. I threw it at all the patches that the current version had issues with and it appears to do the correct thing. One nit: I don't think it's worth the effort to make head replaceable since it doesn't have very many variants that don't understand our syntax.
        Hide
        hadoopqa Hadoop QA added a comment -

        A patch to test-patch or smart-apply-patch has been detected.
        Re-executing against the patched versions to perform further tests.
        The console is at https://builds.apache.org/job/PreCommit-HADOOP-Build/6551/console in case of problems.

        Show
        hadoopqa Hadoop QA added a comment - A patch to test-patch or smart-apply-patch has been detected. Re-executing against the patched versions to perform further tests. The console is at https://builds.apache.org/job/PreCommit-HADOOP-Build/6551/console in case of problems.
        Hide
        hadoopqa Hadoop QA added a comment -



        +1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 0s dev-support patch detected.
        0 pre-patch 0m 0s Pre-patch trunk compilation is healthy.
        0 @author 0m 0s Skipping @author checks as test-patch has been patched.
        +1 release audit 0m 15s The applied patch does not increase the total number of release audit warnings.
        +1 shellcheck 0m 5s There were no new shellcheck (v0.3.3) issues.
        +1 whitespace 0m 1s The patch has no lines that end in whitespace.
            0m 24s  



        Subsystem Report/Notes
        Patch URL http://issues.apache.org/jira/secure/attachment/12731503/HADOOP-11906.1.patch
        Optional Tests shellcheck
        git revision trunk / 8f7c236
        Java 1.7.0_55
        uname Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/6551/console

        This message was automatically generated.

        Show
        hadoopqa Hadoop QA added a comment - +1 overall Vote Subsystem Runtime Comment 0 reexec 0m 0s dev-support patch detected. 0 pre-patch 0m 0s Pre-patch trunk compilation is healthy. 0 @author 0m 0s Skipping @author checks as test-patch has been patched. +1 release audit 0m 15s The applied patch does not increase the total number of release audit warnings. +1 shellcheck 0m 5s There were no new shellcheck (v0.3.3) issues. +1 whitespace 0m 1s The patch has no lines that end in whitespace.     0m 24s   Subsystem Report/Notes Patch URL http://issues.apache.org/jira/secure/attachment/12731503/HADOOP-11906.1.patch Optional Tests shellcheck git revision trunk / 8f7c236 Java 1.7.0_55 uname Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/6551/console This message was automatically generated.
        Hide
        busbey Sean Busbey added a comment -

        -02

        • remove configurable 'head' command
        Show
        busbey Sean Busbey added a comment - -02 remove configurable 'head' command
        Hide
        cmccabe Colin P. McCabe added a comment -

        That's a fair point. We should talk more about how to sandbox this stuff. Something like Xen and using a clean image each time would be the best... but maybe docker is the best we can get.

        Show
        cmccabe Colin P. McCabe added a comment - That's a fair point. We should talk more about how to sandbox this stuff. Something like Xen and using a clean image each time would be the best... but maybe docker is the best we can get.
        Hide
        aw Allen Wittenauer added a comment -

        +1 committed to trunk and that other branch.

        Show
        aw Allen Wittenauer added a comment - +1 committed to trunk and that other branch.
        Hide
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-trunk-Commit #7779 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7779/)
        HADOOP-11906. test-patch.sh should use file command for patch determinism (Sean Busbey via aw) (aw: rev effcc5cbb717fe31eb8d64e505f57bc2786399ff)

        • hadoop-common-project/hadoop-common/CHANGES.txt
        • dev-support/test-patch.sh
        Show
        hudson Hudson added a comment - FAILURE: Integrated in Hadoop-trunk-Commit #7779 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7779/ ) HADOOP-11906 . test-patch.sh should use file command for patch determinism (Sean Busbey via aw) (aw: rev effcc5cbb717fe31eb8d64e505f57bc2786399ff) hadoop-common-project/hadoop-common/CHANGES.txt dev-support/test-patch.sh
        Hide
        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #191 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/191/)
        HADOOP-11906. test-patch.sh should use file command for patch determinism (Sean Busbey via aw) (aw: rev effcc5cbb717fe31eb8d64e505f57bc2786399ff)

        • hadoop-common-project/hadoop-common/CHANGES.txt
        • dev-support/test-patch.sh
        Show
        hudson Hudson added a comment - SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #191 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/191/ ) HADOOP-11906 . test-patch.sh should use file command for patch determinism (Sean Busbey via aw) (aw: rev effcc5cbb717fe31eb8d64e505f57bc2786399ff) hadoop-common-project/hadoop-common/CHANGES.txt dev-support/test-patch.sh
        Hide
        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-Yarn-trunk #922 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/922/)
        HADOOP-11906. test-patch.sh should use file command for patch determinism (Sean Busbey via aw) (aw: rev effcc5cbb717fe31eb8d64e505f57bc2786399ff)

        • hadoop-common-project/hadoop-common/CHANGES.txt
        • dev-support/test-patch.sh
        Show
        hudson Hudson added a comment - SUCCESS: Integrated in Hadoop-Yarn-trunk #922 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/922/ ) HADOOP-11906 . test-patch.sh should use file command for patch determinism (Sean Busbey via aw) (aw: rev effcc5cbb717fe31eb8d64e505f57bc2786399ff) hadoop-common-project/hadoop-common/CHANGES.txt dev-support/test-patch.sh
        Hide
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk #2120 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2120/)
        HADOOP-11906. test-patch.sh should use file command for patch determinism (Sean Busbey via aw) (aw: rev effcc5cbb717fe31eb8d64e505f57bc2786399ff)

        • dev-support/test-patch.sh
        • hadoop-common-project/hadoop-common/CHANGES.txt
        Show
        hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Hdfs-trunk #2120 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2120/ ) HADOOP-11906 . test-patch.sh should use file command for patch determinism (Sean Busbey via aw) (aw: rev effcc5cbb717fe31eb8d64e505f57bc2786399ff) dev-support/test-patch.sh hadoop-common-project/hadoop-common/CHANGES.txt
        Hide
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #180 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/180/)
        HADOOP-11906. test-patch.sh should use file command for patch determinism (Sean Busbey via aw) (aw: rev effcc5cbb717fe31eb8d64e505f57bc2786399ff)

        • dev-support/test-patch.sh
        • hadoop-common-project/hadoop-common/CHANGES.txt
        Show
        hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #180 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/180/ ) HADOOP-11906 . test-patch.sh should use file command for patch determinism (Sean Busbey via aw) (aw: rev effcc5cbb717fe31eb8d64e505f57bc2786399ff) dev-support/test-patch.sh hadoop-common-project/hadoop-common/CHANGES.txt
        Hide
        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #190 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/190/)
        HADOOP-11906. test-patch.sh should use file command for patch determinism (Sean Busbey via aw) (aw: rev effcc5cbb717fe31eb8d64e505f57bc2786399ff)

        • dev-support/test-patch.sh
        • hadoop-common-project/hadoop-common/CHANGES.txt
        Show
        hudson Hudson added a comment - SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #190 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/190/ ) HADOOP-11906 . test-patch.sh should use file command for patch determinism (Sean Busbey via aw) (aw: rev effcc5cbb717fe31eb8d64e505f57bc2786399ff) dev-support/test-patch.sh hadoop-common-project/hadoop-common/CHANGES.txt
        Hide
        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2138 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2138/)
        HADOOP-11906. test-patch.sh should use file command for patch determinism (Sean Busbey via aw) (aw: rev effcc5cbb717fe31eb8d64e505f57bc2786399ff)

        • dev-support/test-patch.sh
        • hadoop-common-project/hadoop-common/CHANGES.txt
        Show
        hudson Hudson added a comment - SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2138 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2138/ ) HADOOP-11906 . test-patch.sh should use file command for patch determinism (Sean Busbey via aw) (aw: rev effcc5cbb717fe31eb8d64e505f57bc2786399ff) dev-support/test-patch.sh hadoop-common-project/hadoop-common/CHANGES.txt

          People

          • Assignee:
            busbey Sean Busbey
            Reporter:
            aw Allen Wittenauer
          • Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development