Lucene - Core
  1. Lucene - Core
  2. LUCENE-4332

Integrate PiTest mutation coverage tool into build

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 4.1, 5.0
    • Fix Version/s: None
    • Component/s: None
    • Labels:
    • Lucene Fields:
      New, Patch Available

      Description

      As discussed briefly on the mailing list, this patch is an attempt to integrate the PiTest mutation coverage tool into the lucene build

        Issue Links

          Activity

          Greg Bowyer created issue -
          Greg Bowyer made changes -
          Field Original Value New Value
          Attachment LUCENE-4332-Integrate-PiTest-mutation-coverage-tool-into-build.patch [ 12542725 ]
          Hide
          Greg Bowyer added a comment -

          Corrected jcommander license

          Show
          Greg Bowyer added a comment - Corrected jcommander license
          Greg Bowyer made changes -
          Hide
          Uwe Schindler added a comment -

          What is the difference to run "ant clover", which is already integrated and done by a 10 liner build.xml task? I see no real difference and the reports look much more terse than clover.

          Show
          Uwe Schindler added a comment - What is the difference to run "ant clover", which is already integrated and done by a 10 liner build.xml task? I see no real difference and the reports look much more terse than clover.
          Hide
          Dawid Weiss added a comment -

          It is mutation testing overlaid on coverage testing. I have some doubts about mutation testing but I don't have any practical experience with it so I'll just sit and watch

          Show
          Dawid Weiss added a comment - It is mutation testing overlaid on coverage testing. I have some doubts about mutation testing but I don't have any practical experience with it so I'll just sit and watch
          Hide
          Uwe Schindler added a comment -

          I agree with adding this if it helps to find ineffective tests, but can 't this be done like clover without downloading the JAR files into the lib folder. License files are not needed as it's a build tool only and we dont ship with those licenses. So I would opt to do a simple transitive ivy:cachepath to build pit's classpath.

          Show
          Uwe Schindler added a comment - I agree with adding this if it helps to find ineffective tests, but can 't this be done like clover without downloading the JAR files into the lib folder. License files are not needed as it's a build tool only and we dont ship with those licenses. So I would opt to do a simple transitive ivy:cachepath to build pit's classpath.
          Hide
          Uwe Schindler added a comment -

          Hi,
          I looked into the reports (somehow the report for Lucene "core" is missing?): My problem with pitest is the following (and that really makes me afraid, so I will not run it on my jenkins server at all!): It manipulates the codeby adding random modifications to the Lucene bytecode. As Lucene is a heavy file-system related stuff, it can happen that the code removes some code part or changes the parameters of a method call (it also does this!), so it could suddenly delete or modify files outside the working copy (e.g. in our heavy crazy code deleting index files). Damage on the O/S can be prevented by running the tests as a separate user, but it can still crash and corrupt my whole Jenkins installation.

          Show
          Uwe Schindler added a comment - Hi, I looked into the reports (somehow the report for Lucene "core" is missing?): My problem with pitest is the following (and that really makes me afraid, so I will not run it on my jenkins server at all!): It manipulates the codeby adding random modifications to the Lucene bytecode. As Lucene is a heavy file-system related stuff, it can happen that the code removes some code part or changes the parameters of a method call (it also does this!), so it could suddenly delete or modify files outside the working copy (e.g. in our heavy crazy code deleting index files). Damage on the O/S can be prevented by running the tests as a separate user, but it can still crash and corrupt my whole Jenkins installation.
          Hide
          Uwe Schindler added a comment -

          Another thing to take into account: Mutation testing runs the test 2 times, one time without mutations, one time with. For this to work correctly, both test runs must have the same tests.seed value!

          Show
          Uwe Schindler added a comment - Another thing to take into account: Mutation testing runs the test 2 times, one time without mutations, one time with. For this to work correctly, both test runs must have the same tests.seed value!
          Hide
          Robert Muir added a comment -

          Uwe I agree there is some risk, but I think we should set it up in the build to get started
          (maybe someone volunteers to run it in a sandbox'ed jenkins once a week, or whatever).

          it doesnt hurt anything to set it up in build.xml: though I agree we should instead use
          an ivy:cachepath instead of introducing so many third party dependencies for a task/tool
          that our actual codebase doesn't rely on.

          I also agree tests should somehow be rerun with the same seed thru this thing. maybe the
          ant task for this can just generate a random seed itself, and pass that with a -D.

          eventually once someone gets it going, i'm sure some tuning will take place, e.g. it should
          be necessary to set parameters to ignore MMapDirectory etc (wrong file to use mutation testing with, sorry).

          Show
          Robert Muir added a comment - Uwe I agree there is some risk, but I think we should set it up in the build to get started (maybe someone volunteers to run it in a sandbox'ed jenkins once a week, or whatever). it doesnt hurt anything to set it up in build.xml: though I agree we should instead use an ivy:cachepath instead of introducing so many third party dependencies for a task/tool that our actual codebase doesn't rely on. I also agree tests should somehow be rerun with the same seed thru this thing. maybe the ant task for this can just generate a random seed itself, and pass that with a -D. eventually once someone gets it going, i'm sure some tuning will take place, e.g. it should be necessary to set parameters to ignore MMapDirectory etc (wrong file to use mutation testing with, sorry).
          Hide
          Robert Muir added a comment -

          License files are not needed as it's a build tool only and we dont ship with those licenses. So I would opt to do a simple transitive ivy:cachepath to build pit's classpath.

          This also applies to the asm jar (asm-debug-all-4.0.jar.sha1) !

          Show
          Robert Muir added a comment - License files are not needed as it's a build tool only and we dont ship with those licenses. So I would opt to do a simple transitive ivy:cachepath to build pit's classpath. This also applies to the asm jar (asm-debug-all-4.0.jar.sha1) !
          Hide
          Uwe Schindler added a comment -

          This also applies to the asm jar (asm-debug-all-4.0.jar.sha1) !

          I partially agree here, but we actually compile our own code (the ANT TASK) against it. This is different from using the JAR file. Also we need the JAR file in our lib-structure to make development of the ANT TASK possible with eclipse & co.

          Show
          Uwe Schindler added a comment - This also applies to the asm jar (asm-debug-all-4.0.jar.sha1) ! I partially agree here, but we actually compile our own code (the ANT TASK) against it. This is different from using the JAR file. Also we need the JAR file in our lib-structure to make development of the ANT TASK possible with eclipse & co.
          Hide
          Uwe Schindler added a comment -

          I also agree tests should somehow be rerun with the same seed thru this thing. maybe the ant task for this can just generate a random seed itself, and pass that with a -D.

          Thats the only way to go. A good old Javascript-<script/> inside ant can do this. I was thinking about the same for my Jenkins infrastructure. Because currenty you cannot tell my SDDS Jenkins instance to repeat the same test-run. I was thinking about a similar thing (parametrizable build that passes a random master seed to "ant test"). The Groovy code in Jenkins doing the JDK randomization will do this. I just had no time, but it is on my todo list.

          it doesnt hurt anything to set it up in build.xml: though I agree we should instead use an ivy:cachepath instead of introducing so many third party dependencies for a task/tool that our actual codebase doesn't rely on.

          That is what I am optimg for. The extra test-framework/ivy.xml additions should not be there and the cachepath used directly "inline mode". -> revert test-framework/ivy.xml -> add dependency inline to ivy:cachepatch or use a separate pitest-ivy.xml referenced from cachepath only (not resolve).

          Uwe I agree there is some risk, but I think we should set it up in the build to get started (maybe someone volunteers to run it in a sandbox'ed jenkins once a week, or whatever).

          I would take care of a sandbox. The windows tests on SDDS Jenkins are running in a VirtualBOX. The Jenkins virtualBOX plugin has some options about starting/shutting down engines. I would create a minimal Linux VBOX instance (32 bit, few ram to run tests or like that) and make a virtual harddisk snapshot. Whenever the pitest runs weekly, Jenkins starts a new instance using the saved snapshot (which is plain empty clean), runs pitests and then shuts it down again, loosing all changed data on the virtual disk.

          Show
          Uwe Schindler added a comment - I also agree tests should somehow be rerun with the same seed thru this thing. maybe the ant task for this can just generate a random seed itself, and pass that with a -D. Thats the only way to go. A good old Javascript-<script/> inside ant can do this. I was thinking about the same for my Jenkins infrastructure. Because currenty you cannot tell my SDDS Jenkins instance to repeat the same test-run. I was thinking about a similar thing (parametrizable build that passes a random master seed to "ant test"). The Groovy code in Jenkins doing the JDK randomization will do this. I just had no time, but it is on my todo list. it doesnt hurt anything to set it up in build.xml: though I agree we should instead use an ivy:cachepath instead of introducing so many third party dependencies for a task/tool that our actual codebase doesn't rely on. That is what I am optimg for. The extra test-framework/ivy.xml additions should not be there and the cachepath used directly "inline mode". -> revert test-framework/ivy.xml -> add dependency inline to ivy:cachepatch or use a separate pitest-ivy.xml referenced from cachepath only (not resolve). Uwe I agree there is some risk, but I think we should set it up in the build to get started (maybe someone volunteers to run it in a sandbox'ed jenkins once a week, or whatever). I would take care of a sandbox. The windows tests on SDDS Jenkins are running in a VirtualBOX. The Jenkins virtualBOX plugin has some options about starting/shutting down engines. I would create a minimal Linux VBOX instance (32 bit, few ram to run tests or like that) and make a virtual harddisk snapshot. Whenever the pitest runs weekly, Jenkins starts a new instance using the saved snapshot (which is plain empty clean), runs pitests and then shuts it down again, loosing all changed data on the virtual disk.
          Hide
          Greg Bowyer added a comment - - edited

          Wow lot of interest .....

          I will try to answer some of the salient points

          Core was missing until today as one test (TestLuceneConstantVersion) didn't run correctly as it was lacking the Lucene version system property. Currently pit refuses to run unless the underlying suite is all green (a good thing IMHO) so I didn't have core from my first run (its there now).

          This takes a long time to run, all of the ancillary Lucene packages take roughly 4 hours to run on the largest CPU ec2 instance, core takes 8 hours (this was the other reason core was missing, I was waiting for it to finish crunching)

          As to the random seed, I completely agree and it was one of the things I mentioned on the mailing list that makes the output of this tool not perfect. I do feel that the tests that are randomised typically do a better job at gaining coverage, but its a good idea to stabilise the seed.

          Jars and build.xml, I have no problems changing this to whatever people think fits best into the build. My impression was that clover is handled the way it is because it is not technically opensource and as a result has screwball licensing concerns, essentially I didn't know any better :S I will try to get a chance to make it use the ivy:cachepath approach.

          Regarding the risks posed by mutations, I cannot prove or say there are no risks; however mutation testing is not random in the mutations applied, they are formulaic and quite simple. It will not permute arguments nor will it mutate complex objects (it can and does mess with object references turning references in arguments to nulls). I can conceive of ways in which it could screwup mutated code making it possible to delete random files but I don't think they are going to be extremely likely situations. FWIW I would be less worried about this deleting something on the filesystem and far more worried about it accidentally leaving corpses of undeleted files.

          Sandboxing it could solve that issue, if that is too much effort another approach might be to work with the pitest team and build a security manager that is militant about file access, disallowing anything that canonicalises outside of a given path.

          Oh and as Robert suggested we can always point it away from key things.

          At the end of the day its a tool like any other, I have exactly the same feelings as Robert on this

          This is cool. I'd say lets get it up and going on jenkins (even weekly
          or something). why worry about the imperfections in any of these
          coverage tools, whats way more important is when the results find
          situations where you thought you were testing something, but really
          arent, etc (here was a recent one found by clover
          http://svn.apache.org/viewvc?rev=1376722&view=rev).

          so imo just another tool to be able to identify serious gaps/test-bugs
          after things are up and running. and especially looking at deltas from
          line coverage to identify stuff thats 'executing' but not actually
          being tested.

          Show
          Greg Bowyer added a comment - - edited Wow lot of interest ..... I will try to answer some of the salient points Core was missing until today as one test (TestLuceneConstantVersion) didn't run correctly as it was lacking the Lucene version system property. Currently pit refuses to run unless the underlying suite is all green (a good thing IMHO) so I didn't have core from my first run (its there now). This takes a long time to run, all of the ancillary Lucene packages take roughly 4 hours to run on the largest CPU ec2 instance, core takes 8 hours (this was the other reason core was missing, I was waiting for it to finish crunching) As to the random seed, I completely agree and it was one of the things I mentioned on the mailing list that makes the output of this tool not perfect. I do feel that the tests that are randomised typically do a better job at gaining coverage, but its a good idea to stabilise the seed. Jars and build.xml, I have no problems changing this to whatever people think fits best into the build. My impression was that clover is handled the way it is because it is not technically opensource and as a result has screwball licensing concerns, essentially I didn't know any better :S I will try to get a chance to make it use the ivy:cachepath approach. Regarding the risks posed by mutations, I cannot prove or say there are no risks; however mutation testing is not random in the mutations applied, they are formulaic and quite simple. It will not permute arguments nor will it mutate complex objects (it can and does mess with object references turning references in arguments to nulls). I can conceive of ways in which it could screwup mutated code making it possible to delete random files but I don't think they are going to be extremely likely situations. FWIW I would be less worried about this deleting something on the filesystem and far more worried about it accidentally leaving corpses of undeleted files. Sandboxing it could solve that issue, if that is too much effort another approach might be to work with the pitest team and build a security manager that is militant about file access, disallowing anything that canonicalises outside of a given path. Oh and as Robert suggested we can always point it away from key things. At the end of the day its a tool like any other, I have exactly the same feelings as Robert on this This is cool. I'd say lets get it up and going on jenkins (even weekly or something). why worry about the imperfections in any of these coverage tools, whats way more important is when the results find situations where you thought you were testing something, but really arent, etc (here was a recent one found by clover http://svn.apache.org/viewvc?rev=1376722&view=rev ). so imo just another tool to be able to identify serious gaps/test-bugs after things are up and running. and especially looking at deltas from line coverage to identify stuff thats 'executing' but not actually being tested.
          Hide
          Hoss Man added a comment -

          I had never heard of this technique until Greg mentioned it on the list, but the key thing that really impresses me about it is that (from what i can see) it can help us find code whose behavior does not affect the output of the test – this is something no "code coverage" tool like clover and emma can do.

          Clover is great for reporting that when the tests are run, method "bar()" is executed 200 times but "foo()" is never executed at all, but that doesn't tell us anything about whether the success of a test is actually dependent on the results of "bar()" being correct. With these kinds of mutation testing, we will be able to see reports that say "bar() was executed 200 times, but when i munged the result of of bar() it didn't cause any tests to fail" which could be a big help for identifying tests where we assert the results of method calls, but forget to assert the "side effects" of those calls.

          Show
          Hoss Man added a comment - I had never heard of this technique until Greg mentioned it on the list, but the key thing that really impresses me about it is that (from what i can see) it can help us find code whose behavior does not affect the output of the test – this is something no "code coverage" tool like clover and emma can do. Clover is great for reporting that when the tests are run, method "bar()" is executed 200 times but "foo()" is never executed at all, but that doesn't tell us anything about whether the success of a test is actually dependent on the results of "bar()" being correct. With these kinds of mutation testing, we will be able to see reports that say "bar() was executed 200 times, but when i munged the result of of bar() it didn't cause any tests to fail" which could be a big help for identifying tests where we assert the results of method calls, but forget to assert the "side effects" of those calls.
          Hide
          Uwe Schindler added a comment - - edited

          As to the random seed, I completely agree and it was one of the things I mentioned on the mailing list that makes the output of this tool not perfect. I do feel that the tests that are randomised typically do a better job at gaining coverage, but its a good idea to stabilise the seed.

          About the random seed: This is very important for the functionality. Pitest runs the test initially and measures the time it takes. After it patched and inserted bugs into bytecode it runs the test again. The problem is: If you cahnge e.g. a loop constraint from < to >= (one of the mutations), it may happen that suddenly the modified code hangs in an endless loop. Because of that, Pitest stops the test once it takes longer than something like 1.5x of time without modification. This of course needs the original test time to be predictable. If random seed changes, its often hapopens for Lucene tests that it takes 10 times longer than with another seed (e.g. when SimpleText codec is used). In that case Pitest would kill the test although it may fail correctly as expected (maybe only at later time).

          Also with random seeds it may happen that it uses completely different code paths. So with not using same test seed for both runs the results caused by changes in Codecs make the report wrong (I have seen this in some test when inspecting the report).

          So I would set -Dtests.seed to a random number, e.g. by adding this to build.xml before running the test:

          <script language="javascript"><![CDATA[
          if (project.getProperty("tests.seed") == default value|null) {
            var seed = abs(new Random().nextLong()); // must be positive
            var encodedSeed = toFacyDawidWeissSeedFormat(seed);
            project.setProperty("tests.seed", encodedSeed);
          }
          ]]></script>
          

          and build a security manager that is militant about file access, disallowing anything that canonicalises outside of a given path

          Thats a cool idea also for our own tests! We should install a SecurityManager always and only allow files in build/test. LuceneTestCase can enforce this SecurityManager installed! And if a test writes outside, fail it!

          Show
          Uwe Schindler added a comment - - edited As to the random seed, I completely agree and it was one of the things I mentioned on the mailing list that makes the output of this tool not perfect. I do feel that the tests that are randomised typically do a better job at gaining coverage, but its a good idea to stabilise the seed. About the random seed: This is very important for the functionality. Pitest runs the test initially and measures the time it takes. After it patched and inserted bugs into bytecode it runs the test again. The problem is: If you cahnge e.g. a loop constraint from < to >= (one of the mutations), it may happen that suddenly the modified code hangs in an endless loop. Because of that, Pitest stops the test once it takes longer than something like 1.5x of time without modification. This of course needs the original test time to be predictable. If random seed changes, its often hapopens for Lucene tests that it takes 10 times longer than with another seed (e.g. when SimpleText codec is used). In that case Pitest would kill the test although it may fail correctly as expected (maybe only at later time). Also with random seeds it may happen that it uses completely different code paths. So with not using same test seed for both runs the results caused by changes in Codecs make the report wrong (I have seen this in some test when inspecting the report). So I would set -Dtests.seed to a random number, e.g. by adding this to build.xml before running the test: <script language= "javascript" > <![CDATA[ if (project.getProperty( "tests.seed" ) == default value|null) { var seed = abs(new Random().nextLong()); // must be positive var encodedSeed = toFacyDawidWeissSeedFormat(seed); project.setProperty( "tests.seed" , encodedSeed); } ]]> </script> and build a security manager that is militant about file access, disallowing anything that canonicalises outside of a given path Thats a cool idea also for our own tests! We should install a SecurityManager always and only allow files in build/test. LuceneTestCase can enforce this SecurityManager installed! And if a test writes outside, fail it!
          Hide
          Greg Bowyer added a comment -

          Thats a cool idea also for our own tests! We should install a SecurityManager always and only allow files in build/test. LuceneTestCase can enforce this SecurityManager installed! And if a test writes outside, fail it!

          Should we split out that as a separate thing and get a security manager built that hooks into the awesome carrot testing stuffs

          Show
          Greg Bowyer added a comment - Thats a cool idea also for our own tests! We should install a SecurityManager always and only allow files in build/test. LuceneTestCase can enforce this SecurityManager installed! And if a test writes outside, fail it! Should we split out that as a separate thing and get a security manager built that hooks into the awesome carrot testing stuffs
          Hide
          Robert Muir added a comment -

          i think a security manager would be useful for that reason separately.

          also we should ask Dawid if each forked jvm can get its own sandbox'ed java.io.tmpdir in build/test: this would only prevent problems
          and would be nice for various external libs that might write to java.io.tmpdir or whatever.

          Show
          Robert Muir added a comment - i think a security manager would be useful for that reason separately. also we should ask Dawid if each forked jvm can get its own sandbox'ed java.io.tmpdir in build/test: this would only prevent problems and would be nice for various external libs that might write to java.io.tmpdir or whatever.
          Hide
          Greg Bowyer added a comment - - edited

          Following up it turns out to be very simple to do the security manager trick

          import java.io.File;
          
          public class Test {
          
              public static void main(String... args) {
                  System.setSecurityManager(new SecurityManager() {
                      public void checkDelete(String file) throws SecurityException {
                          File fp = new File(file);
                          String path = fp.getAbsolutePath();
          
                          if (!path.startsWith("/tmp")) {
                              throw new SecurityException("Bang!");
                          }
                      }
                  });
          
                  new File("/home/greg/test").delete();
              }
          }
          
          Exception in thread "main" java.lang.SecurityException: Bang!
          	at Test$1.checkDelete(Test.java:12)
          	at java.io.File.delete(File.java:971)
          	at Test.main(Test.java:17)
          

          There is a lot of scope here if you want to abuse checking for all sorts of things (files, sockets, threads etc)

          Show
          Greg Bowyer added a comment - - edited Following up it turns out to be very simple to do the security manager trick import java.io.File; public class Test { public static void main( String ... args) { System .setSecurityManager( new SecurityManager () { public void checkDelete( String file) throws SecurityException { File fp = new File(file); String path = fp.getAbsolutePath(); if (!path.startsWith( "/tmp" )) { throw new SecurityException( "Bang!" ); } } }); new File( "/home/greg/test" ).delete(); } } Exception in thread "main" java.lang.SecurityException: Bang! at Test$1.checkDelete(Test.java:12) at java.io.File.delete(File.java:971) at Test.main(Test.java:17) There is a lot of scope here if you want to abuse checking for all sorts of things (files, sockets, threads etc)
          Hide
          Dawid Weiss added a comment -

          toFacyDawidWeissSeedFormat

          That's just a hex representation of eight bytes (unsigned long), nothing fancy.

          We should install a SecurityManager always and only allow files in build/test

          I didn't cater for the presence of a security manager in the runner and it will probably break things in the runner that will be tough to debug. Just a fair warning. You will probably have to give the runner a policy of being able to do everything and it still may fail to run.

          also we should ask Dawid if each forked jvm can get its own sandbox'ed java.io.tmpdir in build/test:

          They already do I think because tmpdir property is overridden with "." and cwd is set to J0/J1/J2/JN under the test dir.

          Show
          Dawid Weiss added a comment - toFacyDawidWeissSeedFormat That's just a hex representation of eight bytes (unsigned long), nothing fancy. We should install a SecurityManager always and only allow files in build/test I didn't cater for the presence of a security manager in the runner and it will probably break things in the runner that will be tough to debug. Just a fair warning. You will probably have to give the runner a policy of being able to do everything and it still may fail to run. also we should ask Dawid if each forked jvm can get its own sandbox'ed java.io.tmpdir in build/test: They already do I think because tmpdir property is overridden with "." and cwd is set to J0/J1/J2/JN under the test dir.
          Hide
          Robert Muir added a comment -

          They already do I think because tmpdir property is overridden with "." and cwd is set to J0/J1/J2/JN under the test dir.

          OK, cool: its something we never did before that I always thought should be done...

          Show
          Robert Muir added a comment - They already do I think because tmpdir property is overridden with "." and cwd is set to J0/J1/J2/JN under the test dir. OK, cool: its something we never did before that I always thought should be done...
          Greg Bowyer made changes -
          Link This issue incorporates LUCENE-4337 [ LUCENE-4337 ]
          Hide
          Dawid Weiss added a comment -

          OK, cool: its something we never did before that I always thought should be done...

          Yes, I checked. Don't let this confuse you:

          <junit4:junit4
                      dir="@{tempDir}"
                      tempdir="@{tempDir}" 
          

          the "tempdir" is a temporary folder for all the extra files the runner needs (spills, event files, etc.), forked JVMs get their own subfolder. See the "isolateWorkingDirectories" parameter here; it's true by default so omitted:

          http://labs.carrotsearch.com/download/randomizedtesting/2.0.0-SNAPSHOT/docs/junit4-ant/Tasks/junit4.html

          I looked at the common-build.xml file though and I see only this:

                      <!-- Temporary directory in the cwd. -->
                      <sysproperty key="tempDir" value="."/> 
          

          I was wrong then, the default 'java.io.tmpdir' is not overriden here and I think it should. I wrote a small test asking for a File.createTempFile and it did use a global temp (not good).

          Show
          Dawid Weiss added a comment - OK, cool: its something we never did before that I always thought should be done... Yes, I checked. Don't let this confuse you: <junit4:junit4 dir= "@{tempDir}" tempdir= "@{tempDir}" the "tempdir" is a temporary folder for all the extra files the runner needs (spills, event files, etc.), forked JVMs get their own subfolder. See the "isolateWorkingDirectories" parameter here; it's true by default so omitted: http://labs.carrotsearch.com/download/randomizedtesting/2.0.0-SNAPSHOT/docs/junit4-ant/Tasks/junit4.html I looked at the common-build.xml file though and I see only this: <!-- Temporary directory in the cwd. --> <sysproperty key= "tempDir" value= "." /> I was wrong then, the default 'java.io.tmpdir' is not overriden here and I think it should. I wrote a small test asking for a File.createTempFile and it did use a global temp (not good).
          Hide
          Robert Muir added a comment -

          I didn't cater for the presence of a security manager in the runner and it will probably break things in the runner that will be tough to debug. Just a fair warning. You will probably have to give the runner a policy of being able to do everything and it still may fail to run.

          Well if its for test purposes only and not enforcing actual security, we should be able to give the runner a nice backdoor (e.g. static boolean BACKDOORED) if
          we really need to right?

          Show
          Robert Muir added a comment - I didn't cater for the presence of a security manager in the runner and it will probably break things in the runner that will be tough to debug. Just a fair warning. You will probably have to give the runner a policy of being able to do everything and it still may fail to run. Well if its for test purposes only and not enforcing actual security, we should be able to give the runner a nice backdoor (e.g. static boolean BACKDOORED) if we really need to right?
          Hide
          Dawid Weiss added a comment -

          As for generating a common seed for all runs – if you slurp junit4 taskdefs at the main level somewhere then you can do:

                  <!-- Pick the random seed now (unless already set). -->
                  <junit4:pickseed property="tests.seed" /> 
          

          and as long as this property is passed to subants it will remain the same. But you can just as well just generate it, even from the shell level:

          openssl rand -hex 8
          
          Show
          Dawid Weiss added a comment - As for generating a common seed for all runs – if you slurp junit4 taskdefs at the main level somewhere then you can do: <!-- Pick the random seed now (unless already set). --> <junit4:pickseed property= "tests.seed" /> and as long as this property is passed to subants it will remain the same. But you can just as well just generate it, even from the shell level: openssl rand -hex 8
          Hide
          Dawid Weiss added a comment -

          Well if its for test purposes only and not enforcing actual security, we should be able to give the runner a nice backdoor (e.g. static boolean BACKDOORED) if we really need to right?

          I don't know what it will do and how it will fail. I never run with a security manager, I think the design of SM is complex, weird and poorly documented (as evidenced by the security vuln. published today...). I just don't think of it much because I know very few people that actually run with a SM enabled. I am fairly sure I do many things in the code that I know won't work with an installed SM – scanning properties, opening files, etc. I'm sure it is possible to configure it do grant permissions to certain packages but I've no idea how to do it.

          I'm not saying no to the idea, but I'm just letting you know that things may break in unexpected ways and I will have limited time to learn SM internals... Not that I'm especially looking forward to that.

          Show
          Dawid Weiss added a comment - Well if its for test purposes only and not enforcing actual security, we should be able to give the runner a nice backdoor (e.g. static boolean BACKDOORED) if we really need to right? I don't know what it will do and how it will fail. I never run with a security manager, I think the design of SM is complex, weird and poorly documented (as evidenced by the security vuln. published today...). I just don't think of it much because I know very few people that actually run with a SM enabled. I am fairly sure I do many things in the code that I know won't work with an installed SM – scanning properties, opening files, etc. I'm sure it is possible to configure it do grant permissions to certain packages but I've no idea how to do it. I'm not saying no to the idea, but I'm just letting you know that things may break in unexpected ways and I will have limited time to learn SM internals... Not that I'm especially looking forward to that.
          Hide
          Robert Muir added a comment -

          I agree with you, i dont know anything about security managers either. But it seems like we could use such a thing to find test bugs.

          Of course our security manager would have vulnerabilities (possibly introduced intentionally in case we need to backdoor it). But this
          is more like locking your front door so it won't blow open when its windy outside.

          Show
          Robert Muir added a comment - I agree with you, i dont know anything about security managers either. But it seems like we could use such a thing to find test bugs. Of course our security manager would have vulnerabilities (possibly introduced intentionally in case we need to backdoor it). But this is more like locking your front door so it won't blow open when its windy outside.
          Hide
          Uwe Schindler added a comment -

          You would need a simple Security manager that allows all, except creating/deleting some files outside a working directory. We can check this out (by enabling a noop security manager), if it then still passes we are fine.

          Then we can go forward. The security manager would only be activated by LTC before after a test class and then disabled again. I would not restrict file access too much. For Pitest, I would only disallow everything outside working directory root and later disable more. I think a simple restriction to build/ dir would also help to prevent solr creating files in test-files src folder.

          An all-allowing security manager is easy, template is available to extend. The problem you are talking about are too complex security restrictions dictated by J2EE (that limit things like creating classes or classloader) - we dont want to do this, we only want a hook into (new FileOutputStream) and want to throw an Exception on wrong path. If you allow all other security manager requests, there is no issue.

          Show
          Uwe Schindler added a comment - You would need a simple Security manager that allows all, except creating/deleting some files outside a working directory. We can check this out (by enabling a noop security manager), if it then still passes we are fine. Then we can go forward. The security manager would only be activated by LTC before after a test class and then disabled again. I would not restrict file access too much. For Pitest, I would only disallow everything outside working directory root and later disable more. I think a simple restriction to build/ dir would also help to prevent solr creating files in test-files src folder. An all-allowing security manager is easy, template is available to extend. The problem you are talking about are too complex security restrictions dictated by J2EE (that limit things like creating classes or classloader) - we dont want to do this, we only want a hook into (new FileOutputStream) and want to throw an Exception on wrong path. If you allow all other security manager requests, there is no issue.
          Hide
          Greg Bowyer added a comment -

          I can codify a security manager, they are somewhat complex but I see our needs here as very simple (essentially assert file paths)

          Show
          Greg Bowyer added a comment - I can codify a security manager, they are somewhat complex but I see our needs here as very simple (essentially assert file paths)
          Hide
          Dawid Weiss added a comment -

          Sure, all I'm saying is that I know very little about SMs so if something breaks I'll just stare at you blindly

          Show
          Dawid Weiss added a comment - Sure, all I'm saying is that I know very little about SMs so if something breaks I'll just stare at you blindly
          Hide
          Greg Bowyer added a comment -

          New version

          • uses ivy:cachepath as suggested
          • removed random license files
          • leverages security manager (LUCENE-4337)
          • uses fixed random seed
          Show
          Greg Bowyer added a comment - New version uses ivy:cachepath as suggested removed random license files leverages security manager ( LUCENE-4337 ) uses fixed random seed
          Greg Bowyer made changes -
          Hide
          Dawid Weiss added a comment -

          Greg, what's this responsible for:

          +  <property name="pitest.threads" value="2" />
          

          Is this test execution concurrency at JVM level? If so, it needs to be set to "1" because tests won't support concurrent suites (too many static stuff lying around).

          +        <!-- Java7 has a new bytecode verifier that _requires_ stackmaps to be present in the
          +             bytecode. Unfortunately a *lot* of bytecode manipulation tools (including pitest)
          +             do not currently write out this information; for now we are forced to disable this
          +             in the jvm args -->
          

          Both proguard and asmlib produce these I think. Just a note, I know it's probably not possible to integrate in pi's toolchain without changes to its source code.

          +        <script language="javascript"><![CDATA[
          

          Will this require people to put rhino in ant's lib/ folder? If so, I'd just use an ugly property value="-D..=${} -D..=${}..". Yours is very nice but an additional step is required which seems an overkill?

          Show
          Dawid Weiss added a comment - Greg, what's this responsible for: + <property name= "pitest.threads" value= "2" /> Is this test execution concurrency at JVM level? If so, it needs to be set to "1" because tests won't support concurrent suites (too many static stuff lying around). + <!-- Java7 has a new bytecode verifier that _requires_ stackmaps to be present in the + bytecode. Unfortunately a *lot* of bytecode manipulation tools (including pitest) + do not currently write out this information; for now we are forced to disable this + in the jvm args --> Both proguard and asmlib produce these I think. Just a note, I know it's probably not possible to integrate in pi's toolchain without changes to its source code. + <script language= "javascript" ><![CDATA[ Will this require people to put rhino in ant's lib/ folder? If so, I'd just use an ugly property value="-D..=${} -D..=${}..". Yours is very nice but an additional step is required which seems an overkill?
          Hide
          Greg Bowyer added a comment -

          Greg, what's this responsible for:

          +  <property name="pitest.threads" value="2" />
          Is this test execution concurrency at JVM level? If so, it needs to be set to "1" because tests won't support concurrent suites (too many static stuff lying around).
          

          Its a number of runner vms used by pitest (I think) so in essence it should spawn two vms to run test through (but let me check I might have that wrong)

          +        <!-- Java7 has a new bytecode verifier that _requires_ stackmaps to be present in the
          +             bytecode. Unfortunately a *lot* of bytecode manipulation tools (including pitest)
          +             do not currently write out this information; for now we are forced to disable this
          +             in the jvm args -->
          

          Both proguard and asmlib produce these I think. Just a note, I know it's probably not possible to integrate in pi's toolchain without changes to its source code.

          Humm weird I thought I had problems with the verifier but now that I just tried it, it worked - Guess I will be taking that out then

          Will this require people to put rhino in ant's lib/ folder? If so, I'd just use an ugly property value="-D..=${} -D..=${}..". Yours is very nice but an additional step is required which seems an overkill?

          Might do, depends on the JVM vendor / version - I will change it to the ugly

          Show
          Greg Bowyer added a comment - Greg, what's this responsible for: + <property name= "pitest.threads" value= "2" /> Is this test execution concurrency at JVM level? If so, it needs to be set to "1" because tests won't support concurrent suites (too many static stuff lying around). Its a number of runner vms used by pitest (I think) so in essence it should spawn two vms to run test through (but let me check I might have that wrong) + <!-- Java7 has a new bytecode verifier that _requires_ stackmaps to be present in the + bytecode. Unfortunately a *lot* of bytecode manipulation tools (including pitest) + do not currently write out this information; for now we are forced to disable this + in the jvm args --> Both proguard and asmlib produce these I think. Just a note, I know it's probably not possible to integrate in pi's toolchain without changes to its source code. Humm weird I thought I had problems with the verifier but now that I just tried it, it worked - Guess I will be taking that out then Will this require people to put rhino in ant's lib/ folder? If so, I'd just use an ugly property value="-D..=${} -D..=${}..". Yours is very nice but an additional step is required which seems an overkill? Might do, depends on the JVM vendor / version - I will change it to the ugly
          Hide
          Dawid Weiss added a comment -

          Its a number of runner vms used by pitest (I think)

          If so then we're probably fine because that's what we do with ant-junit4 task too, thanks.

          Show
          Dawid Weiss added a comment - Its a number of runner vms used by pitest (I think) If so then we're probably fine because that's what we do with ant-junit4 task too, thanks.
          Hide
          Uwe Schindler added a comment - - edited

          Will this require people to put rhino in ant's lib/ folder? If so, I'd just use an ugly property value="-D..=${} -D..=${}..". Yours is very nice but an additional step is required which seems an overkill?

          We already have a lot of script tasks in our build files (e.g. generating website using ). Javascript is shipped with every known JVM. - And pitest is only run by few people, much more call tasks like "root/lucene$> ant documentation" or "root$> ant svn-check-working-copy", so this is no issue here.

          Show
          Uwe Schindler added a comment - - edited Will this require people to put rhino in ant's lib/ folder? If so, I'd just use an ugly property value="-D..=${} -D..=${}..". Yours is very nice but an additional step is required which seems an overkill? We already have a lot of script tasks in our build files (e.g. generating website using ). Javascript is shipped with every known JVM. - And pitest is only run by few people, much more call tasks like "root/lucene$> ant documentation" or "root$> ant svn-check-working-copy", so this is no issue here.
          Hide
          Dawid Weiss added a comment -

          Javascript is shipped with every known JVM

          Umm, really? I didn't know about it, I thought a separate download is needed. I know the scripting API is part of JRE but is an implementation also embedded?

          Show
          Dawid Weiss added a comment - Javascript is shipped with every known JVM Umm, really? I didn't know about it, I thought a separate download is needed. I know the scripting API is part of JRE but is an implementation also embedded?
          Hide
          Dawid Weiss added a comment -

          Indeed, just checked, sorry for misdirecting, Greg, I didn't know js impl. is part of the default platform.

          Show
          Dawid Weiss added a comment - Indeed, just checked, sorry for misdirecting, Greg, I didn't know js impl. is part of the default platform.
          Hide
          Uwe Schindler added a comment -

          Sun/Oracle, IBM J9 and jRockit contain it. Only the FreeBSD one is missing it because of Licensing reasons, but thats a well-known bug to be fixed soon (so on ASF Jenkins its installed as separate package in JRE_PATH/lib/ext" or like that).

          The major drawback of the sjhipped javascript version is that it is trimmed down and produces no bytecode out of Javascript, it interprets only. But that is no issue here.

          The Java 6 specs are a little bit unclear, but they say something like (unfortunately you have to download as ZIP file): A "default" scripting implementation is required to be shipped with any Java 6 JDK, but it is not defined which one. Oracle officially says "javascript (Rhino) is shipped with OpenJDK and JDK", so all other JVM vendors would be stupid to ship with another default one. So, let's assume its there, otherwise you have a problem building the release package of Lucene. It is similar to the case that JDK contains implementations for XML processing, but the language spec only enforce the API to be there and SPI to load them. Which XML processor is shipped depends on developer, but there must be one.

          Another (bloated) version of doing that is using ivy:cachepath to download and add a classpathref= to the <script/> element, but then you will at some time hit the stupid ANT permgen issue (if you have many <script/> elements).

          Show
          Uwe Schindler added a comment - Sun/Oracle, IBM J9 and jRockit contain it. Only the FreeBSD one is missing it because of Licensing reasons, but thats a well-known bug to be fixed soon (so on ASF Jenkins its installed as separate package in JRE_PATH/lib/ext" or like that). The major drawback of the sjhipped javascript version is that it is trimmed down and produces no bytecode out of Javascript, it interprets only. But that is no issue here. The Java 6 specs are a little bit unclear, but they say something like (unfortunately you have to download as ZIP file): A "default" scripting implementation is required to be shipped with any Java 6 JDK, but it is not defined which one. Oracle officially says "javascript (Rhino) is shipped with OpenJDK and JDK", so all other JVM vendors would be stupid to ship with another default one. So, let's assume its there, otherwise you have a problem building the release package of Lucene. It is similar to the case that JDK contains implementations for XML processing, but the language spec only enforce the API to be there and SPI to load them. Which XML processor is shipped depends on developer, but there must be one. Another (bloated) version of doing that is using ivy:cachepath to download and add a classpathref= to the <script/> element, but then you will at some time hit the stupid ANT permgen issue (if you have many <script/> elements).
          Hide
          Uwe Schindler added a comment -

          junit4:pickseed is cool. I did not know that this is available to ant! Thanks Dawid!

          Show
          Uwe Schindler added a comment - junit4:pickseed is cool. I did not know that this is available to ant! Thanks Dawid!
          Hide
          Greg Bowyer added a comment -

          I found and fixed a few things relating to security policies (ha!)

          Also I thought about the javascript a bit, I know thats its standard in the JVM but when I tested on a 1.6 I found that my javascript was a bit too advanced for it.

          That made me pause for thought and realise that I might have been a bit too clever for my own good there; I am not all that happy with the the JS so I went with the ugly approach for now.

          It think the right solution is to work with the pitest people and get it to support junit style nested tags like sysproperty

          Show
          Greg Bowyer added a comment - I found and fixed a few things relating to security policies (ha!) Also I thought about the javascript a bit, I know thats its standard in the JVM but when I tested on a 1.6 I found that my javascript was a bit too advanced for it. That made me pause for thought and realise that I might have been a bit too clever for my own good there; I am not all that happy with the the JS so I went with the ugly approach for now. It think the right solution is to work with the pitest people and get it to support junit style nested tags like sysproperty
          Greg Bowyer made changes -
          Hide
          Dawid Weiss added a comment -

          junit4:pickseed is cool. I did not know that this is available to ant! Thanks Dawid!

          No problem, it was needed to assure reproducibility because we pick file.encoding randomly and it should use the same seed. Had I known about the fact javascript is embedded in every JDK I'd probably use that instead

          Show
          Dawid Weiss added a comment - junit4:pickseed is cool. I did not know that this is available to ant! Thanks Dawid! No problem, it was needed to assure reproducibility because we pick file.encoding randomly and it should use the same seed. Had I known about the fact javascript is embedded in every JDK I'd probably use that instead
          Hide
          Greg Bowyer added a comment -

          Do we want to implement this now that Uwe's changes are in 4.0 / trunk ?

          Show
          Greg Bowyer added a comment - Do we want to implement this now that Uwe's changes are in 4.0 / trunk ?
          Hide
          Robert Muir added a comment -

          +1

          Show
          Robert Muir added a comment - +1
          Hide
          Greg Bowyer added a comment -

          committed to trunk

          Sending build.xml
          Sending lucene/build.xml
          Sending lucene/common-build.xml
          Sending lucene/tools/build.xml
          Sending lucene/tools/junit4/tests.policy
          Sending solr/build.xml
          Sending solr/example/build.xml
          Sending solr/example/example-DIH/build.xml
          Transmitting file data ........
          Committed revision 1380938.

          and branch_4x

          Sending build.xml
          Sending lucene/build.xml
          Sending lucene/common-build.xml
          Sending lucene/tools/build.xml
          Sending lucene/tools/junit4/tests.policy
          Sending solr/build.xml
          Sending solr/example/build.xml
          Sending solr/example/example-DIH/build.xml
          Transmitting file data ........
          Committed revision 1380937.

          Show
          Greg Bowyer added a comment - committed to trunk Sending build.xml Sending lucene/build.xml Sending lucene/common-build.xml Sending lucene/tools/build.xml Sending lucene/tools/junit4/tests.policy Sending solr/build.xml Sending solr/example/build.xml Sending solr/example/example-DIH/build.xml Transmitting file data ........ Committed revision 1380938. and branch_4x Sending build.xml Sending lucene/build.xml Sending lucene/common-build.xml Sending lucene/tools/build.xml Sending lucene/tools/junit4/tests.policy Sending solr/build.xml Sending solr/example/build.xml Sending solr/example/example-DIH/build.xml Transmitting file data ........ Committed revision 1380937.
          Hide
          Greg Bowyer added a comment -

          I guess now we need to figure out how and when to get jenkins to run this .. any thoughts ?

          Show
          Greg Bowyer added a comment - I guess now we need to figure out how and when to get jenkins to run this .. any thoughts ?
          Hide
          Uwe Schindler added a comment -

          Thanks for committing!

          Just curious: why were the additional permissions needed? Because tests should not behave different with pitest. I don't like SecurityPermission now added with *.

          Show
          Uwe Schindler added a comment - Thanks for committing! Just curious: why were the additional permissions needed? Because tests should not behave different with pitest. I don't like SecurityPermission now added with *.
          Hide
          Greg Bowyer added a comment -

          Pitest forks surrogate runner jvms, when these things bootup they do some socket related nonsense for communication (SecurityPermission), and set on thread time monitoring (ManagementPermission)

          SecurityPermission hides out in lots of strange places, one of these is in the DNS cache internal to a JVM.

          We could restrict it, I didn't for ease of configuration since the security manager is aimed at preventing none malicious mistakes (like writing outside of the sandbox) rather than as a full force prevention of malicious code (which would require a bit more thinking through IMHO)

          Show
          Greg Bowyer added a comment - Pitest forks surrogate runner jvms, when these things bootup they do some socket related nonsense for communication (SecurityPermission), and set on thread time monitoring (ManagementPermission) SecurityPermission hides out in lots of strange places, one of these is in the DNS cache internal to a JVM. We could restrict it, I didn't for ease of configuration since the security manager is aimed at preventing none malicious mistakes (like writing outside of the sandbox) rather than as a full force prevention of malicious code (which would require a bit more thinking through IMHO)
          Hide
          Uwe Schindler added a comment - - edited
            permission java.security.SecurityPermission "*", "read,write";
          

          This makes no sense, as SecurityPermission has no "action", so "read,write" should be ignored. I was restricting SecurityPermission with something in mind (see the last 2 lines that allowed only the BouncyCastle installed by TIKA - now everything is allowed). What fails if I remove that line? I have no time to run the whole pitest suite....

          rather than as a full force prevention of malicious code

          The idea was to find places (especially in TIKA) that do things they should not do (like enabling security providers), which makes the configuration of J2EE container hosting Solr hard. So we should limit all this, to see when somebody adds a "new feature" to Solr that needs additional permissions.

          I am already working on restricting "RuntimePermission" more, so only things like "reflection" and "property access" is allowed.

          Show
          Uwe Schindler added a comment - - edited permission java.security.SecurityPermission "*" , "read,write" ; This makes no sense, as SecurityPermission has no "action", so "read,write" should be ignored. I was restricting SecurityPermission with something in mind (see the last 2 lines that allowed only the BouncyCastle installed by TIKA - now everything is allowed). What fails if I remove that line? I have no time to run the whole pitest suite.... rather than as a full force prevention of malicious code The idea was to find places (especially in TIKA) that do things they should not do (like enabling security providers), which makes the configuration of J2EE container hosting Solr hard. So we should limit all this, to see when somebody adds a "new feature" to Solr that needs additional permissions. I am already working on restricting "RuntimePermission" more, so only things like "reflection" and "property access" is allowed.
          Hide
          Greg Bowyer added a comment -

          permission java.security.SecurityPermission "*", "read,write";
          This makes no sense, as SecurityPermission has no "action", so "read,write" should be ignored. I was restricting SecurityPermission with something in mind (see the last 2 lines that allowed only the BouncyCastle installed by TIKA - now everything is allowed). What fails if I remove that line? I have no time to run the whole pitest suite....

          You are right on that, I will change it

          The idea was to find places (especially in TIKA) that do things they should not do (like enabling security providers), which makes the configuration of J2EE container hosting Solr hard. So we should limit all this, to see when somebody adds a "new feature" to Solr that needs additional permissions.
          I am already working on restricting "RuntimePermission" more, so only things like "reflection" and "property access" is allowed.

          Ok the intention changed a fair bit, I was still under the impression that we were targeting making this keep tests in a sandbox rather than helping solr with hosting inside complex J2EE arrangements

          Show
          Greg Bowyer added a comment - permission java.security.SecurityPermission "*", "read,write"; This makes no sense, as SecurityPermission has no "action", so "read,write" should be ignored. I was restricting SecurityPermission with something in mind (see the last 2 lines that allowed only the BouncyCastle installed by TIKA - now everything is allowed). What fails if I remove that line? I have no time to run the whole pitest suite.... You are right on that, I will change it The idea was to find places (especially in TIKA) that do things they should not do (like enabling security providers), which makes the configuration of J2EE container hosting Solr hard. So we should limit all this, to see when somebody adds a "new feature" to Solr that needs additional permissions. I am already working on restricting "RuntimePermission" more, so only things like "reflection" and "property access" is allowed. Ok the intention changed a fair bit, I was still under the impression that we were targeting making this keep tests in a sandbox rather than helping solr with hosting inside complex J2EE arrangements
          Hide
          Greg Bowyer added a comment -

          Ok the security permission stuff is tightened up for just the internal jvm cache

          basically it is as follows

          // Needed for some things in DNS caching in the JVM
          permission java.security.SecurityPermission "getProperty.networkaddress.cache.ttl";
          permission java.security.SecurityPermission "getProperty.networkaddress.cache.negative.ttl";
          

          branch_4x
          Sending lucene/tools/junit4/tests.policy
          Transmitting file data .
          Committed revision 1381046.

          trunk
          Sending lucene/tools/junit4/tests.policy
          Transmitting file data .
          Committed revision 1381047.
          greg@gregslaptop ~/project

          Show
          Greg Bowyer added a comment - Ok the security permission stuff is tightened up for just the internal jvm cache basically it is as follows // Needed for some things in DNS caching in the JVM permission java.security.SecurityPermission "getProperty.networkaddress.cache.ttl" ; permission java.security.SecurityPermission "getProperty.networkaddress.cache.negative.ttl" ; branch_4x Sending lucene/tools/junit4/tests.policy Transmitting file data . Committed revision 1381046. trunk Sending lucene/tools/junit4/tests.policy Transmitting file data . Committed revision 1381047. greg@gregslaptop ~/project
          Hide
          Greg Bowyer added a comment -

          This was done a while back, I should setup something to actually publish the results out

          Show
          Greg Bowyer added a comment - This was done a while back, I should setup something to actually publish the results out
          Greg Bowyer made changes -
          Status Open [ 1 ] Closed [ 6 ]
          Resolution Fixed [ 1 ]

            People

            • Assignee:
              Greg Bowyer
              Reporter:
              Greg Bowyer
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development