Details

    • Type: Improvement Improvement
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: general/build
    • Labels:
    • Lucene Fields:
      New

      Description

      This has been touched on a few times over the years. Having static analysis as part of our build seems like a big win. For example, we could use PMD to look at System.out.println statements like discussed in LUCENE-3877 and we could possibly incorporate the nocommit / @author checks as well.

      There are a few things to work out as part of this:

      • Should we use both PMD and FindBugs or just one of them? They look at code from different perspectives (bytecode vs source code) and target different issues. At the moment I'm in favour of trying both but that might be too heavy handed for our needs.
      • What checks should we use? There's no point having the analysis if it's going to raise too many false-positives or problems we don't deem problematic.
      • How should the analysis be integrated in our build? Need to work out when the analysis should run, how it should be incorporated in Ant and/or Maven, what impact errors should have.
      1. solr-core.html
        382 kB
        Chris Male
      2. LUCENE-3973.patch
        0.9 kB
        Chris Male
      3. LUCENE-3973.patch
        2 kB
        Chris Male
      4. LUCENE-3973.patch
        2 kB
        Chris Male
      5. LUCENE-3973.patch
        3 kB
        Robert Muir
      6. LUCENE-3973.patch
        5 kB
        Chris Male
      7. core.html
        472 kB
        Chris Male

        Issue Links

          Activity

          Hide
          Robert Muir added a comment -

          Should we use both PMD and FindBugs or just one of them?

          +1 for PMD. I'm only looking at the license here.

          Show
          Robert Muir added a comment - Should we use both PMD and FindBugs or just one of them? +1 for PMD. I'm only looking at the license here.
          Hide
          Dawid Weiss added a comment -

          Both are helpful. We use both and I think FindBugs is slightly more useful than PMD but it's just a subjective opinion not anything I measured.

          Also, both can be verbose and a pain in the ass at times when you know the code is right and they still complain... And they are long to execute so they should be part of jenkins nightly/ smoke tests I think, not regular builds (and definitely not ant test...).

          Show
          Dawid Weiss added a comment - Both are helpful. We use both and I think FindBugs is slightly more useful than PMD but it's just a subjective opinion not anything I measured. Also, both can be verbose and a pain in the ass at times when you know the code is right and they still complain... And they are long to execute so they should be part of jenkins nightly/ smoke tests I think, not regular builds (and definitely not ant test...).
          Hide
          Dawid Weiss added a comment -

          There is also this interesting tool: http://babelfish.arc.nasa.gov/trac/jpf

          I haven't used it and I don't know if it can handle Lucene size codebase (the number of execution paths will be astronomic) but if somebody has some time to play with it, it'd be interesting to hear what it can do.

          Show
          Dawid Weiss added a comment - There is also this interesting tool: http://babelfish.arc.nasa.gov/trac/jpf I haven't used it and I don't know if it can handle Lucene size codebase (the number of execution paths will be astronomic) but if somebody has some time to play with it, it'd be interesting to hear what it can do.
          Hide
          Hoss Man added a comment -

          How should the analysis be integrated in our build? Need to work out when the analysis should run, how it should be incorporated in Ant and/or Maven, what impact errors should have.

          i would suggest going about it incrementally...

          • hook into build.xml as optional targets that can be run if you have the neccessary libs installed, don't fail the build just generate the XML report files
          • put the needed libs on builds.apache.org, and hook it into the jenkins nightly target, and configure jenkins to display it's pretty version of the xml reports so people can at least see what's going on.
          • start adding/tweaking custom rule sets in dev-tools to eliminate rules we don't care about, add rules we want that don't exist, or change the severity of rules we think are more/less important
          • tweak the build.xml to fail if anything above some arbitrary severity is tripped
          • worry about maven
          Show
          Hoss Man added a comment - How should the analysis be integrated in our build? Need to work out when the analysis should run, how it should be incorporated in Ant and/or Maven, what impact errors should have. i would suggest going about it incrementally... hook into build.xml as optional targets that can be run if you have the neccessary libs installed, don't fail the build just generate the XML report files put the needed libs on builds.apache.org, and hook it into the jenkins nightly target, and configure jenkins to display it's pretty version of the xml reports so people can at least see what's going on. start adding/tweaking custom rule sets in dev-tools to eliminate rules we don't care about, add rules we want that don't exist, or change the severity of rules we think are more/less important tweak the build.xml to fail if anything above some arbitrary severity is tripped worry about maven
          Hide
          Hoss Man added a comment -
          • start adding/tweaking custom rule sets in dev-tools to eliminate rules we don't care about, add rules we want that don't exist, or change the severity of rules we think are more/less important
          • tweak the build.xml to fail if anything above some arbitrary severity is tripped

          Once upon a time i actually implemented this for PMD in solr's (pre merged trunk) build.xml, but it got bogged down by some more serious problems in our automated builds and forgotten about (the only other issue i remember was the XSLT i was using to pretty-display the PMD results was GPL or something – but these days i would say just let the jenkins PMD plugin render the results, and local devs can read the XML file)

          anyway ... the patch in SOLR-143 might be helpful for the "run lots of checks, but only fail the build for serious ones" idea.

          Show
          Hoss Man added a comment - start adding/tweaking custom rule sets in dev-tools to eliminate rules we don't care about, add rules we want that don't exist, or change the severity of rules we think are more/less important tweak the build.xml to fail if anything above some arbitrary severity is tripped Once upon a time i actually implemented this for PMD in solr's (pre merged trunk) build.xml, but it got bogged down by some more serious problems in our automated builds and forgotten about (the only other issue i remember was the XSLT i was using to pretty-display the PMD results was GPL or something – but these days i would say just let the jenkins PMD plugin render the results, and local devs can read the XML file) anyway ... the patch in SOLR-143 might be helpful for the "run lots of checks, but only fail the build for serious ones" idea.
          Hide
          Dawid Weiss added a comment -

          I believe both pmd and findbugs are on maven repos so one could use ivy to fetch them automatically. One thing less to think about.

          Show
          Dawid Weiss added a comment - I believe both pmd and findbugs are on maven repos so one could use ivy to fetch them automatically. One thing less to think about.
          Hide
          Hoss Man added a comment -

          I believe both pmd and findbugs are on maven repos so one could use ivy to fetch them automatically. One thing less to think about.

          Unless you run into the same taskdef/classloader/sub-build/permgen-OOM problem we had with clover, and the maven-ant-tasks, and ivy that have prevented us from doing the same thing with them.

          Show
          Hoss Man added a comment - I believe both pmd and findbugs are on maven repos so one could use ivy to fetch them automatically. One thing less to think about. Unless you run into the same taskdef/classloader/sub-build/permgen-OOM problem we had with clover, and the maven-ant-tasks, and ivy that have prevented us from doing the same thing with them.
          Hide
          Dawid Weiss added a comment -

          Unless you run into the same taskdef/classloader/sub-build/permgen-OOM

          I was just saying to fetch them via ivy and then spawn a separate jvm to run them, much like you'd do anyway if they are separate installations.

          Besides – we already have an 'ivy warning with instructions', the same can be done with permgen/OOM problems – detect the current (ANT's) VM's settings (can be done via mx bean) and warn/ fail the build if the defaults are too low, instructing the user to set up ANT_OPTS properly...

          I'm not pressing on this, this is a no-issue.

          Show
          Dawid Weiss added a comment - Unless you run into the same taskdef/classloader/sub-build/permgen-OOM I was just saying to fetch them via ivy and then spawn a separate jvm to run them, much like you'd do anyway if they are separate installations. Besides – we already have an 'ivy warning with instructions', the same can be done with permgen/OOM problems – detect the current (ANT's) VM's settings (can be done via mx bean) and warn/ fail the build if the defaults are too low, instructing the user to set up ANT_OPTS properly... I'm not pressing on this, this is a no-issue.
          Hide
          Chris Male added a comment -

          Simple POC patch which adds a pmd target to lucene/common-build.xml. It currently uses ivy:cachepath (but I think I prefer the way Hoss suggests).

          I will attach some example results from analyzing lucene/solr core.

          In our build system, is it possible to run a target for every module? I don't really want to latch this onto compile like clover.

          Still much to do but just showing where things are going.

          Show
          Chris Male added a comment - Simple POC patch which adds a pmd target to lucene/common-build.xml. It currently uses ivy:cachepath (but I think I prefer the way Hoss suggests). I will attach some example results from analyzing lucene/solr core. In our build system, is it possible to run a target for every module? I don't really want to latch this onto compile like clover. Still much to do but just showing where things are going.
          Hide
          Chris Male added a comment -

          HTML results for lucene core (core) and solr core (solr-core).

          Show
          Chris Male added a comment - HTML results for lucene core (core) and solr core (solr-core).
          Hide
          Chris Male added a comment -

          Actually I could probably make this a top-level target and do all modules at once.

          Show
          Chris Male added a comment - Actually I could probably make this a top-level target and do all modules at once.
          Hide
          Chris Male added a comment -

          Patch which runs PMD much like RAT. This uses ivy:cachepath still and I haven't seen any OOMs.

          Since this doesn't have any impact on the build unless someone explicitly runs it, I'd like to commit this and then begin to tweak.

          Show
          Chris Male added a comment - Patch which runs PMD much like RAT. This uses ivy:cachepath still and I haven't seen any OOMs. Since this doesn't have any impact on the build unless someone explicitly runs it, I'd like to commit this and then begin to tweak.
          Hide
          Chris Male added a comment -

          The patch places all the HTML files under lucene/build/pmd. An alternative approach (and what I was doing before) is to have a pmd dir under each module's build dir.

          Show
          Chris Male added a comment - The patch places all the HTML files under lucene/build/pmd. An alternative approach (and what I was doing before) is to have a pmd dir under each module's build dir.
          Hide
          Chris Male added a comment -

          Improved patch which connects the PMD tasks to the ivy checks we have.

          Show
          Chris Male added a comment - Improved patch which connects the PMD tasks to the ivy checks we have.
          Hide
          Robert Muir added a comment -

          The patch places all the HTML files under lucene/build/pmd. An alternative approach (and what I was doing before) is to have a pmd dir under each module's build dir.

          We need to be careful about where these results go: it could affect packaging tasks.

          Show
          Robert Muir added a comment - The patch places all the HTML files under lucene/build/pmd. An alternative approach (and what I was doing before) is to have a pmd dir under each module's build dir. We need to be careful about where these results go: it could affect packaging tasks.
          Hide
          Chris Male added a comment -

          We need to be careful about where these results go: it could affect packaging tasks.

          Good point. Then having a single folder for all the results which is outside of the module build dirs is probably safest. The IDE setups do the same thing.

          Show
          Chris Male added a comment - We need to be careful about where these results go: it could affect packaging tasks. Good point. Then having a single folder for all the results which is outside of the module build dirs is probably safest. The IDE setups do the same thing.
          Hide
          Chris Male added a comment -

          One thing I also notice is that we probably need to exclude generated classes from those examined. They obviously don't contain the nicest looking code and then to pollute the results, especially for analyzers-common.

          Show
          Chris Male added a comment - One thing I also notice is that we probably need to exclude generated classes from those examined. They obviously don't contain the nicest looking code and then to pollute the results, especially for analyzers-common.
          Hide
          Robert Muir added a comment -

          updated patch with 2 tweaks:

          1. explicitly exclude pmd/ from binary dist patterns (just in case!)
          2. only download jars for pmd (not source and javadocs, they are massive)
          Show
          Robert Muir added a comment - updated patch with 2 tweaks: explicitly exclude pmd/ from binary dist patterns (just in case!) only download jars for pmd (not source and javadocs, they are massive)
          Hide
          Robert Muir added a comment -

          Good point. Then having a single folder for all the results which is outside of the module build dirs is probably safest.

          I think its pretty safe really. I just have no idea what PMD produces, and don't want anything sucked in on accident

          I don't think anything would have been included, but better to be paranoid instead.

          Show
          Robert Muir added a comment - Good point. Then having a single folder for all the results which is outside of the module build dirs is probably safest. I think its pretty safe really. I just have no idea what PMD produces, and don't want anything sucked in on accident I don't think anything would have been included, but better to be paranoid instead.
          Hide
          Robert Muir added a comment -

          One thing I also notice is that we probably need to exclude generated classes from those examined. They obviously don't contain the nicest looking code and then to pollute the results, especially for analyzers-common.

          Well, we can worry about that later I guess... you already have a pmd.excludes so if we want to tweak things like
          that we could just define pmd.excludes in analyzers-common or whatever.

          Show
          Robert Muir added a comment - One thing I also notice is that we probably need to exclude generated classes from those examined. They obviously don't contain the nicest looking code and then to pollute the results, especially for analyzers-common. Well, we can worry about that later I guess... you already have a pmd.excludes so if we want to tweak things like that we could just define pmd.excludes in analyzers-common or whatever.
          Hide
          Chris Male added a comment -

          Yeah definitely, thats why I included that option.

          Show
          Chris Male added a comment - Yeah definitely, thats why I included that option.
          Hide
          Chris Male added a comment -

          Updated patch which includes:

          • Moved listing of the rules to their own ruleset file, which will allow us to customize them more and so I don't have to keep making changes to common-build.xml
          • Added full support for solr and top-level.
          Show
          Chris Male added a comment - Updated patch which includes: Moved listing of the rules to their own ruleset file, which will allow us to customize them more and so I don't have to keep making changes to common-build.xml Added full support for solr and top-level.

            People

            • Assignee:
              Unassigned
              Reporter:
              Chris Male
            • Votes:
              2 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:

                Development