  1. Apache Rat
  2. RAT-162

CDDL1License.matches slow with large inputs

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.10
    • Fix Version/s: 0.11
    • Component/s: None
    • Labels: None

    Description

      mvn apache-rat:check runs slowly with large files. I accidentally had a 100 MB log file in the tree, which took RAT over a minute to parse. The stack trace included:

      "main" prio=10 tid=0x00007f322800a000 nid=0x6730 runnable [0x00007f3230235000]
         java.lang.Thread.State: RUNNABLE
              at java.util.regex.Pattern$Curly.match0(Pattern.java:4166)
              at java.util.regex.Pattern$Curly.match(Pattern.java:4132)
              at java.util.regex.Pattern$Start.match(Pattern.java:3408)
              at java.util.regex.Matcher.search(Matcher.java:1199)
              at java.util.regex.Matcher.find(Matcher.java:592)
              at org.apache.rat.analysis.license.CDDL1License.matches(CDDL1License.java:65)
              at org.apache.rat.analysis.license.SimplePatternBasedLicense.match(SimplePatternBasedLicense.java:69)
      

      I attached a patch that works around this issue by caching the Patterns in CDDL1License.
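
The caching idea can be illustrated with a minimal sketch. This is not the attached patch: the class name, method names, and regular expressions below are hypothetical, chosen only to show Patterns being compiled once into static fields and reused for every input line, rather than being built again on each call.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; the real expressions live in CDDL1License.
final class CachedPatternSketch {
    // Compiled once when the class is loaded, then shared by all calls.
    private static final Pattern LINE_PREFIX = Pattern.compile("^\\s*(//|#|\\*)+\\s*");
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    private CachedPatternSketch() {
    }

    // Strips a leading comment marker and collapses runs of whitespace.
    static String normalize(String line) {
        String stripped = LINE_PREFIX.matcher(line).replaceFirst("");
        return WHITESPACE.matcher(stripped).replaceAll(" ").trim();
    }

    // Reports whether the normalized line contains the given license fragment.
    static boolean matches(String line, String licenseFragment) {
        return normalize(line).contains(licenseFragment);
    }
}
```

Pattern instances are immutable and safe to share between threads, so static final fields are a common place to hold them. Matcher instances are not thread-safe, which is why the sketch creates a new one per call.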

      Attachments

        1. RAT-162.patch
          2 kB
          Andrew Gaul

            People

              Assignee: Philipp Ottlinger (pottlinger)
              Reporter: Andrew Gaul (gaul)
              Votes: 0
              Watchers: 2
