Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Description
The Parrot parser is much slower than the antlr2 parser when parsing large source code, because of the error alternatives added for better error prompts [1]. To reduce the overall parsing time, we could parse all the source files in parallel.
Use the compiler option parallelParse or the JVM option groovy.parallel.parse to enable or disable the improvement. In Groovy 3 the improvement is disabled by default; in Groovy 4+ it will be enabled by default.
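As a minimal sketch of the two switches above: the optimizationOptions key is the one used by the measurement script in this issue, while the -D form of the JVM option is an assumption (that it is passed as a standard system property).

```groovy
import org.codehaus.groovy.control.CompilationUnit
import org.codehaus.groovy.control.CompilerConfiguration

// Per-compilation switch, using the same optimizationOptions key as the
// measurement script in this issue.
def config = new CompilerConfiguration(optimizationOptions: [parallelParse: true])
def unit = new CompilationUnit(config)

// JVM-wide switch (assumption: passed as a standard system property):
//   java -Dgroovy.parallel.parse=true ...
println "parallelParse = ${config.optimizationOptions.parallelParse}"
```

Setting the option on the CompilerConfiguration limits the change to one compilation unit, which is convenient when comparing sequential and parallel runs in the same process.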
P.S.
- Antlr2 can smartly report a missing right parenthesis without any error alternative, but antlr4 cannot.
- Parsing all the Groovy source code of nextflow [2] sequentially takes 64s on my machine. With parallelParse enabled, it takes just 43s, a reduction of about 33%. Here is the script used to measure the elapsed time:
import groovy.io.FileType
import org.codehaus.groovy.ast.ModuleNode
import org.codehaus.groovy.control.CompilationUnit
import org.codehaus.groovy.control.CompilerConfiguration
import org.codehaus.groovy.control.Phases

def parse(boolean parallelParse) {
    def sourceFileList = []
    new File('./src').eachFileRecurse (FileType.FILES) { file ->
        if (!file.name.endsWith('.groovy') && !file.name.endsWith('.gradle')) return
        sourceFileList << file
    }

    long elapsedTimeMillis
    new CompilationUnit(new CompilerConfiguration(optimizationOptions: [parallelParse: parallelParse])).tap {
        sourceFileList.each { f -> addSource f }
        def b = System.currentTimeMillis()
        compile Phases.CONVERSION
        def e = System.currentTimeMillis()
        elapsedTimeMillis = e - b
    }
    return elapsedTimeMillis
}

// def t = parse(false) // no parallel, costs 64s
def t = parse(true) // in parallel, costs 43s
println "# ${t / 1000}s elapsed"
[1] https://github.com/apache/groovy/blob/GROOVY_3_0_4/src/antlr/GroovyParser.g4#L1259-L1261
[2] https://github.com/nextflow-io/nextflow