Details
- Type: Improvement
- Status: Resolved
- Priority: Minor
- Resolution: Incomplete
- Affects Version: 2.3.0
- Fix Version: None
Description
The cache in class org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator has a hard-coded maximumSize of 100. The current code is:
```scala
private val cache = CacheBuilder.newBuilder()
  .maximumSize(100)
  .build(
    new CacheLoader[CodeAndComment, (GeneratedClass, Int)]() {
      override def load(code: CodeAndComment): (GeneratedClass, Int) = {
        val startTime = System.nanoTime()
        val result = doCompile(code)
        val endTime = System.nanoTime()
        def timeMs: Double = (endTime - startTime).toDouble / 1000000
        CodegenMetrics.METRIC_SOURCE_CODE_SIZE.update(code.body.length)
        CodegenMetrics.METRIC_COMPILATION_TIME.update(timeMs.toLong)
        logInfo(s"Code generated in $timeMs ms")
        result
      }
    })
```
In some situations, for example a long-running application whose Spark tasks are unchanged, more than 100 distinct pieces of generated code may be compiled repeatedly; making the cache's maximumSize configurable would be a better idea.
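As a minimal sketch of the idea (not Spark code — Spark's real cache is Guava's `CacheBuilder`, and any configuration key name such as `codegen.cache.maxEntries` here is hypothetical), a size-bounded LRU cache whose capacity comes from configuration instead of a hard-coded constant could look like this:

```scala
import java.util.{LinkedHashMap => JLinkedHashMap, Map => JMap}

// Hypothetical illustration only: a thread-safe, size-bounded LRU cache.
// The bound is passed in (e.g. read from configuration) rather than
// hard-coded, which is the change this issue proposes for CodeGenerator.
class BoundedCache[K, V](maxEntries: Int, load: K => V) {
  // access-order LinkedHashMap evicts the least-recently-used entry
  // once the configured bound is exceeded
  private val underlying = new JLinkedHashMap[K, V](16, 0.75f, true) {
    override def removeEldestEntry(eldest: JMap.Entry[K, V]): Boolean =
      size() > maxEntries
  }

  def get(key: K): V = synchronized {
    Option(underlying.get(key)).getOrElse {
      val v = load(key)   // compile on miss, analogous to doCompile(code)
      underlying.put(key, v)
      v
    }
  }

  def size: Int = synchronized(underlying.size())
}

object Demo {
  def main(args: Array[String]): Unit = {
    // Read the bound from a hypothetical configuration key, defaulting to 100.
    val maxEntries = sys.props.getOrElse("codegen.cache.maxEntries", "100").toInt
    val cache = new BoundedCache[String, Int](maxEntries, _.length)
    (1 to 200).foreach(i => cache.get(s"code-$i"))
    println(s"cache size = ${cache.size}, bound = $maxEntries")
  }
}
```

A deployment compiling many distinct plans could then raise the bound via configuration instead of recompiling evicted code, at the cost of holding more generated classes in memory.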