  1. Spark
  2. SPARK-23563

Make the size of the cache in CodeGenerator configurable


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Incomplete
    • Affects Version/s: 2.3.0
    • Fix Version/s: None
    • Component/s: SQL

    Description

      The cache in class org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator has a hard-coded maximumSize of 100. The current code is:

       

      // scala
      private val cache = CacheBuilder.newBuilder()
        .maximumSize(100)
        .build(
          new CacheLoader[CodeAndComment, (GeneratedClass, Int)]() {
            override def load(code: CodeAndComment): (GeneratedClass, Int) = {
              val startTime = System.nanoTime()
              val result = doCompile(code)
              val endTime = System.nanoTime()
              def timeMs: Double = (endTime - startTime).toDouble / 1000000
              CodegenMetrics.METRIC_SOURCE_CODE_SIZE.update(code.body.length)
              CodegenMetrics.METRIC_COMPILATION_TIME.update(timeMs.toLong)
              logInfo(s"Code generated in $timeMs ms")
              result
            }
          })
      

       In some situations, for example a long-running application whose Spark tasks do not change, it would be better to make the cache's maximumSize configurable instead of fixing it at 100.
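       One way the proposal could look is sketched below. This is only an illustration under stated assumptions, not the change made in Spark: the property name `example.codegen.cache.maxEntries`, the object `ConfigurableCacheSketch`, and the string-based loader are all hypothetical stand-ins, and the limit is read from a plain `Map` rather than from Spark's configuration. Only the Guava `CacheBuilder` and `CacheLoader` calls are taken from the code quoted above.

```scala
import com.google.common.cache.{CacheBuilder, CacheLoader, LoadingCache}
import scala.util.Try

object ConfigurableCacheSketch {
  // Hypothetical property name used for illustration only.
  val MaxEntriesKey = "example.codegen.cache.maxEntries"

  // Fall back to the value that is hard-coded today.
  val DefaultMaxEntries = 100L

  // Read the limit from the given properties; ignore missing,
  // unparsable, or negative values and use the default instead.
  def maxEntries(props: Map[String, String]): Long =
    props.get(MaxEntriesKey)
      .flatMap(v => Try(v.trim.toLong).toOption)
      .filter(_ >= 0)
      .getOrElse(DefaultMaxEntries)

  // Stand-in for the real cache: keys are source strings and the
  // "compiled" value is just a tagged copy of the key.
  def buildCache(props: Map[String, String]): LoadingCache[String, String] =
    CacheBuilder.newBuilder()
      .maximumSize(maxEntries(props))
      .build(new CacheLoader[String, String]() {
        override def load(code: String): String = s"compiled:$code"
      })
}
```

       With this shape, a long-running application could raise the limit by setting the property once at startup, while applications that set nothing keep the current behaviour of 100 entries.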

       


          People

            Assignee: Unassigned
            Reporter: kejiqing (passionke)
            Votes: 0
            Watchers: 3
