Spark / SPARK-8443

GenerateMutableProjection Exceeds JVM Code Size Limits

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.0
    • Fix Version/s: 1.5.0
    • Component/s: SQL
    • Labels: None

    Description

      GenerateMutableProjection puts the code for all expression columns into a single apply function. When there are many columns, the bytecode of the apply function exceeds 64 KB, which is a hard JVM limit on the size of a single method and cannot be changed.

      This came up when we were aggregating about 100 columns with the codegen and unsafe features enabled.

      I wrote a unit test that reproduces this issue:
      https://github.com/saurfang/spark/blob/codegen_size_limit/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala

      This test currently fails at 2048 expressions. Master seems to be more tolerant of this than branch-1.4 because its generated code is more concise.
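      To put the 2048 figure in perspective, here is a rough back-of-the-envelope sketch. It assumes only the class-file rule that a method's code_length must be below 65536 bytes, and it ignores any fixed overhead in the method body. The class and method names are made up for illustration.

```java
public class CodeSizeBudget {
    // Class-file format: a method's code_length must be less than 65536 bytes.
    static final int MAX_METHOD_BYTES = 65535;

    // Average bytecode budget per expression when all expressions share one method.
    static int bytesPerExpression(int expressionCount) {
        return MAX_METHOD_BYTES / expressionCount;
    }

    public static void main(String[] args) {
        System.out.println(bytesPerExpression(2048)); // prints 31
        System.out.println(bytesPerExpression(100));  // prints 655
    }
}
```

      So at 2048 expressions a single apply method has only about 31 bytes of bytecode available per expression, while at 100 columns each expression would need to average more than about 655 bytes to hit the limit.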

      Although the code on master has changed since branch-1.4, I can reproduce the problem on master as well. For now, I have worked around the problem in branch-1.4 by wrapping each expression in a separate function and calling those functions sequentially in apply. The proper fix is probably to check the length of projectCode and break it up as necessary. (This actually seems easier on master, since the code there is built from strings rather than quasiquotes.)
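      A minimal sketch of the "break it up as necessary" idea, working on the generated Java source as plain strings. This is not the actual Spark implementation: the class name, the apply_N helper naming, the mutableRow variable, and the use of source length as a stand-in for bytecode size are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ProjectionSplitter {
    /** Groups per-expression code snippets so each group stays within maxChars of source. */
    static List<String> chunk(List<String> snippets, int maxChars) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String snippet : snippets) {
            // Start a new chunk when adding this snippet would exceed the threshold.
            // A single oversized snippet still gets a chunk of its own.
            if (current.length() > 0 && current.length() + snippet.length() > maxChars) {
                chunks.add(current.toString());
                current.setLength(0);
            }
            current.append(snippet).append('\n');
        }
        if (current.length() > 0) {
            chunks.add(current.toString());
        }
        return chunks;
    }

    /** Emits one helper method per chunk, plus an apply() that calls them in order. */
    static String generate(List<String> snippets, int maxChars) {
        List<String> chunks = chunk(snippets, maxChars);
        StringBuilder helpers = new StringBuilder();
        StringBuilder calls = new StringBuilder();
        for (int n = 0; n < chunks.size(); n++) {
            helpers.append("private void apply_").append(n).append("(Object i) {\n")
                   .append(chunks.get(n))
                   .append("}\n");
            calls.append("  apply_").append(n).append("(i);\n");
        }
        // mutableRow is a hypothetical field of the generated class.
        return helpers
            + "public Object apply(Object i) {\n"
            + calls
            + "  return mutableRow;\n"
            + "}\n";
    }
}
```

      Source length is only a proxy for bytecode size, so the threshold would need to be conservative. Any state shared between expressions would also have to live in fields of the generated class rather than in local variables, because each helper method has its own scope.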

      Let me know if anyone has additional thoughts on this. I'm happy to contribute a pull request.

      Attaching the stack trace produced by the unit test:

      [info] - code size limit *** FAILED *** (7 seconds, 103 milliseconds)
      [info]   com.google.common.util.concurrent.UncheckedExecutionException: org.codehaus.janino.JaninoRuntimeException: Code of method "(Ljava/lang/Object;)Ljava/lang/Object;" of class "SC$SpecificProjection" grows beyond 64 KB
      [info]   at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2263)
      [info]   at com.google.common.cache.LocalCache.get(LocalCache.java:4000)
      [info]   at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4004)
      [info]   at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
      [info]   at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:285)
      [info]   at org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2$$anonfun$apply$mcV$sp$2.apply(CodeGenerationSuite.scala:50)
      [info]   at org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2$$anonfun$apply$mcV$sp$2.apply(CodeGenerationSuite.scala:48)
      [info]   at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:144)
      [info]   at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:144)
      [info]   at scala.collection.immutable.Range.foreach(Range.scala:141)
      [info]   at scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:144)
      [info]   at scala.collection.AbstractTraversable.foldLeft(Traversable.scala:105)
      [info]   at org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2.apply$mcV$sp(CodeGenerationSuite.scala:47)
      [info]   at org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2.apply(CodeGenerationSuite.scala:47)
      [info]   at org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2.apply(CodeGenerationSuite.scala:47)
      [info]   at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
      [info]   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
      [info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
      [info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
      [info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
      [info]   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
      [info]   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:42)
      [info]   at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
      [info]   at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
      [info]   at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
      [info]   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
      [info]   at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
      [info]   at org.scalatest.FunSuite.runTest(FunSuite.scala:1555)
      [info]   at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
      [info]   at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
      [info]   at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413)
      [info]   at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401)
      [info]   at scala.collection.immutable.List.foreach(List.scala:318)
      [info]   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
      [info]   at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396)
      [info]   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483)
      [info]   at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208)
      [info]   at org.scalatest.FunSuite.runTests(FunSuite.scala:1555)
      [info]   at org.scalatest.Suite$class.run(Suite.scala:1424)
      [info]   at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555)
      [info]   at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212)
      [info]   at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212)
      [info]   at org.scalatest.SuperEngine.runImpl(Engine.scala:545)
      [info]   at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212)
      [info]   at org.scalatest.FunSuite.run(FunSuite.scala:1555)
      [info]   at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:462)
      [info]   at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:671)
      [info]   at sbt.ForkMain$Run$2.call(ForkMain.java:294)
      [info]   at sbt.ForkMain$Run$2.call(ForkMain.java:284)
      [info]   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      [info]   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      [info]   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      [info]   at java.lang.Thread.run(Thread.java:745)
      [info]   Cause: org.codehaus.janino.JaninoRuntimeException: Code of method "(Ljava/lang/Object;)Ljava/lang/Object;" of class "SC$SpecificProjection" grows beyond 64 KB
      [info]   at org.codehaus.janino.CodeContext.makeSpace(CodeContext.java:941)
      [info]   at org.codehaus.janino.CodeContext.write(CodeContext.java:874)
      [info]   at org.codehaus.janino.CodeContext.writeBranch(CodeContext.java:965)
      [info]   at org.codehaus.janino.UnitCompiler.writeBranch(UnitCompiler.java:10261)
      [info]   at org.codehaus.janino.UnitCompiler.compileBoolean2(UnitCompiler.java:2862)
      [info]   at org.codehaus.janino.UnitCompiler.access$4800(UnitCompiler.java:185)
      [info]   at org.codehaus.janino.UnitCompiler$8.visitAmbiguousName(UnitCompiler.java:2832)
      [info]   at org.codehaus.janino.Java$AmbiguousName.accept(Java.java:3138)
      [info]   at org.codehaus.janino.UnitCompiler.compileBoolean(UnitCompiler.java:2842)
      [info]   at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:1741)
      [info]   at org.codehaus.janino.UnitCompiler.access$1200(UnitCompiler.java:185)
      [info]   at org.codehaus.janino.UnitCompiler$4.visitIfStatement(UnitCompiler.java:937)
      [info]   at org.codehaus.janino.Java$IfStatement.accept(Java.java:2157)
      [info]   at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:958)
      [info]   at org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1007)
      [info]   at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:2293)
      [info]   at org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:822)
      [info]   at org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:794)
      [info]   at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:507)
      [info]   at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:658)
      [info]   at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:662)
      [info]   at org.codehaus.janino.UnitCompiler.access$600(UnitCompiler.java:185)
      [info]   at org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:350)
      [info]   at org.codehaus.janino.Java$MemberClassDeclaration.accept(Java.java:1035)
      [info]   at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:354)
      [info]   at org.codehaus.janino.UnitCompiler.compileDeclaredMemberTypes(UnitCompiler.java:769)
      [info]   at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:532)
      [info]   at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:393)
      [info]   at org.codehaus.janino.UnitCompiler.access$400(UnitCompiler.java:185)
      [info]   at org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:347)
      [info]   at org.codehaus.janino.Java$PackageMemberClassDeclaration.accept(Java.java:1139)
      [info]   at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:354)
      [info]   at org.codehaus.janino.UnitCompiler.compileUnit(UnitCompiler.java:322)
      [info]   at org.codehaus.janino.SimpleCompiler.compileToClassLoader(SimpleCompiler.java:383)
      [info]   at org.codehaus.janino.ClassBodyEvaluator.compileToClass(ClassBodyEvaluator.java:315)
      [info]   at org.codehaus.janino.ClassBodyEvaluator.cook(ClassBodyEvaluator.java:233)
      [info]   at org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:192)
      [info]   at org.codehaus.commons.compiler.Cookable.cook(Cookable.java:84)
      [info]   at org.codehaus.commons.compiler.Cookable.cook(Cookable.java:77)
      [info]   at org.codehaus.janino.ClassBodyEvaluator.<init>(ClassBodyEvaluator.java:72)
      [info]   at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.compile(CodeGenerator.scala:245)
      [info]   at org.apache.spark.sql.catalyst.expressions.codegen.GenerateMutableProjection$.create(GenerateMutableProjection.scala:87)
      [info]   at org.apache.spark.sql.catalyst.expressions.codegen.GenerateMutableProjection$.create(GenerateMutableProjection.scala:29)
      [info]   at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:272)
      [info]   at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
      [info]   at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
      [info]   at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
      [info]   at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2257)
      [info]   at com.google.common.cache.LocalCache.get(LocalCache.java:4000)
      [info]   at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4004)
      [info]   at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
      [info]   at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:285)
      [info]   at org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2$$anonfun$apply$mcV$sp$2.apply(CodeGenerationSuite.scala:50)
      [info]   at org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2$$anonfun$apply$mcV$sp$2.apply(CodeGenerationSuite.scala:48)
      [info]   at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:144)
      [info]   at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:144)
      [info]   at scala.collection.immutable.Range.foreach(Range.scala:141)
      [info]   at scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:144)
      [info]   at scala.collection.AbstractTraversable.foldLeft(Traversable.scala:105)
      [info]   at org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2.apply$mcV$sp(CodeGenerationSuite.scala:47)
      [info]   at org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2.apply(CodeGenerationSuite.scala:47)
      [info]   at org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2.apply(CodeGenerationSuite.scala:47)
      [info]   at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
      [info]   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
      [info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
      [info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
      [info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
      [info]   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
      [info]   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:42)
      [info]   at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
      [info]   at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
      [info]   at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
      [info]   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
      [info]   at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
      [info]   at org.scalatest.FunSuite.runTest(FunSuite.scala:1555)
      [info]   at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
      [info]   at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
      [info]   at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413)
      [info]   at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401)
      [info]   at scala.collection.immutable.List.foreach(List.scala:318)
      [info]   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
      [info]   at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396)
      [info]   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483)
      [info]   at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208)
      [info]   at org.scalatest.FunSuite.runTests(FunSuite.scala:1555)
      [info]   at org.scalatest.Suite$class.run(Suite.scala:1424)
      [info]   at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555)
      [info]   at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212)
      [info]   at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212)
      [info]   at org.scalatest.SuperEngine.runImpl(Engine.scala:545)
      [info]   at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212)
      [info]   at org.scalatest.FunSuite.run(FunSuite.scala:1555)
      [info]   at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:462)
      [info]   at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:671)
      [info]   at sbt.ForkMain$Run$2.call(ForkMain.java:294)
      [info]   at sbt.ForkMain$Run$2.call(ForkMain.java:284)
      [info]   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      [info]   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      [info]   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      [info]   at java.lang.Thread.run(Thread.java:745)
      

            People

              Assignee: Sen Fang (saurfang)
              Reporter: Sen Fang (saurfang)