  1. Spark
  2. SPARK-17131

Code generation fails when running SQL expressions against a wide dataset (thousands of columns)

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.0.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels:
      None

      Description

      When reading a CSV file that contains 1776 columns, Spark and Janino fail to generate code, with the message:

      Constant pool has grown past JVM limit of 0xFFFF
      

      A plain select over all columns works fine:

      import org.apache.spark.sql.functions.col

      // Alias every column, then select them all.
      val allCols = df.columns.map(c => col(c).as(c + "_alias"))
      val newDf = df.select(allCols: _*)
      newDf.show()
      

      But when I invoke the describe method:

      // describe takes column names (String*), so pass the aliased names
      newDf.describe(newDf.columns: _*)
      

      it fails with the following stack trace:

      	at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.org$apache$spark$sql$catalyst$expressions$codegen$CodeGenerator$$doCompile(CodeGenerator.scala:889)
      	at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:941)
      	at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:938)
      	at org.spark_project.guava.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
      	at org.spark_project.guava.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
      	... 30 more
      Caused by: org.codehaus.janino.JaninoRuntimeException: Constant pool has grown past JVM limit of 0xFFFF
      	at org.codehaus.janino.util.ClassFile.addToConstantPool(ClassFile.java:402)
      	at org.codehaus.janino.util.ClassFile.addConstantIntegerInfo(ClassFile.java:300)
      	at org.codehaus.janino.UnitCompiler.addConstantIntegerInfo(UnitCompiler.java:10307)
      	at org.codehaus.janino.UnitCompiler.pushConstant(UnitCompiler.java:8868)
      	at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:4346)
      	at org.codehaus.janino.UnitCompiler.access$7100(UnitCompiler.java:185)
      	at org.codehaus.janino.UnitCompiler$10.visitIntegerLiteral(UnitCompiler.java:3265)
      	at org.codehaus.janino.Java$IntegerLiteral.accept(Java.java:4321)
      	at org.codehaus.janino.UnitCompiler.compileGet(UnitCompiler.java:3290)
      	at org.codehaus.janino.UnitCompiler.fakeCompile(UnitCompiler.java:2605)
      	at org.codehaus.janino.UnitCompiler.compileGetValue(UnitCompiler.java:4362)
      	at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:3975)
      	at org.codehaus.janino.UnitCompiler.access$6900(UnitCompiler.java:185)
      	at org.codehaus.janino.UnitCompiler$10.visitMethodInvocation(UnitCompiler.java:3263)
      	at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:3974)
      	at org.codehaus.janino.UnitCompiler.compileGet(UnitCompiler.java:3290)
      	at org.codehaus.janino.UnitCompiler.compileGetValue(UnitCompiler.java:4368)
      	at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:2662)
      	at org.codehaus.janino.UnitCompiler.access$4400(UnitCompiler.java:185)
      	at org.codehaus.janino.UnitCompiler$7.visitMethodInvocation(UnitCompiler.java:2627)
      	at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:3974)
      	at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:2654)
      	at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:1643)
      ....
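
      One possible way to sidestep the limit is to call describe on smaller groups of columns, so that each generated class covers fewer expressions. The sketch below is only an illustration and has not been verified against the failing dataset. The object name `DescribeInBatches`, the helper `columnBatches`, and the default batch size of 100 are assumptions, not part of Spark.

```scala
import org.apache.spark.sql.DataFrame

object DescribeInBatches {
  // Hypothetical helper: split column names into groups of at most `batchSize`.
  def columnBatches(columns: Seq[String], batchSize: Int): List[Seq[String]] = {
    require(batchSize > 0, "batchSize must be positive")
    columns.grouped(batchSize).toList
  }

  // Run describe() once per group of columns instead of once for all of them.
  // Returns one summary DataFrame per group; the batch size of 100 is an
  // arbitrary assumption and may need tuning for a given dataset.
  def describeInBatches(df: DataFrame, batchSize: Int = 100): Seq[DataFrame] =
    columnBatches(df.columns.toSeq, batchSize)
      .map(batch => df.describe(batch: _*))
}
```

      With the `newDf` from the description, `DescribeInBatches.describeInBatches(newDf).foreach(_.show())` would print one summary table per group of columns. Whether this actually avoids the Janino error depends on how much code each group produces.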
      

              People

              • Assignee:
                Unassigned
              • Reporter:
                zyoma Iaroslav Zeigerman
              • Votes:
                1
              • Watchers:
                5