[SPARK-17131] Code generation fails when running SQL expressions against a wide dataset (thousands of columns) - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Duplicate
Affects Version/s: 2.0.0
Fix Version/s: None
Component/s: SQL
Labels: None

Description

When reading a CSV file that contains 1776 columns, Spark and Janino fail to generate the code, with the message:

Constant pool has grown past JVM limit of 0xFFFF

Running a plain select over all columns works fine:

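      import org.apache.spark.sql.functions.col

      // Alias every column; a plain projection over all of them succeeds.
      val allCols = df.columns.map(c => col(c).as(c + "_alias"))
      val newDf = df.select(allCols: _*)
      newDf.show()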

But when I invoke the describe method:

// describe takes column names (String*), so pass the aliased names.
newDf.describe(newDf.columns: _*)

it fails with the following stack trace:

	at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.org$apache$spark$sql$catalyst$expressions$codegen$CodeGenerator$$doCompile(CodeGenerator.scala:889)
	at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:941)
	at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:938)
	at org.spark_project.guava.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
	at org.spark_project.guava.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
	... 30 more
Caused by: org.codehaus.janino.JaninoRuntimeException: Constant pool has grown past JVM limit of 0xFFFF
	at org.codehaus.janino.util.ClassFile.addToConstantPool(ClassFile.java:402)
	at org.codehaus.janino.util.ClassFile.addConstantIntegerInfo(ClassFile.java:300)
	at org.codehaus.janino.UnitCompiler.addConstantIntegerInfo(UnitCompiler.java:10307)
	at org.codehaus.janino.UnitCompiler.pushConstant(UnitCompiler.java:8868)
	at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:4346)
	at org.codehaus.janino.UnitCompiler.access$7100(UnitCompiler.java:185)
	at org.codehaus.janino.UnitCompiler$10.visitIntegerLiteral(UnitCompiler.java:3265)
	at org.codehaus.janino.Java$IntegerLiteral.accept(Java.java:4321)
	at org.codehaus.janino.UnitCompiler.compileGet(UnitCompiler.java:3290)
	at org.codehaus.janino.UnitCompiler.fakeCompile(UnitCompiler.java:2605)
	at org.codehaus.janino.UnitCompiler.compileGetValue(UnitCompiler.java:4362)
	at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:3975)
	at org.codehaus.janino.UnitCompiler.access$6900(UnitCompiler.java:185)
	at org.codehaus.janino.UnitCompiler$10.visitMethodInvocation(UnitCompiler.java:3263)
	at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:3974)
	at org.codehaus.janino.UnitCompiler.compileGet(UnitCompiler.java:3290)
	at org.codehaus.janino.UnitCompiler.compileGetValue(UnitCompiler.java:4368)
	at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:2662)
	at org.codehaus.janino.UnitCompiler.access$4400(UnitCompiler.java:185)
	at org.codehaus.janino.UnitCompiler$7.visitMethodInvocation(UnitCompiler.java:2627)
	at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:3974)
	at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:2654)
	at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:1643)
....
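
For reference, the 0xFFFF in the message matches the class file format: `constant_pool_count` is an unsigned 16-bit field, so its value for one generated class cannot exceed 65535. A minimal sketch of the resulting budget, assuming (purely for illustration) that each of the 1776 columns contributes the same number of entries; `ConstantPoolBudget` and `entriesPerColumn` are hypothetical names, not Spark or Janino APIs:

```scala
object ConstantPoolBudget {
  // The class file's constant_pool_count field is an unsigned 16-bit value.
  val MaxConstantPoolCount: Int = 0xFFFF

  // Rough per-column budget, assuming (for illustration only) that every
  // column contributes the same number of constant pool entries.
  def entriesPerColumn(numColumns: Int): Int = {
    require(numColumns > 0, "numColumns must be positive")
    MaxConstantPoolCount / numColumns
  }

  def main(args: Array[String]): Unit = {
    val columns = 1776 // column count from the report above
    println(s"limit = $MaxConstantPoolCount")
    println(s"budget per column = ${entriesPerColumn(columns)}")
  }
}
```

Since describe computes several aggregates (count, mean, stddev, min, max) for every column, it generates more expressions per column than a plain select, which may be why only describe crosses the limit here; the report does not confirm that.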

Attachments

_SPARK_17131__add_a_test_case_with_1000_column_DF_where_describe___fails.patch
06/Oct/16 19:12
2 kB
Andrey Melentyev

Issue Links

duplicates

SPARK-16845 org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB

Resolved

is related to

SPARK-18016 Code Generation: Constant Pool Past Limit for Wide/Nested Dataset

Resolved

Activity

Aleksander Eskilson added a comment - 20/Oct/16 14:43 - edited

Yeah, that makes sense. So far, the issue I documented and this one seem to be the only JIRAs that exhibit specifically the Constant Pool limit error. I'm trying to dig deeper to see whether it really marks its own class of error, but given that SPARK-17702 didn't resolve the error case I posted (even though it splits up sections of large generated code), I suspect they are closely related but ultimately different issues. I think the splitExpressions technique that was used in SPARK-17702, and that also appears to be employed in SPARK-16845, could be useful for the range of different classes that can generate too many lines of code. Seeing the issues linked together is definitely useful.

To that end, I'll leave mine resolved as a duplicate of SPARK-16845 for now, until I can make use of the patch it develops, so we can see more conclusively whether they're related issues or truly duplicates. And I'll link the two "0xFFFF" issues together as related.
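
The general idea behind splitting generated code can be sketched as follows. This is not Spark's splitExpressions implementation; `SplitSketch.split`, the `apply_N` method names, and the `InternalRow i` parameter are assumptions made for illustration. The sketch only groups generated statements into helper methods of bounded size and emits the calls that run them in order:

```scala
object SplitSketch {
  // Hypothetical illustration, not Spark's splitExpressions: groups generated
  // Java statements into helper methods holding at most `maxPerMethod`
  // statements each. Returns the helper method sources and the body that
  // calls the helpers in their original order.
  def split(statements: Seq[String], maxPerMethod: Int): (Seq[String], String) = {
    require(maxPerMethod > 0, "maxPerMethod must be positive")
    val helpers = statements.grouped(maxPerMethod).zipWithIndex.map {
      case (chunk, idx) =>
        val body = chunk.map(s => "  " + s).mkString("\n")
        s"private void apply_${idx}(InternalRow i) {\n$body\n}"
    }.toVector
    val calls = helpers.indices.map(idx => s"apply_${idx}(i);").mkString("\n")
    (helpers, calls)
  }
}
```

Splitting this way bounds the size of each method, which targets the 64 KB per-method limit. All helpers still belong to the same class and share its single constant pool, so this alone would not necessarily help with the 0xFFFF error.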

Sean R. Owen added a comment - 20/Oct/16 14:26

OK, well, I think it's fine to leave one copy of the "0xFFFF" issue open if you have any reasonable grounds to suspect it's different, and just link the JIRAs. I suppose I was mostly saying this could just be reopened, and separately, there are a lot of real duplicates of similar issues out there too, making it hard to figure out what the underlying unique issues are.

Aleksander Eskilson added a comment - 20/Oct/16 13:24

Sure, I apologize for that. I'll also mark it as a duplicate of SPARK-16845 and monitor its pull request to see if it resolves the issue I opened.

Sean R. Owen added a comment - 20/Oct/16 07:24

It may or may not be, though again I suspect a common cause with one of several JIRAs. The point here is to join potentially related discussion without conflating issues. I don't think it's useful to just make another JIRA instead of reopening this one, but this seems to be a losing battle.

Aleksander Eskilson added a comment - 19/Oct/16 22:12

sowen, melentye
I'm not so certain this error is the same as SPARK-16845. There seem to have been several classes of errors, all related to the size of individual methods growing beyond the 64 KB limit (SPARK-16845, SPARK-17702). I think this one is a different class of error, or at least

Constant pool has grown past JVM limit of 0xFFFF

marks a different class of error. I was able to produce an error similar to the one first documented here when trying to encode a Java object with a very wide and deeply nested schema. I've gone ahead and created a bug report for that, SPARK-18016, and in its description I've attached a small project that can reproduce the error.

Andrey Melentyev added a comment - 06/Oct/16 20:31

Looks similar to https://issues.apache.org/jira/browse/SPARK-17217 btw

Andrey Melentyev added a comment - 06/Oct/16 19:50

I tried wrapping the attached test in "withSQLConf(SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "false")". It still fails in a nasty way, printing the contents of the 300K LOC generated class in a seemingly endless loop. Running the code from spark-shell with --conf spark.sql.codegen.wholeStage=false fails as well.

Sean R. Owen added a comment - 06/Oct/16 19:35

Yeah, I'm not 100% sure, though I strongly suspect a common cause. If it ends up being different we can reopen this. I thought it might be more productive to tie them together until it's clear they're not the same, but I don't mind much either way, whatever is most helpful.

Can you try disabling whole stage codegen to see if that works around it?

Andrey Melentyev added a comment - 06/Oct/16 19:30 - edited

srowen, are you sure it's a dup of SPARK-16845? The exceptions are a bit different; this one has

Caused by: org.codehaus.janino.JaninoRuntimeException: Constant pool for class org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificMutableProjection has grown past JVM limit of 0xFFFF

while SPARK-16845 says

Caused by: org.codehaus.janino.JaninoRuntimeException: Code of method "(Lorg/apache/spark/sql/catalyst/InternalRow;Lorg/apache/spark/sql/catalyst/InternalRow;)I" of class "org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB

Both are about something growing too large in the code of a generated class, though.
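
The difference between the two messages can be made explicit with a small classifier. A minimal sketch, assuming the substrings quoted above are stable across Janino versions (not verified here); `CodegenLimitErrors` and `classify` are hypothetical names, not part of Spark or Janino:

```scala
object CodegenLimitErrors {
  sealed trait LimitKind
  case object ConstantPoolLimit extends LimitKind // per-class limit (0xFFFF)
  case object MethodSizeLimit extends LimitKind   // per-method limit (64 KB)
  case object Unknown extends LimitKind

  // Hypothetical helper: classifies a Janino error message by the substrings
  // seen in the two exceptions quoted in this thread.
  def classify(message: String): LimitKind =
    if (message.contains("has grown past JVM limit of 0xFFFF")) ConstantPoolLimit
    else if (message.contains("grows beyond 64 KB")) MethodSizeLimit
    else Unknown
}
```

The first kind is a limit on a whole class and the second on a single method, which is the distinction being asked about.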

Sean R. Owen added a comment - 06/Oct/16 19:25

Thanks melentye, let's roll this into the existing JIRA.

Andrey Melentyev added a comment - 06/Oct/16 19:12

Patch for org.apache.spark.sql.DataFrameSuite with a test case reproducing the problem

Aris Vlasakakis added a comment - 22/Sep/16 18:25

Hi there,

I discovered a bug, and it also pertains to code generation with many columns – although in my case the bugs within Janino code generation in Catalyst start after several hundred columns. Are these somehow related?

My bug report was merged into this one: https://issues.apache.org/jira/browse/SPARK-16845

Iaroslav Zeigerman added a comment - 18/Aug/16 16:56

I get a different exception when applying the mean function to all columns:

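import org.apache.spark.sql.functions.mean

// One mean aggregate per column, all evaluated in a single select.
val allCols = df.columns.map(c => mean(c))
val newDf = df.select(allCols: _*)
newDf.show()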

java.io.EOFException
	at java.io.DataInputStream.readFully(DataInputStream.java:197)
	at java.io.DataInputStream.readFully(DataInputStream.java:169)
	at org.codehaus.janino.util.ClassFile.loadAttribute(ClassFile.java:1383)
	at org.codehaus.janino.util.ClassFile.loadAttributes(ClassFile.java:555)
	at org.codehaus.janino.util.ClassFile.loadFields(ClassFile.java:518)
	at org.codehaus.janino.util.ClassFile.<init>(ClassFile.java:185)
	at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anonfun$recordCompilationStats$1.apply(CodeGenerator.scala:914)
	at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anonfun$recordCompilationStats$1.apply(CodeGenerator.scala:912)
	at scala.collection.Iterator$class.foreach(Iterator.scala:742)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
	at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
	at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.recordCompilationStats(CodeGenerator.scala:912)
	at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.org$apache$spark$sql$catalyst$expressions$codegen$CodeGenerator$$doCompile(CodeGenerator.scala:884)
	at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:941)
	at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:938)
	at org.spark_project.guava.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
...

People

Assignee: Unassigned

Reporter: Iaroslav Zeigerman

Votes: 1

Watchers: 5

Dates

Created: 18/Aug/16 15:04

Updated: 20/Oct/16 14:45

Resolved: 06/Oct/16 19:25