Description
Code generation for very wide datasets can fail because the JVM constant pool limit is reached.
This can have several causes. One of them is that we currently split the definitions of the generated methods among several NestedClass instances, but all of these methods are still called from the main class. Since entries are added to the main class's constant pool for each method invocation, this limits how wide a dataset can be, and for very wide datasets it leads to:
org.codehaus.janino.JaninoRuntimeException: Constant pool for class org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificMutableProjection has grown past JVM limit of 0xFFFF
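To make the pattern concrete, here is a minimal hand-written sketch of the code shape described above. It is not actual Spark codegen output: the class, field, and method names (GeneratedClassSketch, NestedClass1, apply_0, and so on) are hypothetical and chosen only for illustration. The method bodies live in nested classes, but every call site sits in the outer class, so the outer class needs at least one method reference entry in its constant pool per distinct method it invokes.

```java
// Illustrative sketch only; all names here are assumptions, not real generated code.
class GeneratedClassSketch {
    private final int[] row = new int[4];
    private final NestedClass1 nested1 = new NestedClass1();
    private final NestedClass2 nested2 = new NestedClass2();

    // Method definitions are split across nested classes,
    // which keeps each class's own bytecode small.
    class NestedClass1 {
        void apply_0() { row[0] = 10; }
        void apply_1() { row[1] = 11; }
    }

    class NestedClass2 {
        void apply_2() { row[2] = 12; }
        void apply_3() { row[3] = 13; }
    }

    // All invocations happen here, in the outer class. Each distinct method
    // called adds entries to the outer class's constant pool, so the pool
    // grows with the number of generated methods (one per column in this sketch).
    int[] apply() {
        nested1.apply_0();
        nested1.apply_1();
        nested2.apply_2();
        nested2.apply_3();
        return row;
    }
}
```

With four columns this is harmless, but the constant pool count of a class file is a 16-bit value, so with enough generated methods called from one class the 0xFFFF limit quoted in the exception is exceeded.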
Issue Links
- is duplicated by: SPARK-22761 64KB JVM bytecode limit problem with GLM (Closed)
- relates to: SPARK-18016 Code Generation: Constant Pool Past Limit for Wide/Nested Dataset (Resolved)
- links to