I have two wide DataFrames that contain nested data structures. When I explode one of them, rows with an empty nested structure are not included in the result, because an outer explode is not supported. As a workaround, I create a similar DataFrame with null values and union the two together. See SPARK-13721 for more details on why I have to do this.
I was hoping that SPARK-16845 would address my issue, but it does not. I was asked by lwlin to open this JIRA.
I will attach a code snippet that can be pasted into spark-shell to reproduce the problem and the exception. This worked just fine in Spark 1.6.x.
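For reference, the union workaround described above can be sketched as follows. This is a minimal illustration, not the attached snippet: the column names (`id`, `values`) and the tiny in-memory dataset are invented for the example, and it assumes a `SparkSession` as provided by spark-shell.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder()
  .appName("outer-explode-workaround")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical narrow stand-in for the wide DataFrame; the second row
// has an empty nested structure and would be dropped by a plain explode.
val df = Seq(
  ("a", Seq(1, 2)),
  ("b", Seq.empty[Int])
).toDF("id", "values")

// explode() drops rows whose array is empty...
val exploded = df.select($"id", explode($"values").as("value"))

// ...so re-add those rows with a null value and union the two frames.
val missing = df.filter(size($"values") === 0)
  .select($"id", lit(null).cast("int").as("value"))

val result = exploded.union(missing)
result.show()
```

On a wide DataFrame with deeply nested columns, it is this union of two large generated projections that pushes the generated `SpecificUnsafeProjection.apply` method past Janino's 64 KB method-size limit, producing the exception below.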
org.apache.spark.SparkException: Job aborted due to stage failure: Task 35 in stage 5.0 failed 4 times, most recent failure: Lost task 35.3 in stage 5.0 (TID 812, somehost.mydomain.com, executor 8): java.util.concurrent.ExecutionException: java.lang.Exception: failed to compile: org.codehaus.janino.JaninoRuntimeException: Code of method "apply(Lorg/apache/spark/sql/catalyst/InternalRow;)Lorg/apache/spark/sql/catalyst/expressions/UnsafeRow;" of class "org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection" grows beyond 64 KB