Description
It seems that Spark uses Janino (org.codehaus) to compile generated, optimized code, and falls back to an alternative execution path if compilation fails. Here is the log output I see when executing one command on DataFrames:
17/05/02 12:00:14 ERROR CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 93, Column 13: ')' expected instead of 'type'
...
/* 088 */ private double loadFactor = 0.5;
/* 089 */ private int numBuckets = (int) (capacity / loadFactor);
/* 090 */ private int maxSteps = 2;
/* 091 */ private int numRows = 0;
/* 092 */ private org.apache.spark.sql.types.StructType keySchema = new org.apache.spark.sql.types.StructType().add("taxonomyPayload", org.apache.spark.sql.types.DataTypes.StringType)
/* 093 */ .add("
", org.apache.spark.sql.types.DataTypes.StringType)
/* 094 */ .add("spatialPayload", org.apache.spark.sql.types.DataTypes.StringType);
/* 095 */ private org.apache.spark.sql.types.StructType valueSchema = new org.apache.spark.sql.types.StructType().add("sum", org.apache.spark.sql.types.DataTypes.DoubleType);
/* 096 */ private Object emptyVBase;
/* 097 */ private long emptyVOff;
/* 098 */ private int emptyVLen;
/* 099 */ private boolean isBatchFull = false;
/* 100 */
It looks like the string literal on line 93 of the generated code (a string that happened to be in my code) was not escaped. I'm not sure how critical this is, but escaping seems to be missing somewhere in the code generator.
Stack trace that happens afterwards: https://pastebin.com/NmgTfwN0
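To illustrate the kind of escaping that appears to be missing, here is a minimal sketch of how a field name could be escaped before being spliced into generated Java source. The class `LiteralEscaper` and its methods `escape` and `addField` are hypothetical names made up for this example; this is not Spark's code and not necessarily how the problem was fixed upstream.

```java
// Hypothetical helper for illustration only; not part of Spark.
final class LiteralEscaper {
    private LiteralEscaper() {}

    /** Escapes a string so it can sit between double quotes in Java source. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break;   // backslash -> \\
                case '"':  sb.append("\\\""); break;   // quote     -> \"
                case '\n': sb.append("\\n");  break;   // newline   -> \n
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20 || c == 0x7f) {
                        // Other control characters: three-digit octal escape.
                        sb.append(String.format("\\%03o", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Builds an .add(...) fragment like the ones in the generated code above. */
    static String addField(String fieldName) {
        return ".add(\"" + escape(fieldName)
                + "\", org.apache.spark.sql.types.DataTypes.StringType)";
    }

    public static void main(String[] args) {
        // A field name containing a quote and a newline stays on one source line.
        System.out.println(addField("bad\"type\nname"));
    }
}
```

With escaping like this, a field name containing a quote or a line break would produce a single, well-formed string literal instead of terminating the literal early and breaking compilation.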
Issue Links
- duplicates SPARK-18952 regex strings not properly escaped in codegen for aggregations (Resolved)