Description
It looks like
def createDataFrame(rowRDD: JavaRDD[Row], columns: java.util.List[String]): DataFrame = {
createDataFrame(rowRDD.rdd, columns.toSeq)
}
is in fact an infinite recursion: the call is meant to dispatch to a different overload, but Scala implicit conversions convert the arguments back into a JavaRDD and a java.util.List, so overload resolution selects the same method again and it calls itself forever.
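The mechanism can be reproduced without Spark. A minimal sketch (class and method names here are hypothetical, not from Spark): a single overload takes a wrapper type, an implicit conversion re-wraps the unwrapped value, and the "delegating" call silently resolves back to the same method.

```scala
import scala.language.implicitConversions

object ImplicitRecursionDemo {
  // A Java-friendly wrapper type, standing in for JavaRDD / java.util.List.
  class Wrapper(val xs: List[Int])

  // Implicit conversion back into the wrapper, standing in for the
  // JavaConversions wrappers that re-wrap RDD and Seq in the Spark bug.
  implicit def listToWrapper(xs: List[Int]): Wrapper = new Wrapper(xs)

  // The author intends to delegate to an overload taking List[Int], but no
  // such overload exists: the implicit re-wraps w.xs and this method calls
  // itself. It compiles cleanly and overflows the stack at runtime.
  def process(w: Wrapper): Int = process(w.xs)

  def main(args: Array[String]): Unit = process(new Wrapper(List(1, 2, 3)))
}
```

The compiler accepts this because inserting `listToWrapper` makes the call type-check; nothing warns that the converted call is the enclosing method itself, which is exactly why the createDataFrame bug went unnoticed until runtime.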
15/04/19 16:51:24 INFO BlockManagerMaster: Trying to register BlockManager
15/04/19 16:51:24 INFO BlockManagerMasterActor: Registering block manager localhost:53711 with 1966.1 MB RAM, BlockManagerId(<driver>, localhost, 53711)
15/04/19 16:51:24 INFO BlockManagerMaster: Registered BlockManager
Exception in thread "main" java.lang.StackOverflowError
	at scala.collection.mutable.AbstractSeq.<init>(Seq.scala:47)
	at scala.collection.mutable.AbstractBuffer.<init>(Buffer.scala:48)
	at scala.collection.convert.Wrappers$JListWrapper.<init>(Wrappers.scala:84)
	at scala.collection.convert.WrapAsScala$class.asScalaBuffer(WrapAsScala.scala:127)
	at scala.collection.JavaConversions$.asScalaBuffer(JavaConversions.scala:53)
	at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:408)
	at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:408)
	at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:408)
	at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:408)
Here is the code sample I used to reproduce the issue:
import java.util.List;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SQLContext;

import com.google.common.collect.ImmutableList;
import com.google.common.collect.Lists;

/**
 * @author juang
 */
public final class InfiniteRecursionExample {

    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext("local", "infinite_recursion_example");
        List<Row> rows = Lists.newArrayList();
        JavaRDD<Row> rowRDD = sc.parallelize(rows);
        SQLContext sqlContext = new SQLContext(sc);
        sqlContext.createDataFrame(rowRDD, ImmutableList.of("myCol"));
    }
}