SPARK-6999: infinite recursion with createDataFrame(JavaRDD[Row], java.util.List[String])


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.3.0
    • Fix Version/s: 1.4.0
    • Component/s: SQL
    • Labels: None

      Description

      It looks like

        def createDataFrame(rowRDD: JavaRDD[Row], columns: java.util.List[String]): DataFrame = {
          createDataFrame(rowRDD.rdd, columns.toSeq)
        }
      

      is in fact an infinite recursion: the method ends up calling itself, because Scala implicit conversions convert the arguments back into a JavaRDD and a java.util.List, so overload resolution selects the same method again.

      15/04/19 16:51:24 INFO BlockManagerMaster: Trying to register BlockManager
      15/04/19 16:51:24 INFO BlockManagerMasterActor: Registering block manager localhost:53711 with 1966.1 MB RAM, BlockManagerId(<driver>, localhost, 53711)
      15/04/19 16:51:24 INFO BlockManagerMaster: Registered BlockManager
      Exception in thread "main" java.lang.StackOverflowError
          at scala.collection.mutable.AbstractSeq.<init>(Seq.scala:47)
          at scala.collection.mutable.AbstractBuffer.<init>(Buffer.scala:48)
          at scala.collection.convert.Wrappers$JListWrapper.<init>(Wrappers.scala:84)
          at scala.collection.convert.WrapAsScala$class.asScalaBuffer(WrapAsScala.scala:127)
          at scala.collection.JavaConversions$.asScalaBuffer(JavaConversions.scala:53)
          at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:408)
          at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:408)
          at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:408)
          at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:408)
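
      The repeated frames at SQLContext.scala:408, together with the JListWrapper/asScalaBuffer calls above them, are consistent with the overload resolving back to itself. For illustration only, here is a minimal standalone Scala sketch of the same pattern (the object and method names below are invented, this is not SQLContext's actual code): with scala.collection.JavaConversions in scope, the delegating call's arguments are converted back to the method's own parameter types, so the method keeps selecting itself until the stack overflows.

        import scala.collection.JavaConversions._

        object SelfRecursionSketch {
          // Intended to delegate to a Seq-based overload, but no such overload exists.
          // The Seq produced by columns.toSeq is implicitly converted back to a
          // java.util.List (seqAsJavaList), so this very method is selected again.
          def describe(columns: java.util.List[String]): String = {
            describe(columns.toSeq)
          }

          def main(args: Array[String]): Unit = {
            describe(java.util.Arrays.asList("myCol")) // throws java.lang.StackOverflowError
          }
        }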
      

      Here is the code sample I used to reproduce the issue:

      import java.util.List;

      import org.apache.spark.api.java.JavaRDD;
      import org.apache.spark.api.java.JavaSparkContext;
      import org.apache.spark.sql.Row;
      import org.apache.spark.sql.SQLContext;

      import com.google.common.collect.ImmutableList;
      import com.google.common.collect.Lists;

      /**
       * @author juang
       */
      public final class InfiniteRecursionExample {

          public static void main(String[] args) {
              JavaSparkContext sc = new JavaSparkContext("local", "infinite_recursion_example");
              List<Row> rows = Lists.newArrayList();
              JavaRDD<Row> rowRDD = sc.parallelize(rows);

              SQLContext sqlContext = new SQLContext(sc);
              // Fails with java.lang.StackOverflowError: this overload ends up calling itself.
              sqlContext.createDataFrame(rowRDD, ImmutableList.of("myCol"));
          }

      }
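
      One way to sidestep the failing overload is to pass an explicit StructType schema, which should resolve to a different createDataFrame method rather than the column-name variant shown above. A minimal Scala sketch, assuming the Spark 1.3 schema API (StructType, StructField, StringType); the object name ExplicitSchemaExample is invented for the example:

        import org.apache.spark.{SparkConf, SparkContext}
        import org.apache.spark.sql.{Row, SQLContext}
        import org.apache.spark.sql.types.{StringType, StructField, StructType}

        object ExplicitSchemaExample {
          def main(args: Array[String]): Unit = {
            val sc = new SparkContext(new SparkConf().setMaster("local").setAppName("explicit_schema_example"))
            val sqlContext = new SQLContext(sc)

            val rows = sc.parallelize(Seq.empty[Row])
            // An explicit schema selects the (RDD[Row], StructType) overload instead of
            // the (JavaRDD[Row], java.util.List[String]) variant that overflows the stack.
            val schema = StructType(Seq(StructField("myCol", StringType, nullable = true)))
            val df = sqlContext.createDataFrame(rows, schema)
            df.printSchema()
          }
        }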
      

    People

    • Assignee: Cheng Hao (chenghao)
    • Reporter: Justin Uang (justin.uang)
    • Votes: 0
    • Watchers: 8
