  1. Spark
  2. SPARK-6116 DataFrame API improvement umbrella ticket (Spark 1.5)
  3. SPARK-6293

SQLContext.implicits should provide automatic conversion for RDD[Row]

    Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 1.3.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels:
      None
    • Target Version/s:
    • Sprint:
      Spark 1.5 doc/QA sprint

      Description

      When a DataFrame is converted to an RDD[Row], it should be easier to convert it back to a DataFrame via toDF. E.g.:

      val df: DataFrame = myRDD.toDF("col1", "col2")  // This works for types like RDD[scala.Tuple2[...]]
      val splits = df.rdd.randomSplit(...)
      val split0: RDD[Row] = splits(0)
      val df0 = split0.toDF("col1", "col2")  // This fails
      

      The failure happens because SQLContext.implicits does not provide an implicit conversion for RDD[Row]. (It does handle RDDs of Products, but Row does not implement Product.)
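
      Until such a conversion exists, one explicit route back to a DataFrame is to reuse the schema of the original DataFrame. The following is a minimal sketch, assuming the Spark 1.3 API (SQLContext.createDataFrame taking an RDD[Row] and a StructType). The sample data, the split weights, the seed, and the object name RowRddRoundTrip are illustrative assumptions and do not come from this ticket.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row, SQLContext}

// Hypothetical driver object, used only for illustration.
object RowRddRoundTrip {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("RowRddRoundTrip").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // An RDD of tuples (Products) converts implicitly via toDF.
    val df: DataFrame =
      sc.parallelize(Seq((1, "a"), (2, "b"), (3, "c"), (4, "d"))).toDF("col1", "col2")

    // Splitting goes through the underlying RDD[Row].
    // The weights and seed are assumed values chosen for this sketch.
    val splits: Array[RDD[Row]] = df.rdd.randomSplit(Array(0.5, 0.5), seed = 42L)
    val split0: RDD[Row] = splits(0)

    // split0.toDF("col1", "col2") does not compile: no implicit covers RDD[Row].
    // Passing the original schema explicitly rebuilds the DataFrame instead.
    val df0: DataFrame = sqlContext.createDataFrame(split0, df.schema)
    df0.printSchema()

    sc.stop()
  }
}
```

      Reusing df.schema keeps the column names and types of the source DataFrame, so the column names do not have to be repeated as they would be with toDF.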

            People

            • Assignee: Unassigned
            • Reporter: Joseph K. Bradley (josephkb)