Spark / SPARK-15463

Support for creating a dataframe from CSV in Dataset[String]

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 2.2.0
    • Component/s: SQL
    • Labels: None

    Description

      I currently use Databricks' spark-csv library, but some of its features don't work with Apache Spark 2.0.0-SNAPSHOT. I understand that, now that CSV support has been added directly to spark-sql, spark-csv won't be modified.
      I currently read CSV data that has been pre-processed and is held as an RDD[String].
      There is sqlContext.read.json(rdd: RDD[String]), but the other formats don't appear to support creating a DataFrame from an RDD[String].
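
      For readers landing here after the fix: the sketch below shows roughly how the requested behaviour can be used, assuming Spark 2.2.0 or later, where DataFrameReader.csv accepts a Dataset[String]. The object name CsvFromStrings, the helper parse, and the sample rows are hypothetical and only for illustration; the header and inferSchema options are ordinary CSV reader options, not something this issue mandates.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

// Hypothetical helper object, for illustration only.
object CsvFromStrings {

  // Parse CSV lines that are already in memory into a DataFrame.
  // Assumes Spark 2.2.0+, where DataFrameReader.csv(Dataset[String]) is available.
  def parse(spark: SparkSession, lines: Seq[String]): DataFrame = {
    import spark.implicits._ // provides the Encoder[String] needed by createDataset
    val ds: Dataset[String] = spark.createDataset(lines)
    spark.read
      .option("header", "true")      // treat the first line as column names
      .option("inferSchema", "true") // let Spark guess column types
      .csv(ds)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("csv-from-dataset")
      .master("local[*]")
      .getOrCreate()

    // Made-up sample data standing in for pre-processed CSV lines.
    val df = parse(spark, Seq("id,name", "1,alice", "2,bob"))
    df.printSchema()
    df.show()

    spark.stop()
  }
}
```

      If the pre-processed data is an RDD[String], as in the description, it can first be converted with spark.createDataset(rdd) or rdd.toDS() (with spark.implicits._ in scope) and then passed to csv in the same way.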

            People

              Assignee: Hyukjin Kwon (hyukjin.kwon)
              Reporter: PJ Fanning (pj.fanning)
              Votes: 0
              Watchers: 11

              Dates

                Created:
                Updated:
                Resolved: