  1. Spark
  2. SPARK-15473

CSV fails to write and read back empty dataframe

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 2.0.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      Currently, the CSV data source fails to read back an empty DataFrame that it has written.

      The code below:

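      // `path` is assumed to be a java.io.File pointing at a not-yet-existing temporary location.
      // Build a DataFrame that has a schema but no rows.
      val emptyDf = spark.range(10).filter(_ => false)
      emptyDf.write
        .format("csv")
        .save(path.getCanonicalPath)
      
      // Read the (empty) output back; this is where the exception is thrown.
      val copyEmptyDf = spark.read
        .format("csv")
        .load(path.getCanonicalPath)
      
      copyEmptyDf.show()
      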

      throws the following exception:

      Can not create a Path from an empty string
      java.lang.IllegalArgumentException: Can not create a Path from an empty string
      	at org.apache.hadoop.fs.Path.checkPathArg(Path.java:127)
      	at org.apache.hadoop.fs.Path.<init>(Path.java:135)
      	at org.apache.hadoop.util.StringUtils.stringToPath(StringUtils.java:241)
      	at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:362)
      	at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:987)
      	at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:987)
      	at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:178)
      	at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:178)
      	at scala.Option.map(Option.scala:146)
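
      The trace shows an empty string reaching Hadoop's Path constructor through FileInputFormat.setInputPaths. One plausible explanation, stated here as an assumption rather than a confirmed diagnosis, is that no input files are left for the empty output, so the comma-joined list of input paths is itself empty. The plain-Scala sketch below models only that mechanism. `checkPathArg`, `joinInputPaths` and `toPaths` are hypothetical stand-ins written for illustration; they are not Spark or Hadoop code.

```scala
object EmptyPathSketch {
  // Hypothetical stand-in for the argument check in org.apache.hadoop.fs.Path;
  // it only mimics the error message seen in the stack trace.
  def checkPathArg(path: String): String = {
    if (path == null)
      throw new IllegalArgumentException("Can not create a Path from a null string")
    if (path.isEmpty)
      throw new IllegalArgumentException("Can not create a Path from an empty string")
    path
  }

  // Illustrative helper: join input file paths into one comma-separated string.
  def joinInputPaths(files: Seq[String]): String = files.mkString(",")

  // Split the joined string back into entries and validate each one.
  // An empty joined string yields a single empty entry, which fails the check.
  def toPaths(joined: String): Seq[String] =
    joined.split(",", -1).toSeq.map(checkPathArg)
}
```

      Under these assumptions, EmptyPathSketch.toPaths(EmptyPathSketch.joinInputPaths(Seq.empty)) throws an IllegalArgumentException with the same message as in the trace, while a non-empty file list passes through unchanged.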
      

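      Note that this is a different case from the one with the data below:

      // `Row` is org.apache.spark.sql.Row; `schema` is assumed to be any StructType defined by the caller.
      val emptyDf = spark.createDataFrame(spark.sparkContext.emptyRDD[Row], schema)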
      

      In this case, no writer is initialised or created (there are no calls to WriterContainer.writeRows()).

      Perhaps the CSV data source should be able to write and read a header for the schema even when the data is empty.

      This works for Parquet and JSON, but not for CSV.
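
      The comparison across formats can be checked with a small round-trip program. This is a minimal sketch, assuming a local SparkSession and a fresh temporary directory; the object name `EmptyRoundTrip` and the helper `roundTrip` are illustrative and not part of the original report. It prints the outcome per format instead of asserting one, since the result depends on the Spark version.

```scala
import java.io.File
import java.nio.file.Files
import scala.util.Try
import org.apache.spark.sql.SparkSession

object EmptyRoundTrip {
  // Write an empty Dataset in the given format, read it back, and return the row count.
  // Any exception from the write or the read is captured in the Try.
  def roundTrip(spark: SparkSession, format: String, dir: File): Try[Long] = Try {
    val target = new File(dir, format).getCanonicalPath
    val emptyDf = spark.range(10).filter(_ => false) // same empty data as in the report
    emptyDf.write.format(format).save(target)
    spark.read.format(format).load(target).count()
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local single-threaded session is enough for this check.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("empty-round-trip")
      .getOrCreate()
    val dir = Files.createTempDirectory("empty-round-trip").toFile
    try {
      Seq("csv", "json", "parquet").foreach { format =>
        println(s"$format: ${roundTrip(spark, format, dir)}")
      }
    } finally {
      spark.stop()
    }
  }
}
```

      Wrapping each format in a Try keeps one failing format from hiding the results of the others.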

            People

              Assignee: Unassigned
              Reporter: Hyukjin Kwon (gurwls223)
              Votes: 0
              Watchers: 3