Spark / SPARK-42237

change binary to unsupported dataType in csv format

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.4.8, 3.3.1
    • Fix Version/s: 3.4.0
    • Component/s: SQL
    • Labels: None

    Description

      When a binary column is written to CSV files, the actual content of the column is the result of object.toString(), which is meaningless.

      val df = Seq(Array[Byte](1,2)).toDF
      df.write.csv("/Users/guowei/Desktop/binary_csv")
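
To illustrate why the written value is meaningless, here is a minimal sketch that runs on the plain JVM without Spark. The object and method names (`BinaryToStringDemo`, `naiveCell`) are hypothetical and only mimic what a writer does if it calls toString on the byte array; they are not part of Spark.

```scala
object BinaryToStringDemo {
  // Hypothetical helper: what ends up in the CSV cell if the writer simply
  // calls toString on the array. JVM arrays do not override Object.toString,
  // so the result is a type tag plus an identity hash, not the bytes.
  def naiveCell(bytes: Array[Byte]): String = bytes.toString

  def main(args: Array[String]): Unit = {
    val bytes = Array[Byte](1, 2)
    // Prints a string of the form [B@<hex>; the hex part varies between runs
    // and carries no information about the values 1 and 2.
    println(naiveCell(bytes))
  }
}
```

Because the string holds no trace of the original bytes, a reader has nothing to decode when the file is loaded again.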
      

      The csv file's content is as follows:

      Meanwhile, if a binary column is saved as a table with the CSV file format, the table can't be read back successfully.

      val df = Seq((1, Array[Byte](1,2))).toDF
      df.write.format("csv").saveAsTable("binaryDataTable")
      spark.sql("select * from binaryDataTable").show()
      

      So I think it's better to make binary an unsupported data type in the CSV format, for both data source v1 (CSVFileFormat) and v2 (CSVTable).
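
The proposal can be sketched as a type check that runs before any data is written. This is only an illustration under stated assumptions: the `Toy*` types are stand-ins invented for this example, not the real `org.apache.spark.sql.types` classes, and `CsvTypeCheck`, `supportsDataType` and `firstUnsupported` are hypothetical names rather than the code that was merged.

```scala
// Toy stand-ins for SQL data types (assumption: NOT Spark's real classes).
sealed trait ToyType
case object ToyInt extends ToyType
case object ToyString extends ToyType
case object ToyBinary extends ToyType
final case class ToyArray(element: ToyType) extends ToyType

object CsvTypeCheck {
  // CSV cells are flat text, so only simple atomic types are accepted.
  // Binary is rejected explicitly instead of falling through as atomic.
  def supportsDataType(t: ToyType): Boolean = t match {
    case ToyBinary          => false
    case ToyInt | ToyString => true
    case ToyArray(_)        => false
  }

  // Returns the name of the first column whose type cannot be written as CSV,
  // so a caller can fail fast with a clear message before writing anything.
  def firstUnsupported(schema: Seq[(String, ToyType)]): Option[String] =
    schema.collectFirst { case (name, t) if !supportsDataType(t) => name }
}
```

For the two-column example above, `CsvTypeCheck.firstUnsupported(Seq("_1" -> ToyInt, "_2" -> ToyBinary))` returns `Some("_2")`, which is the point where a writer could raise an error instead of emitting unreadable output.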

          People

            Assignee: Wayne Guo Wei Guo
            Reporter: Wayne Guo Wei Guo
            Votes: 0
            Watchers: 3