  1. Spark
  2. SPARK-21684

df.write double-escapes all already-escaped characters except the first one


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Component/s: SQL

    Description

      Hi,

      If we have a dataframe with the column value as

       ab\,cd\,ef\,gh 

      then on write it comes out as

       "ab\,cd\\,ef\\,gh" 

      That is, it double-escapes every already-escaped comma/delimiter except the first one.
      This is inconsistent behaviour: it should escape either all of them or none.
      Setting df.write.option("escape", "") (an empty string) avoids the double escaping, but then any double quotes inside the same value are preceded by a special character, i.e. '\u00'. Why does this happen when the escape character is set to "" (empty)?
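
      The two cases described above can be reproduced with a sketch along these lines. It is not the attached SparkQuotesTest2.scala; the object name, app name, temp-directory prefix, and the second sample row (added only to exercise the embedded-quote case) are assumptions made for illustration, and it presumes a local Spark installation with spark-sql on the classpath. No expected output is listed, since that may differ between Spark versions.

      ```scala
      import java.nio.file.Files
      import org.apache.spark.sql.SparkSession

      // Hypothetical reproduction harness; names are illustrative only.
      object EscapeRepro {
        def main(args: Array[String]): Unit = {
          val spark = SparkSession.builder()
            .appName("SPARK-21684-repro")
            .master("local[1]")
            .getOrCreate()
          import spark.implicits._

          // Row 1: the value from the report, with backslash-escaped commas.
          // Row 2: an assumed extra value containing double quotes.
          val df = Seq("""ab\,cd\,ef\,gh""", """say "hi", ok""").toDF("value")

          val base = Files.createTempDirectory("spark-21684").toString

          // Case 1: default CSV writer options.
          df.coalesce(1).write.mode("overwrite").csv(s"$base/default")

          // Case 2: empty escape option, as tried in the report.
          df.coalesce(1).write.mode("overwrite")
            .option("escape", "")
            .csv(s"$base/empty-escape")

          // Read the part files back as plain text to inspect what was written.
          Seq("default", "empty-escape").foreach { dir =>
            println(s"--- $dir ---")
            spark.read.text(s"$base/$dir").collect()
              .foreach(row => println(row.getString(0)))
          }

          spark.stop()
        }
      }
      ```

      Reading the output back with spark.read.text rather than spark.read.csv keeps the CSV parser out of the picture, so the printed lines are the raw written text. A non-printing character in front of a quote will not be visible on the console; inspecting the part file with a hex viewer is one way to confirm it.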

      Attachments

        1. SparkQuotesTest2.scala
          0.8 kB
          Taran Saini


            People

              Assignee: Unassigned
              Reporter: Taran Saini (taransaini43)
              Votes: 0
              Watchers: 7
