Spark / SPARK-29437

CSV Writer should escape 'escapechar' when it exists in the data


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Trivial
    • Resolution: Not A Problem
    • Affects Version/s: 2.4.3
    • Fix Version/s: None
    • Component/s: Input/Output
    • Labels:
      None

      Description

      When the data contains the escape character (default '\'), the CSV writer should either escape it or quote the field.

      Steps to reproduce: https://gist.github.com/kretes/58f7f66a0780681a44c175a2ac3c0da2

       

      The result can be corrupted data on read or, in some cases, a CSV file that cannot be read properly at all. For example, when the escape character is the last character in a column, it breaks column parsing for that row, which in turn breaks things such as type inference for the DataFrame.
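
      The failure mode described above can be sketched with a toy model in plain Python. This is not Spark's writer or its underlying CSV parser; the functions write_row_naive, write_row_escaped, and read_row are hypothetical helpers, and the reader is assumed to treat the escape character as "take the next character literally". Under that assumption, an unescaped trailing '\' swallows the delimiter and merges two columns, while doubling the escape character on write lets the row round-trip.

```python
ESC = "\\"   # escape character (the default named in the report)
SEP = ","    # field delimiter

def write_row_naive(fields):
    """Join fields without touching ESC (the behaviour the report describes)."""
    return SEP.join(fields)

def write_row_escaped(fields):
    """Double every ESC and escape every SEP so the reader can undo it."""
    return SEP.join(
        f.replace(ESC, ESC + ESC).replace(SEP, ESC + SEP) for f in fields
    )

def read_row(line):
    """Split on SEP, treating ESC as 'take the next character literally'."""
    fields, cur, i = [], [], 0
    while i < len(line):
        ch = line[i]
        if ch == ESC and i + 1 < len(line):
            cur.append(line[i + 1])  # escaped character is kept verbatim
            i += 2
        elif ch == SEP:
            fields.append("".join(cur))  # unescaped delimiter ends the field
            cur = []
            i += 1
        else:
            cur.append(ch)
            i += 1
    fields.append("".join(cur))
    return fields

row = ["path\\", "42"]                   # first column ends with the escape char
naive = read_row(write_row_naive(row))   # the delimiter is swallowed
safe = read_row(write_row_escaped(row))  # the row round-trips intact

print(naive)
print(safe)
```

      In this sketch the naive output has one column instead of two, so the numeric second column is no longer visible as a separate value to any type inference step.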


            People

            • Assignee:
              Unassigned
            • Reporter:
              kretes Tomasz Bartczak
            • Votes:
              0
            • Watchers:
              3
