Spark / SPARK-25660

Impossible to use the backward slash as the CSV fields delimiter

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 2.4.0
    • Component/s: SQL
    • Labels:
      None

      Description

      If fields in CSV input are delimited by '\', for example:

      123\4\5\1\Q\\P\P\2321213\1\\\P\\F
      

      reading it with the following code:

      df = spark.read.format('csv').option("header","false").options(delimiter='\\').load("file:///file.csv")
      

      causes the exception:

      String index out of range: 1
      java.lang.StringIndexOutOfBoundsException: String index out of range: 1
      	at java.lang.String.charAt(String.java:658)
      	at org.apache.spark.sql.execution.datasources.csv.CSVUtils$.toChar(CSVUtils.scala:101)
      	at org.apache.spark.sql.execution.datasources.csv.CSVOptions.<init>(CSVOptions.scala:86)
      	at org.apache.spark.sql.execution.datasources.csv.CSVOptions.<init>(CSVOptions.scala:41)
      	at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:488)
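
The trace points at a `charAt` call inside `CSVUtils.toChar`, which suggests the delimiter string was indexed at position 1 without a length check. The following is only a minimal Python sketch of that failure mode, not Spark's actual implementation: the function names `to_char_naive` and `to_char_checked` and the escape table are hypothetical, and the sketch assumes the converter treats a leading backslash as the start of an escape sequence such as `\t`.

```python
# Hypothetical model of a delimiter-to-character conversion.
# Names and the escape table are illustrative assumptions, not Spark code.
ESCAPES = {"t": "\t", "r": "\r", "b": "\b", "f": "\f", '"': '"', "'": "'"}


def to_char_naive(delimiter: str) -> str:
    """Checks for a leading backslash before checking the length."""
    if not delimiter:
        raise ValueError("delimiter cannot be empty")
    if delimiter[0] == "\\":
        # A lone backslash has no character at index 1, so this lookup fails.
        second = delimiter[1]
        if second in ESCAPES:
            return ESCAPES[second]
        raise ValueError(f"unsupported escape sequence: {delimiter!r}")
    if len(delimiter) == 1:
        return delimiter
    raise ValueError(f"delimiter must be a single character: {delimiter!r}")


def to_char_checked(delimiter: str) -> str:
    """Handles any single character first, including a lone backslash."""
    if not delimiter:
        raise ValueError("delimiter cannot be empty")
    if len(delimiter) == 1:
        # Covers "\\" (one backslash) as well as ordinary delimiters.
        return delimiter
    if len(delimiter) == 2 and delimiter[0] == "\\" and delimiter[1] in ESCAPES:
        return ESCAPES[delimiter[1]]
    raise ValueError(f"unsupported delimiter: {delimiter!r}")
```

In this sketch, `to_char_naive("\\")` raises `IndexError`, the Python analogue of the `StringIndexOutOfBoundsException` above, while `to_char_checked("\\")` returns the backslash itself. Both functions still map the two-character string `"\\t"` to a tab.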
      

            People

            • Assignee: Maxim Gekk (maxgekk)
            • Reporter: Maxim Gekk (maxgekk)