Commons CSV / CSV-294

CSVFormat does not support explicit " as escape char


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.9.0
    • Fix Version/s: 1.12.0
    • Component/s: None
    • Labels: None

    Description

      Reading data that contains a double quote (") does not work if the escape character is explicitly set to '"', the value specified in RFC 4180.

      It works for other escape characters or if no escape character is explicitly defined in the format.

      The following line in Lexer.java led me to file this ticket in its original, rather erroneous form:

      this.escape = mapNullToDisabled(format.getEscapeCharacter());

      From this line I (wrongly) deduced that an unspecified escape character would actually disable escaping. Because of that, I wanted to enable escaping by setting the escape character to '"', which causes exceptions in the Lexer for perfectly valid input. That in turn convinced me that this is a much bigger issue than it is. Sorry about that.
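
      To illustrate the misreading, here is a minimal sketch of how a null escape character can be mapped to a "disabled" sentinel. It is not the actual Lexer source: the class name EscapeMappingSketch, the DISABLED value, and the isEscape helper are assumptions made for this example. The point is only that "no escape character configured" and "doubled quotes are not understood" are two different things.

```java
public class EscapeMappingSketch {
    // Hypothetical sentinel: a character assumed never to occur in real input.
    public static final char DISABLED = '\ufffe';

    // Maps an unset (null) escape character to the sentinel value.
    public static char mapNullToDisabled(Character c) {
        return c == null ? DISABLED : c.charValue();
    }

    // True only if an escape character was configured and ch matches it.
    public static boolean isEscape(char escape, int ch) {
        return escape != DISABLED && ch == escape;
    }

    public static void main(String[] args) {
        char unset = mapNullToDisabled(null);
        char explicit = mapNullToDisabled('"');
        System.out.println(isEscape(unset, '"'));    // false: no separate escape character
        System.out.println(isEscape(explicit, '"')); // true: '"' is now also the escape character
    }
}
```

      In this sketch the sentinel only switches off a separate escape character; handling of a doubled quote inside a quoted field would live in the quote logic and is unaffected.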

      I don't think that the current situation is ideal, though.

      I would not have been this confused if CSVFormat were more explicit about the escape character that will be used, i.e. if toString() showed the implicitly used quote character or, in case of null, stated that null means the quote character is used. The escape character is currently omitted from the output if it is not set explicitly.

      There is also no documentation about what null as the escape character actually means. It may be documented somewhere, but it isn't documented for the CSVFormat.getEscapeCharacter() or CSVFormat.Builder.set/getEscape() methods.

      And setting the escape character explicitly to the value specified in the RFC should certainly not fail, even if setting it to that value is superfluous since null behaves exactly the same.

      Relevant part of the RFC:

      7. If double-quotes are used to enclose fields, then a double-quote
      appearing inside a field must be escaped by preceding it with
      another double quote. For example:

      "aaa","b""bb","ccc"
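
      The rule above can be sketched with a small standalone parser. This is an illustration using only the JDK, not Commons CSV code; the class name Rfc4180QuoteSketch and the method parseLine are made up for this example, and it handles a single line with ',' as delimiter only.

```java
import java.util.ArrayList;
import java.util.List;

public class Rfc4180QuoteSketch {
    // Splits one CSV line into fields. Inside a quoted field,
    // a doubled quote ("") is read as one literal quote character.
    public static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        field.append('"'); // escaped quote: "" becomes "
                        i++;               // skip the second quote
                    } else {
                        inQuotes = false;  // closing quote of the field
                    }
                } else {
                    field.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;           // opening quote of the field
            } else if (c == ',') {
                fields.add(field.toString());
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        fields.add(field.toString());
        return fields;
    }

    public static void main(String[] args) {
        // The RFC example line: "aaa","b""bb","ccc"
        System.out.println(parseLine("\"aaa\",\"b\"\"bb\",\"ccc\"")); // prints [aaa, b"bb, ccc]
    }
}
```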

      Related issue:

      https://issues.apache.org/jira/browse/CSV-150

      Attachments

        1. JiraCsv294Test.java
          3 kB
          Joern Huxhorn

          People

            Assignee: Unassigned
            Reporter: Joern Huxhorn (huxi)
            Votes: 0
            Watchers: 3
