Crunch / CRUNCH-564

Add support for using escape character same as open/close quote character

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.14.0
    • Component/s: Core
    • Labels:

      Description

      As a user, I would like to use CSVInputFormat to read CSV files that follow RFC 4180: http://www.ietf.org/rfc/rfc4180.txt

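      Many developers use the StringEscapeUtils.escapeCsv() method from Apache Commons Lang to escape their CSV values. The method follows RFC 4180: a double quote inside a field is escaped by doubling it, so the escape character and the quote character are the same.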

      https://commons.apache.org/proper/commons-lang/javadocs/api-2.6/org/apache/commons/lang/StringEscapeUtils.html
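
      To make the escaping rule concrete, here is a minimal sketch of RFC 4180 style field escaping in plain Java. It is not the Commons Lang source; the class Rfc4180Escaper and its methods are hypothetical names used only for illustration, and the sketch assumes a comma delimiter and a double-quote quote character.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not part of Crunch or Commons Lang.
final class Rfc4180Escaper {
    private Rfc4180Escaper() {}

    /**
     * Wraps the field in double quotes only if it contains a comma, a double
     * quote, CR, or LF. Embedded double quotes are escaped by doubling them.
     */
    static String escapeField(String field) {
        boolean needsQuoting = field.indexOf(',') >= 0
                || field.indexOf('"') >= 0
                || field.indexOf('\n') >= 0
                || field.indexOf('\r') >= 0;
        if (!needsQuoting) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    /** Joins the escaped fields with commas to form one CSV record. */
    static String toLine(List<String> fields) {
        return fields.stream()
                .map(Rfc4180Escaper::escapeField)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // prints: 1,"say ""hi""","a,b"
        System.out.println(toLine(List.of("1", "say \"hi\"", "a,b")));
    }
}
```

      In the output, the quote character serves both as the field delimiter and as the escape for itself, which is the case this issue asks CSVInputFormat to accept.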

      CSVLineReader throws an exception when the escape character is the same as the quote character. We can enhance the code to support CSVs that use the same character for both.

      https://github.com/apache/crunch/blob/master/crunch-core/src/main/java/org/apache/crunch/io/text/csv/CSVLineReader.java#L152
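
      One way to handle an escape character that equals the quote character is a one-character lookahead: inside a quoted field, a quote followed by another quote is a literal quote, and a quote followed by anything else closes the field. The sketch below illustrates that idea only. It is not the CSVLineReader code and not the fix that was committed; the class DoubledQuoteCsvSplitter is a hypothetical name, and it assumes a single record already in memory with a comma delimiter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not Crunch's CSVLineReader.
final class DoubledQuoteCsvSplitter {
    private DoubledQuoteCsvSplitter() {}

    /** Splits one CSV record whose quote and escape characters are both '"'. */
    static List<String> split(String record) {
        final char quote = '"';
        final char delimiter = ',';
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (inQuotes) {
                if (c == quote) {
                    // Look ahead: a doubled quote is an escaped literal quote.
                    if (i + 1 < record.length() && record.charAt(i + 1) == quote) {
                        current.append(quote);
                        i++; // skip the second quote of the pair
                    } else {
                        inQuotes = false; // a lone quote closes the field
                    }
                } else {
                    current.append(c); // delimiters and newlines are literal here
                }
            } else if (c == quote) {
                inQuotes = true;
            } else if (c == delimiter) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (inQuotes) {
            throw new IllegalArgumentException("Unterminated quoted field");
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        // prints: [1, say "hi", a,b]
        System.out.println(split("1,\"say \"\"hi\"\"\",\"a,b\""));
    }
}
```

      A reader that works on a stream rather than an in-memory string would additionally need to carry the lookahead across buffer boundaries, which this sketch does not attempt.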

      I would appreciate a comment if someone has deliberately rejected this idea because of a technical limitation or a problem with allowing the escape and quote characters to be the same. For what it's worth, Apache HAWQ seems to get around this issue somehow and reads such CSVs correctly.

              People

              • Assignee: champgm mac champion
              • Reporter: aliiqbal Muhammad
              • Votes: 0
              • Watchers: 3