Details
- Type: Improvement
- Status: Open
- Priority: Minor
- Resolution: Unresolved
Description
The OpenCSVSerde produces CSV output with every column quoted,
regardless of the column type and of whether a string column contains a separator.
The problem is that some readers (such as PostgreSQL) are not compatible with
such CSV, in particular when bulk loading it through the COPY statement.
I propose a new CsvSerde, based on the uniVocity parser (which is used by Apache Spark),
which has been described as about 2 times faster than OpenCSV: https://github.com/uniVocity/csv-parsers-comparison . This new CsvSerde would only quote columns when needed.
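To make "only quote columns when needed" concrete, here is a minimal sketch of the quoting rule. It is not the proposed SerDe and does not use the uniVocity API: the class and method names (`MinimalQuotingSketch`, `quoteIfNeeded`, `formatRow`) are hypothetical, and it assumes that a field needs quoting only when it contains the separator, the quote character, or a line break, with embedded quotes escaped by doubling.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration only; not part of Hive or uniVocity.
public class MinimalQuotingSketch {

    // Quote a field only if it contains the separator, the quote
    // character, or a line break. Embedded quotes are doubled.
    static String quoteIfNeeded(String field, char separator, char quote) {
        if (field == null) {
            return ""; // simplification: a null is written as an empty field
        }
        boolean needsQuoting = field.indexOf(separator) >= 0
                || field.indexOf(quote) >= 0
                || field.indexOf('\n') >= 0
                || field.indexOf('\r') >= 0;
        if (!needsQuoting) {
            return field;
        }
        String q = String.valueOf(quote);
        return q + field.replace(q, q + q) + q;
    }

    // Join the fields of one row, applying the rule above to each field.
    static String formatRow(List<String> fields, char separator, char quote) {
        return fields.stream()
                .map(f -> quoteIfNeeded(f, separator, quote))
                .collect(Collectors.joining(String.valueOf(separator)));
    }

    public static void main(String[] args) {
        List<String> row = Arrays.asList("1", "plain text", "a,b", "say \"hi\"");
        // prints: 1,plain text,"a,b","say ""hi"""
        System.out.println(formatRow(row, ',', '"'));
    }
}
```

With the current behaviour, the same row would come out as `"1","plain text","a,b","say ""hi"""`, that is, with the numeric column quoted as well.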
Regards,