Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 0.7.0
Fix Version/s: None
Description
CSV export doesn't conform to RFC-4180: the exported CSV is broken in some cases.
RFC-4180:
If double-quotes are used to enclose fields, then a double-quote appearing inside a field must be escaped by preceding it with another double quote.
Zeppelin does not escape double quotes this way, so CSV exported from Zeppelin that contains double quotes (") cannot be imported by other tools, including Excel.
The CSV export appears to have other issues as well: in some cases an exported column value was a negative number instead of the expected character field. This could be a separate bug, or it could again stem from the fact that the Zeppelin CSV export doesn't conform to the RFC-4180 standard.
Some related quotes from RFC-4180 (https://tools.ietf.org/html/rfc4180):
5. Each field may or may not be enclosed in double quotes (however some programs, such as Microsoft Excel, do not use double quotes at all). If fields are not enclosed with double quotes, then double quotes may not appear inside the fields. For example:
       "aaa","bbb","ccc" CRLF
       zzz,yyy,xxx
6. Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes. For example:
       "aaa","b CRLF
       bb","ccc" CRLF
       zzz,yyy,xxx
7. If double-quotes are used to enclose fields, then a double-quote appearing inside a field must be escaped by preceding it with another double quote. For example:
       "aaa","b""bb","ccc"
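To make the expected behaviour concrete, here is a minimal Python sketch of the quoting rules quoted above (items 6 and 7). It is not Zeppelin's export code; the helper names `escape_field` and `to_csv_line` are hypothetical and exist only for this illustration. The round-trip check uses Python's standard `csv` module as a stand-in for "any other tool" that reads the file.

```python
import csv
import io


def escape_field(value: str) -> str:
    """Quote a single field following RFC 4180 items 6 and 7.

    Hypothetical helper for illustration, not part of Zeppelin.
    """
    # Item 6: fields with line breaks, double quotes, or commas get enclosed.
    if any(ch in value for ch in ('"', ",", "\r", "\n")):
        # Item 7: an embedded double quote is escaped by doubling it.
        return '"' + value.replace('"', '""') + '"'
    return value


def to_csv_line(fields) -> str:
    """Join already-stringified fields into one CRLF-terminated record."""
    return ",".join(escape_field(f) for f in fields) + "\r\n"


# A field containing a double quote, as in the RFC's item 7 example.
line = to_csv_line(["aaa", 'b"bb', "ccc"])

# Round-trip through a standard CSV parser to confirm it is importable.
# newline="" keeps the CRLF sequences untouched for the reader.
parsed = next(csv.reader(io.StringIO(line, newline="")))
```

If the export doubled embedded quotes like this, a parser such as `csv.reader` would recover the original three fields, which is the behaviour this issue asks for.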
Issue Links
- Blocked: ZEPPELIN-2956 Downloaded CSV/TSV data will get unexpected division when the column value contains both delimiter and quotation mark. (Open)
- is related to: ZEPPELIN-3511 remove old button "Download Data as CSV/TSV" (In Progress)