Details
- Bug
- Status: Open
- Minor
- Resolution: Unresolved
- 0.5.0
- None
- None
Description
If I want to preserve data in columns that contain a newline (from web crawling, for instance), I cannot use the ESCAPED BY clause to escape the newlines; other characters, such as commas, escape fine. This may be because the line terminator, which is locked to newline, is picked up first, and only then are the fields processed.
This seems to be related to:
"SerDe should escape some special characters"
https://issues.apache.org/jira/browse/HIVE-136
and
"Implement "LINES TERMINATED BY""
https://issues.apache.org/jira/browse/HIVE-302
where the comment at https://issues.apache.org/jira/browse/HIVE-302?focusedCommentId=12793435&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12793435 states:
"This is not fixable currently because the line terminator is determined by LineRecordReader.LineReader which is in the Hadoop land."
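To illustrate the suspected ordering, here is a minimal Python sketch. It is not Hive or Hadoop code: the helper names (`split_on_unescaped`, `unescape`, `read_lines_first`, `read_escape_aware`) are hypothetical, and the sketch only assumes that records are cut at every newline before the escape character is considered, as described above.

```python
def split_on_unescaped(text, delim, escape="\\"):
    """Split text on delim, leaving escape sequences intact in the pieces."""
    parts, current, i = [], [], 0
    while i < len(text):
        if text[i] == escape and i + 1 < len(text):
            # Keep the escape and the character it protects together.
            current.append(text[i:i + 2])
            i += 2
        elif text[i] == delim:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(text[i])
            i += 1
    parts.append("".join(current))
    return parts


def unescape(field, escape="\\"):
    """Drop each escape character and keep the character that follows it."""
    out, i = [], 0
    while i < len(field):
        if field[i] == escape and i + 1 < len(field):
            out.append(field[i + 1])
            i += 2
        else:
            out.append(field[i])
            i += 1
    return "".join(out)


def read_lines_first(data, field_delim=",", escape="\\"):
    """Assumed current behavior: cut records at every newline, then parse fields."""
    return [
        [unescape(f, escape) for f in split_on_unescaped(line, field_delim, escape)]
        for line in data.split("\n")
    ]


def read_escape_aware(data, field_delim=",", escape="\\"):
    """Desired behavior: an escaped newline does not end the record."""
    return [
        [unescape(f, escape) for f in split_on_unescaped(rec, field_delim, escape)]
        for rec in split_on_unescaped(data, "\n", escape)
    ]


# One logical row: an id, a field with an escaped comma, a field with an escaped newline.
row = "1,a\\,b,line one\\\nline two"
broken = read_lines_first(row)
intact = read_escape_aware(row)
```

With this input, `read_lines_first` yields two rows (the escaped comma survives, but the record is cut at the newline and a stray backslash is left behind), while `read_escape_aware` yields a single row whose last field still contains the newline.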
Issue Links
- is related to HIVE-11785 Support escaping carriage return and new line for LazySimpleSerDe (Closed)