Details
Type: New Feature
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 0.5.0
Fix Version/s: None
Component/s: None
Description
I'd like to work with non-UTF8 data easily.
Suppose I have data in latin1. Currently, a "select *" returns the non-ASCII characters (bytes 0x80-0xFF) as '\xef\xbf\xbd', which is the replacement character '\ufffd' encoded in UTF-8. It would be nice for Hive to understand different encodings, or to have a concept of a byte string.
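To make the symptom concrete, here is a minimal sketch in Python (not Hive code) of what appears to be happening: latin1 bytes are decoded as if they were UTF-8, the invalid byte is replaced with '\ufffd', and the result is written back out as UTF-8. The function names are hypothetical and only illustrate the behavior; I'm assuming Hive's text handling does the equivalent lossy decode.

```python
def decode_as_utf8(raw: bytes) -> str:
    """Decode bytes as UTF-8, replacing invalid sequences with U+FFFD."""
    return raw.decode("utf-8", errors="replace")


def decode_as_latin1(raw: bytes) -> str:
    """Decode bytes as latin1; every byte maps to exactly one character."""
    return raw.decode("latin-1")


# "cafe" with an e-acute (U+00E9), stored as latin1: the accent is the single byte 0xE9.
raw = "caf\u00e9".encode("latin-1")

# Treating those bytes as UTF-8 loses the accented character.
lossy = decode_as_utf8(raw)

print(raw)                    # b'caf\xe9'
print(lossy.encode("utf-8"))  # b'caf\xef\xbf\xbd'

# Decoding with the right charset recovers the original text.
print(decode_as_latin1(raw) == "caf\u00e9")  # True
```

The original byte cannot be recovered once it has been replaced, which is why either a per-table encoding or a raw byte-string type would be needed to handle this data.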