[HIVE-1505] Support non-UTF8 data - ASF JIRA

Details

Type: New Feature
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 0.5.0
Fix Version/s: None
Component/s: Serializers/Deserializers
Labels: None

Description

I'd like to work with non-UTF8 data easily.

Suppose I have data in latin1. Currently, a "select *" returns each upper-ASCII character (any byte above 0x7f) as '\xef\xbf\xbd', which is the replacement character '\ufffd' encoded in UTF-8. It would be nice for Hive to understand different encodings, or to have a concept of a byte string.
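
The symptom can be reproduced outside Hive. The snippet below is only a sketch: it assumes the lossy step is a UTF-8 decode with replacement, which matches the bytes reported above, and it does not use any Hive code. The sample string and variable names are illustrative.

```python
# Latin-1 bytes for "café": 'é' is the single byte 0xE9.
latin1_bytes = "café".encode("latin-1")

# Assumption: the reader treats the bytes as UTF-8 and substitutes
# U+FFFD for anything that is not valid UTF-8.
mangled = latin1_bytes.decode("utf-8", errors="replace")

# Writing the result back out as UTF-8 yields the three bytes
# '\xef\xbf\xbd' in place of the original 0xE9.
reencoded = mangled.encode("utf-8")

# Decoding with the actual source encoding keeps the character intact.
recovered = latin1_bytes.decode("latin-1")

print(repr(mangled))
print(reencoded)
print(recovered)
```

Once the byte has been replaced, the original value cannot be recovered from the output, which is why the encoding has to be known (or the bytes kept raw) at read time.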

Attachments

trunk-encoding.patch (7 kB), 19/Aug/10 06:18, Ted Xu

People

Assignee: Ted Xu

Reporter: bc Wong

Votes: 3

Watchers: 10

Dates

Created: 02/Aug/10 16:00

Updated: 25/Aug/10 18:35