Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
1.5.0, 1.6.0, 1.7.0, 1.8.0
None
Description
Thrift doesn't have a true BINARY type; BINARY is actually just an unencoded STRING. Quoted from the Thrift Types section of the official Thrift documentation:
binary: a sequence of unencoded bytes
N.B.: This is currently a specialized form of the string type above, added to provide better interoperability with Java. The current plan-of-record is to elevate this to a base type at some point.
As a consequence, Thrift BINARY and STRING fields are both passed to parquet-thrift as STRING, and both are always encoded as BINARY (UTF8).
This is really a problem on the Thrift side. One possible workaround is to inspect the binary fields in the generated Java classes and check whether their type is ByteBuffer.
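The ByteBuffer check can be sketched with plain Java reflection. This is only an illustration of the idea, not the code that parquet-thrift uses: the class BinaryFieldInspector, the method isBinaryField, and the stand-in struct ExampleStruct are all hypothetical names, and the sketch assumes the generated class exposes each Thrift field as a Java field of the same name.

```java
import java.lang.reflect.Field;
import java.nio.ByteBuffer;

public class BinaryFieldInspector {

    // Hypothetical stand-in for a Thrift-generated struct. The Thrift Java
    // generator emits java.nio.ByteBuffer for `binary` fields and
    // java.lang.String for `string` fields.
    public static class ExampleStruct {
        public String name;        // Thrift `string`
        public ByteBuffer payload; // Thrift `binary`
    }

    // Returns true if the named field of the generated class is declared as
    // a ByteBuffer, i.e. it was a Thrift `binary` rather than a `string`.
    // Assumption: the Java field name matches the Thrift field name.
    public static boolean isBinaryField(Class<?> structClass, String fieldName) {
        try {
            Field field = structClass.getDeclaredField(fieldName);
            return ByteBuffer.class.isAssignableFrom(field.getType());
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(
                "No field '" + fieldName + "' in " + structClass.getName(), e);
        }
    }

    public static void main(String[] args) {
        System.out.println("name is binary: "
            + isBinaryField(ExampleStruct.class, "name"));
        System.out.println("payload is binary: "
            + isBinaryField(ExampleStruct.class, "payload"));
    }
}
```

A schema converter could use a check like this to skip the UTF8 annotation for fields that are really binary, provided the generated classes are available on the classpath when the schema is converted.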