Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Invalid
Description
The StringComparator now works on serialized data.
To this end, new string read/write/copy/compare methods were introduced; they use a variable-length encoding for the characters.
Key points:
- The most significant bits are written/read first.
- The first 2 bits of each encoded character encode its size in bytes.
- An encoded character is at most 3 bytes long.
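To make the key points concrete, here is a minimal sketch of such an encoding. It is not the code from the pull request: the class name, the method names, and the exact bit layout (top 2 bits of the first byte hold the number of extra bytes, followed by 6, 14 or 22 payload bits) are assumptions chosen only for illustration.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch, not the actual Stratosphere implementation.
final class VarLenCharSketch {

    // Assumed layout: the top 2 bits of the first byte store (byteCount - 1);
    // the remaining 6, 14 or 22 bits store the code point, most significant bits first.
    static byte[] encodeChar(int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            throw new IllegalArgumentException("not a Unicode code point: " + codePoint);
        }
        if (codePoint < (1 << 6)) {
            // size prefix 00, 6 payload bits
            return new byte[] { (byte) codePoint };
        } else if (codePoint < (1 << 14)) {
            // size prefix 01, 14 payload bits
            return new byte[] { (byte) (0x40 | (codePoint >>> 8)), (byte) codePoint };
        } else {
            // size prefix 10, 22 payload bits
            return new byte[] {
                (byte) (0x80 | (codePoint >>> 16)), (byte) (codePoint >>> 8), (byte) codePoint
            };
        }
    }

    // Reads one encoded character starting at offset and returns its code point.
    static int decodeChar(byte[] buf, int offset) {
        int first = buf[offset] & 0xFF;
        int extraBytes = first >>> 6;
        if (extraBytes > 2) {
            throw new IllegalArgumentException("size prefix 11 is unused in this sketch");
        }
        int value = first & 0x3F;
        for (int i = 1; i <= extraBytes; i++) {
            value = (value << 8) | (buf[offset + i] & 0xFF);
        }
        return value;
    }

    // Encodes a whole string, one code point at a time (no length header in this sketch).
    static byte[] encodeString(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        s.codePoints().forEach(cp -> {
            byte[] b = encodeChar(cp);
            out.write(b, 0, b.length);
        });
        return out.toByteArray();
    }

    // Compares two encoded strings as unsigned bytes, without decoding them.
    static int compareSerialized(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        for (int cp : new int[] { 0x20, 0x41, 0x20AC, 0x1F600 }) {
            StringBuilder hex = new StringBuilder();
            for (byte b : encodeChar(cp)) {
                hex.append(String.format("%02X ", b & 0xFF));
            }
            System.out.println(String.format("U+%04X -> %s", cp, hex.toString().trim()));
        }
        System.out.println(compareSerialized(encodeString("abc"), encodeString("abd")) < 0);
    }
}
```

Because the size prefix sits in the most significant bits and the payload follows most significant bits first, a plain unsigned byte comparison in this sketch orders strings by code point, which is what lets a comparator work on the serialized form. Note that this order can differ from `String.compareTo`, which compares UTF-16 code units.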
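Additionally, the StringSerializer now has full Unicode support. I couldn't find a Unicode character that uses more than 22 bits (the highest code point, U+10FFFF, needs 21), and 3 bytes minus the 2 size bits leave 22 bits for the character itself, so 3 bytes should be sufficient.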
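The bit budget can be double-checked with a few lines of Java. The class and method names below are hypothetical, and the assumption that exactly 2 bits of the 3 bytes are reserved for the size comes from the key points above.

```java
// Hypothetical helper for checking the bit budget; not part of the pull request.
final class UnicodeBitWidthCheck {

    // Number of bits needed to represent a non-negative value.
    static int bitsNeeded(int value) {
        return 32 - Integer.numberOfLeadingZeros(value);
    }

    // Bits left for the character after reserving the size bits.
    static int payloadBits(int bytes, int sizeBits) {
        return bytes * 8 - sizeBits;
    }

    public static void main(String[] args) {
        // Character.MAX_CODE_POINT is U+10FFFF, the highest Unicode code point.
        System.out.println(bitsNeeded(Character.MAX_CODE_POINT)); // prints 21
        System.out.println(payloadBits(3, 2));                    // prints 22
    }
}
```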
---------------- Imported from GitHub ----------------
Url: https://github.com/stratosphere/stratosphere/pull/801
Created by: zentol
Labels:
Created at: Tue May 13 18:06:22 CEST 2014
State: open